
Reading a Word file

How can I read the contents (separate paragraphs) of a .doc/.docx file in C#?

gallian99
Junior Poster in Training
77 posts since Jan 2009
Reputation Points: 10
Solved Threads: 0
 

Very easy...
Go to Project->Add Reference->COM tab and look for the Microsoft Word X.x Object Library.
Use the ApplicationClass class. Enjoy...

Ramy Mahrous
Postaholic
2,196 posts since Aug 2006
Reputation Points: 480
Solved Threads: 276
 

I have added the reference, but could you elaborate on the ApplicationClass usage?

gallian99
Junior Poster in Training
77 posts since Jan 2009
Reputation Points: 10
Solved Threads: 0
 

No, I need you to try using it. I could give you some code and your problem would be solved, but I need you to try to learn. Please try.

Ramy Mahrous
Postaholic
2,196 posts since Aug 2006
Reputation Points: 480
Solved Threads: 276
 

Is there a way I could run it in the background, i.e. without visibly opening the Word file?

I have managed to read the contents, but one can certainly notice the Word document opening and closing.

// Start a Word instance through COM interop.
Microsoft.Office.Interop.Word.Application wordApp = new Microsoft.Office.Interop.Word.ApplicationClass();

object fileNameO = fileName; // e.g. "c:\\sample.doc"
object objFalse = false;
object objTrue = true;
object missing = System.Reflection.Missing.Value;
object emptyData = string.Empty;

try
{
    // Arguments in order: FileName, ConfirmConversions (false), ReadOnly (true),
    // AddToRecentFiles (false), seven defaults, Visible (true), four defaults.
    Microsoft.Office.Interop.Word.Document aDoc = wordApp.Documents.Open(ref fileNameO, ref objFalse, ref objTrue,
        ref objFalse, ref missing, ref missing, ref missing, ref missing, ref missing, ref missing, ref missing, ref objTrue,
        ref missing, ref missing, ref missing, ref missing);

    // Select the whole document and copy it to the clipboard.
    aDoc.ActiveWindow.Selection.WholeStory();
    aDoc.ActiveWindow.Selection.Copy();

    // Read the copied text back from the clipboard, then clear the clipboard.
    IDataObject data = System.Windows.Forms.Clipboard.GetDataObject();
    String filetext = data.GetData(System.Windows.Forms.DataFormats.Text).ToString();
    System.Windows.Forms.Clipboard.SetDataObject(string.Empty);
    richTextBox1.Text = filetext;
}
catch (Exception err)
{
    MessageBox.Show(err.Message);
}
finally
{
    // Close all open documents and shut down the Word instance.
    wordApp.Documents.Close(ref missing, ref missing, ref missing);
    wordApp.Application.Quit(ref missing, ref missing, ref missing);
}

gallian99
Junior Poster in Training
77 posts since Jan 2009
Reputation Points: 10
Solved Threads: 0
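
For anyone landing here with the same question, here is a rough sketch of one way to keep Word out of sight and get the paragraphs separately. It is not tested against every Word version. In the code posted above, the 12th argument to Documents.Open is Visible and it is passed as true; the sketch passes false there and also sets wordApp.Visible = false. It walks doc.Paragraphs instead of going through the clipboard. The class name WordReader and the method ReadParagraphs are made up for illustration, and it assumes the Microsoft Word Object Library reference mentioned earlier in the thread.

```csharp
using System;
using System.Collections.Generic;
using Word = Microsoft.Office.Interop.Word;

static class WordReader
{
    // Hypothetical helper: returns the text of each paragraph in the document.
    public static List<string> ReadParagraphs(string path)
    {
        var paragraphs = new List<string>();

        object missing = System.Reflection.Missing.Value;
        object fileName = path;
        object confirmConversions = false;
        object readOnly = true;
        object addToRecentFiles = false;
        object visible = false; // do not show a document window
        object doNotSave = Word.WdSaveOptions.wdDoNotSaveChanges;

        // _Application/_Document are used so that Quit and Close resolve to the
        // methods rather than clashing with the events of the same name.
        Word._Application wordApp = new Word.Application();
        try
        {
            wordApp.Visible = false; // keep the Word main window hidden

            Word._Document doc = wordApp.Documents.Open(
                ref fileName, ref confirmConversions, ref readOnly, ref addToRecentFiles,
                ref missing, ref missing, ref missing, ref missing,
                ref missing, ref missing, ref missing, ref visible,
                ref missing, ref missing, ref missing, ref missing);

            foreach (Word.Paragraph p in doc.Paragraphs)
            {
                // Range.Text ends with a paragraph mark ('\r'); table cells
                // also carry an end-of-cell mark ('\a'). Strip both.
                paragraphs.Add(p.Range.Text.TrimEnd('\r', '\a'));
            }

            doc.Close(ref doNotSave, ref missing, ref missing);
        }
        finally
        {
            // Quit without saving; this also closes anything still open.
            wordApp.Quit(ref doNotSave, ref missing, ref missing);
        }

        return paragraphs;
    }
}
```

Something like richTextBox1.Lines = WordReader.ReadParagraphs(fileName).ToArray(); would then fill the text box. Word still starts as a process in the background, so Office has to be installed on the machine.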
 

Look, a .doc file is a binary file, so I think you can't do that without Word. But .docx follows OOXML, so you can work with it directly.

Ramy Mahrous
Postaholic
2,196 posts since Aug 2006
Reputation Points: 480
Solved Threads: 276
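
To illustrate the OOXML point, here is a minimal sketch that reads paragraphs from a .docx without Word installed. A .docx is a ZIP package, and the body text normally lives in the part word/document.xml. The sketch assumes that part name rather than resolving it through the package relationships. It also assumes the transitional WordprocessingML namespace and .NET 4.5 or later for System.IO.Compression.ZipFile. DocxReader and ReadParagraphs are made-up names. It does nothing for binary .doc files.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.IO.Compression;
using System.Linq;
using System.Xml.Linq;

static class DocxReader
{
    // WordprocessingML main namespace (transitional OOXML).
    static readonly XNamespace W =
        "http://schemas.openxmlformats.org/wordprocessingml/2006/main";

    // Hypothetical helper: returns the plain text of each w:p element.
    public static List<string> ReadParagraphs(string path)
    {
        using (ZipArchive archive = ZipFile.OpenRead(path))
        {
            // Assumption: the main document part uses the conventional name.
            ZipArchiveEntry entry = archive.GetEntry("word/document.xml");
            if (entry == null)
                throw new InvalidDataException(
                    "word/document.xml not found; is this a .docx file?");

            using (Stream stream = entry.Open())
            {
                XDocument xml = XDocument.Load(stream);

                // Each w:p is a paragraph; its visible text is spread over
                // w:t elements inside runs, so concatenate them.
                return xml.Descendants(W + "p")
                    .Select(p => string.Concat(
                        p.Descendants(W + "t").Select(t => t.Value)))
                    .ToList();
            }
        }
    }
}
```

Usage would be along the lines of foreach (string p in DocxReader.ReadParagraphs("sample.docx")) Console.WriteLine(p);. Tabs, line breaks inside a paragraph, and text boxes nested inside a paragraph are not handled specially here, so treat the output as a first approximation.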
 

Hello,
To read the contents (separate paragraphs) of a .doc/.docx file in C#, there are two ways.
The first is to use Office Interop. Office must be installed on the machine, and Microsoft does not recommend using Office applications in server-side scenarios.
The second is to use a third-party component; I used Spire.Doc.

maryjok3698
Newbie Poster
2 posts since Jul 2010
Reputation Points: 10
Solved Threads: 0
 

Was it really necessary to resurrect a thread over a year and a half old just to pitch a .net component?

Ryshad
Nearly a Posting Virtuoso
1,307 posts since Aug 2009
Reputation Points: 512
Solved Threads: 246
 
