How can I read the contents (separate paragraphs) of a .doc/.docx file in C#?

Very easy. Go to Project -> Add Reference, open the COM tab, and look for the Microsoft Word X.x Object Library.
Then use the ApplicationClass class. Enjoy.

I have added the reference, but could you elaborate on how to use ApplicationClass?

No, I need you to try using it yourself. I could give you some code and your problem would be solved, but I need you to try to learn. Please try.

Is there a way I could run it in the background, i.e. without visibly opening the Word file?

I have managed to read the contents, but one can clearly notice the Word document opening and closing:

// Fragment from a Windows Forms handler; fileName holds the path to the document.
Microsoft.Office.Interop.Word.Application wordApp = new Microsoft.Office.Interop.Word.ApplicationClass();

object fileNameO = fileName; // e.g. "c:\\sample.doc"
object objFalse = false;
object objTrue = true;
object missing = System.Reflection.Missing.Value;

try
{
    // Arguments 2-4: ConfirmConversions = false, ReadOnly = true, AddToRecentFiles = false.
    // Argument 12: Visible = true.
    Microsoft.Office.Interop.Word.Document aDoc = wordApp.Documents.Open(ref fileNameO, ref objFalse, ref objTrue,
        ref objFalse, ref missing, ref missing, ref missing, ref missing, ref missing, ref missing, ref missing, ref objTrue,
        ref missing, ref missing, ref missing, ref missing);

    // Select the whole document and copy it to the clipboard.
    aDoc.ActiveWindow.Selection.WholeStory();
    aDoc.ActiveWindow.Selection.Copy();

    // Read the copied text back as plain text, then clear the clipboard.
    IDataObject data = System.Windows.Forms.Clipboard.GetDataObject();
    String filetext = data.GetData(System.Windows.Forms.DataFormats.Text).ToString();
    System.Windows.Forms.Clipboard.SetDataObject(string.Empty);
    richTextBox1.Text = filetext;
}
catch (Exception err)
{
    MessageBox.Show(err.Message);
}
finally
{
    // Close every open document, then shut Word down.
    wordApp.Documents.Close(ref missing, ref missing, ref missing);
    wordApp.Application.Quit(ref missing, ref missing, ref missing);
}
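
On the background question: in the snippet above, the twelfth argument to Documents.Open is Visible, and it is passed as true, which is likely why the window flashes up. Below is a minimal sketch of an alternative, assuming C# 4 or later (so the ref/missing arguments can be left out) and a reference to the Word interop assembly. It opens the document hidden and walks doc.Paragraphs instead of going through the clipboard. The class and method names (WordParagraphReader, ReadParagraphs, CleanParagraphText) are made up for illustration and are not part of the original post.

```csharp
using System;
using System.Collections.Generic;
using Word = Microsoft.Office.Interop.Word;

// Hypothetical helper class; the names are illustrative only.
public static class WordParagraphReader
{
    // Range.Text ends with a paragraph mark ('\r'); table cells also end with '\a'.
    public static string CleanParagraphText(string raw)
    {
        return (raw ?? string.Empty).TrimEnd('\r', '\a');
    }

    // Returns one string per paragraph. Pass a full path: Word resolves
    // relative paths against its own working directory, not the caller's.
    public static List<string> ReadParagraphs(string path)
    {
        var result = new List<string>();
        Word.Application app = new Word.Application();
        Word.Document doc = null;
        try
        {
            app.Visible = false; // keep the Word window hidden
            doc = app.Documents.Open(FileName: path, ReadOnly: true, Visible: false);

            foreach (Word.Paragraph paragraph in doc.Paragraphs)
            {
                result.Add(CleanParagraphText(paragraph.Range.Text));
            }
        }
        finally
        {
            // The casts avoid the method/event name ambiguity warning (CS0467).
            if (doc != null) ((Word._Document)doc).Close(SaveChanges: false);
            ((Word._Application)app).Quit();
        }
        return result;
    }
}
```

A call such as WordParagraphReader.ReadParagraphs(@"c:\sample.doc") would then return the paragraphs as a list, which could be joined and assigned to richTextBox1.Text. Word still has to be installed and a WINWORD process is still started. It just should not show a window.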



Look, a .doc file is a binary file, so I think you can't do that without Word. A .docx file, however, follows the OOXML format, so you can work with it directly.
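
To illustrate the .docx point: a .docx file is a ZIP package, and the body text normally lives in the part word/document.xml. The following is a rough sketch, assuming .NET 4.5 or later (for System.IO.Compression.ZipFile; on .NET Framework this needs references to System.IO.Compression and System.IO.Compression.FileSystem). DocxReader and its method names are hypothetical. The sketch ignores tabs, line breaks, headers, footers and footnotes, and text boxes nested inside a paragraph may be reported more than once.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.IO.Compression;
using System.Linq;
using System.Xml.Linq;

// Hypothetical helper class; the names are illustrative only.
public static class DocxReader
{
    // WordprocessingML main namespace (the "w:" prefix in document.xml).
    private static readonly XNamespace W =
        "http://schemas.openxmlformats.org/wordprocessingml/2006/main";

    // Each w:p element is a paragraph; its visible text is split across w:t runs.
    public static List<string> ParagraphsFromXml(XDocument body)
    {
        return body.Descendants(W + "p")
            .Select(p => string.Concat(p.Descendants(W + "t").Select(t => t.Value)))
            .ToList();
    }

    // Opens the .docx as a ZIP archive and parses the main document part.
    public static List<string> ReadParagraphs(string path)
    {
        using (ZipArchive zip = ZipFile.OpenRead(path))
        {
            // Assumption: the main part is at the conventional location.
            ZipArchiveEntry entry = zip.GetEntry("word/document.xml");
            if (entry == null)
            {
                throw new InvalidDataException("word/document.xml not found in package.");
            }
            using (Stream stream = entry.Open())
            {
                return ParagraphsFromXml(XDocument.Load(stream));
            }
        }
    }
}
```

This needs neither Word nor any COM reference, but it only works for .docx. The binary .doc format is not a ZIP package and cannot be read this way.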

Hello,
To read the contents (separate paragraphs) of a .doc/.docx file in C#, there are two ways.
The first is to use Office Interop. Office must be installed on the machine, and Microsoft does not recommend using Office applications in server-side scenarios.
The second is to use a third-party component; I used Spire.Doc.
