I had to do something similar about a week ago and did something very close to what Diamonddrake suggested. However, this method loses all the formatting of the text that was in the .doc file.
Is there any way to preserve that formatting when it's copied to the RichTextBox? If you manually select some text, copy it, and paste it, the text pasted into the RichTextBox usually retains its formatting...
I think it has something to do with GetData(DataFormats.???), but I'm not sure.
jatin24
Junior Poster in Training
75 posts since Aug 2009
Reputation Points: 31
Solved Threads: 21
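One possible answer to the DataFormats question, as a sketch rather than a tested recipe: when Word copies a selection it normally puts a Rich Text Format rendering on the clipboard next to the plain text, and RichTextBox has an Rtf property that accepts such a string. The class and method names below (ClipboardHelper, PasteWithFormatting) are made up for illustration, and the code assumes a Windows Forms project where something has already been copied to the clipboard.

```csharp
using System.Windows.Forms;

static class ClipboardHelper
{
    // Hypothetical helper: fills the box from the clipboard, preferring RTF
    // (which carries fonts, bold, colors) and falling back to plain text.
    public static void PasteWithFormatting(RichTextBox target)
    {
        IDataObject data = Clipboard.GetDataObject();
        if (data == null)
            return; // nothing on the clipboard

        if (data.GetDataPresent(DataFormats.Rtf))
            target.Rtf = (string)data.GetData(DataFormats.Rtf);
        else if (data.GetDataPresent(DataFormats.Text))
            target.Text = (string)data.GetData(DataFormats.Text);
    }
}
```

Clipboard access needs an STA thread, which is the default for a Windows Forms UI thread.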
You can read the contents of any text file into a text box with the StreamReader class:
using (System.IO.StreamReader sr = new System.IO.StreamReader("TestFile.txt"))
{
    RichTextBox1.Text = sr.ReadToEnd();
}
But .doc files will not load as plain text, because they are a special format created by Microsoft Word. If Microsoft Word is installed on the computer that runs your C# app, you can add the Microsoft Word Object Library as a reference and create an object for reading .doc files.
The StreamReader will read most plain-text formats, though.
Is this true? Look at this picture:
[img]http://www.file.si/files/g0f81hvm4fua5fiad6h1.jpg[/img]
It doesn't look like I can get any text out of the .doc file here.
Mitja Bonca
Nearly a Posting Maven
2,485 posts since May 2009
Reputation Points: 641
Solved Threads: 474
As mentioned before, Word files are not plain text: .docx files are ZIP-compressed XML files, and older .doc files use a binary format. So if you read a Word file using the code I posted, it will not show you the text; it will show you the raw binary data rendered as characters.
Sorry. The point of all my posts was to explain that Word uses a special format that requires the Office interop library to read.
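If you want to check what kind of file you actually have before choosing an approach, one option is to look at its first few bytes. This is only a sketch: the byte signatures are the well-known ones for OLE compound files (used by legacy .doc) and for ZIP archives (used by .docx), while the names WordFileSniffer and Describe are invented for this example.

```csharp
using System;
using System.IO;

static class WordFileSniffer
{
    // Leading bytes of an OLE compound file (legacy .doc) and of a ZIP archive (.docx).
    static readonly byte[] OleMagic = { 0xD0, 0xCF, 0x11, 0xE0, 0xA1, 0xB1, 0x1A, 0xE1 };
    static readonly byte[] ZipMagic = { 0x50, 0x4B, 0x03, 0x04 };

    public static string Describe(byte[] header)
    {
        if (StartsWith(header, OleMagic)) return "binary .doc (OLE compound file)";
        if (StartsWith(header, ZipMagic)) return "ZIP container (possibly .docx)";
        return "unknown (maybe plain text)";
    }

    static bool StartsWith(byte[] data, byte[] prefix)
    {
        if (data.Length < prefix.Length) return false;
        for (int i = 0; i < prefix.Length; i++)
            if (data[i] != prefix[i]) return false;
        return true;
    }

    static void Main(string[] args)
    {
        // Pass a file path on the command line to inspect its first 8 bytes.
        if (args.Length == 0) { Console.WriteLine("usage: sniff <file>"); return; }
        byte[] header = new byte[8];
        int read;
        using (FileStream fs = File.OpenRead(args[0]))
            read = fs.Read(header, 0, header.Length);
        Array.Resize(ref header, read);
        Console.WriteLine(Describe(header));
    }
}
```

A ZIP signature alone does not prove the file is a Word document, so treat the result as a hint.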
That means, as you said in one of your posts, that I need to add a reference to the "Microsoft Word Object Library", right? http://www.vbforums.com/showpost.php?p=3114899&postcount=8 - this one?!
I tried to add that reference, but I got an error on the Word reference: a yellow exclamation mark. From what I have read, this happens if SP3 is not installed, or something like that. Right?
EDIT: I got the file that solves that problem. The Word reference no longer shows a yellow exclamation mark. So how do I get the identifier "Word" with this: "using Microsoft.Office.Interop.Word;"? With this I only get "Words".
Mitja Bonca
Where do you get:
Word.ApplicationClass wordApp = new Word.ApplicationClass();
I can only write:
ApplicationClass wordApp = new ApplicationClass();
Mitja Bonca
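The "Word." prefix in other people's snippets most likely comes from a namespace alias rather than a plain using directive. Here is a minimal sketch, assuming the project references the Microsoft Word Object Library and that Word is installed; the class name AliasDemo is made up.

```csharp
using System;
// Alias: lets you write Word.ApplicationClass, Word.Document, and so on.
using Word = Microsoft.Office.Interop.Word;

class AliasDemo
{
    [STAThread]
    static void Main()
    {
        object missing = System.Reflection.Missing.Value;
        Word.ApplicationClass wordApp = new Word.ApplicationClass();
        try
        {
            Console.WriteLine("Word version: " + wordApp.Version);
        }
        finally
        {
            // The cast picks the Quit method rather than the Quit event of the same name.
            ((Word._Application)wordApp).Quit(ref missing, ref missing, ref missing);
        }
    }
}
```

With a plain "using Microsoft.Office.Interop.Word;" the same types are written without the prefix. In newer Visual Studio versions that embed interop types, ApplicationClass is rejected by the compiler and "new Word.Application()" is used instead.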
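// Requires: using System.Windows.Forms; using Microsoft.Office.Interop.Word;
object file = myFullPath;
// Placeholder passed for every optional COM argument we do not set.
object nullobj = System.Reflection.Missing.Value;
ApplicationClass wordApp = new ApplicationClass();
Document doc = wordApp.Documents.Open(
    ref file, ref nullobj, ref nullobj,
    ref nullobj, ref nullobj, ref nullobj,
    ref nullobj, ref nullobj, ref nullobj,
    ref nullobj, ref nullobj, ref nullobj);
// Select the whole document and copy it to the clipboard.
doc.ActiveWindow.Selection.WholeStory();
doc.ActiveWindow.Selection.Copy();
// Read the copied content back as plain text.
IDataObject data = Clipboard.GetDataObject();
string myGetString = data.GetData(DataFormats.Text).ToString();
// Close the document and quit Word so no hidden Word process is left running.
doc.Close(ref nullobj, ref nullobj, ref nullobj);
wordApp.Quit(ref nullobj, ref nullobj, ref nullobj);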
I would like to know what those "ref file" and "ref nullobj" arguments are for.
Mitja Bonca
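On the "ref" question, a short explanation with a sketch: in the Word interop assemblies most arguments of methods like Documents.Open are optional on the COM side and are declared as "ref object". Older C# (before 4.0) has no optional parameters, so every argument must be supplied as a variable passed by reference, and System.Reflection.Missing.Value is the conventional way to say "use the default". The FakeOpen method below is a hypothetical stand-in for such a method, so the example runs without Word.

```csharp
using System;
using System.Reflection;

static class RefDemo
{
    // Hypothetical stand-in for a COM method such as Documents.Open:
    // every argument is a by-reference object, and Missing.Value means "not supplied".
    public static string FakeOpen(ref object fileName, ref object readOnly)
    {
        bool ro = (readOnly is bool) ? (bool)readOnly : false; // default when omitted
        return fileName + " (readOnly=" + ro + ")";
    }

    static void Main()
    {
        object file = @"C:\docs\example.doc"; // must be a variable: 'ref' cannot take a literal
        object nullobj = Missing.Value;        // placeholder for arguments we do not care about
        Console.WriteLine(FakeOpen(ref file, ref nullobj));

        object readOnly = true;                // an argument we do want to set
        Console.WriteLine(FakeOpen(ref file, ref readOnly));
    }
}
```

From C# 4.0 on, the compiler lets you omit optional COM arguments, so the long run of "ref nullobj" is mostly seen in older code.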
If nothing else, and provided the document is a .docx rather than an old binary .doc, rename it to .zip, open it, extract the XML files, and pull out the information with the XML classes. However, using the MS Word references should be much easier...
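The ZIP-and-XML route can be sketched without renaming anything, assuming a .docx file and .NET 4.5 or later (for System.IO.Compression). The main body text of a .docx lives in word/document.xml, with paragraphs in w:p elements and text runs in w:t elements. The names DocxText and Extract are invented for this example, and it ignores headers, footnotes, tables layout, and formatting.

```csharp
using System;
using System.IO;
using System.IO.Compression;
using System.Linq;
using System.Xml.Linq;

static class DocxText
{
    static readonly XNamespace W =
        "http://schemas.openxmlformats.org/wordprocessingml/2006/main";

    // Returns the text of each <w:p> paragraph in word/document.xml, one per line.
    public static string Extract(Stream docx)
    {
        using (var zip = new ZipArchive(docx, ZipArchiveMode.Read, true))
        {
            ZipArchiveEntry entry = zip.GetEntry("word/document.xml");
            if (entry == null)
                throw new InvalidDataException("word/document.xml not found");
            using (Stream xml = entry.Open())
            {
                XDocument doc = XDocument.Load(xml);
                var paragraphs = doc.Descendants(W + "p")
                    .Select(p => string.Concat(p.Descendants(W + "t").Select(t => t.Value)));
                return string.Join(Environment.NewLine, paragraphs);
            }
        }
    }

    static void Main(string[] args)
    {
        if (args.Length == 0) { Console.WriteLine("usage: docxtext <file.docx>"); return; }
        using (FileStream fs = File.OpenRead(args[0]))
            Console.WriteLine(Extract(fs));
    }
}
```

This returns plain text only, so it does not help with keeping the formatting in a RichTextBox.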
I did manage to work it out. It works great. Thx guys.
Mitja Bonca