Hi guys,
How can I load document formats other than .txt and .rtf (which the LoadFile() function supports natively), such as .java, .cs, and .doc, into a RichTextBox object?

Any help is appreciated. Thanks in advance.

You can read the contents of any text file into a text box with the StreamReader class:

using (System.IO.StreamReader sr = new System.IO.StreamReader("TestFile.txt"))
{
    RichTextBox1.Text = sr.ReadToEnd();
}

However, .doc files will not load as plain text because they use a special format created by Microsoft Word. If Microsoft Word is installed on the computer that runs your C# app, you can add a reference to the Microsoft Word Object Library and create an object for reading .doc files.

The StreamReader will still read most text formats.

Thanks for your reply, but I think there's no need to use the StreamReader class here because richTextBox.LoadFile() supports this inherently.

I don't think this method will load a .doc file, because the file may contain objects that it won't read correctly.

So the problem is still unsolved.

Thanks again

I don't think you caught most of my post. The StreamReader class will let you load text that LoadFile won't.
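To make the difference concrete, here is a minimal sketch (not from the posts above). It assumes a hypothetical helper named TextLoader.LoadInto: .rtf files go through LoadFile as rich text, and everything else (.txt, .java, .cs and so on) is read as plain text. Note that LoadFile(path) on its own expects RTF, so plain text has to be requested explicitly or read some other way.

```csharp
using System;
using System.IO;
using System.Windows.Forms;

static class TextLoader
{
    // Hypothetical helper: RTF is loaded as rich text, anything else as plain text.
    public static void LoadInto(RichTextBox box, string path)
    {
        string ext = Path.GetExtension(path);
        if (string.Equals(ext, ".rtf", StringComparison.OrdinalIgnoreCase))
        {
            box.LoadFile(path, RichTextBoxStreamType.RichText);
        }
        else
        {
            // Fine for source files such as .java or .cs.
            // A binary .doc would only show up as unreadable characters here.
            box.Text = File.ReadAllText(path);
        }
    }
}
```

Usage would be something like TextLoader.LoadInto(richTextBox1, f.FileName) after an OpenFileDialog returns OK.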

.DOC is a proprietary format!
This means that to edit it you need access to the software it belongs to. Microsoft lets you call into its Office suite from other applications through interop.

I.e., if you have Microsoft Office installed on the computer you run your C# application on, then you can open .doc files; otherwise you cannot!

Assuming you have Microsoft Office installed on your development machine, you must first add a reference to the Microsoft Word Object Library, then use this code:

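// Needs this at the top of the file: using Word = Microsoft.Office.Interop.Word;
// Word.ApplicationClass gives access to the Word application.
Word.ApplicationClass wordApp = new Word.ApplicationClass();

object file = path;
object nullobj = System.Reflection.Missing.Value;

Word.Document doc = wordApp.Documents.Open(
    ref file, ref nullobj, ref nullobj,
    ref nullobj, ref nullobj, ref nullobj,
    ref nullobj, ref nullobj, ref nullobj,
    ref nullobj, ref nullobj, ref nullobj);

// Select the whole document and copy it to the clipboard.
doc.ActiveWindow.Selection.WholeStory();
doc.ActiveWindow.Selection.Copy();

// Read the copied content back as plain text.
IDataObject data = Clipboard.GetDataObject();
txtFileContent.Text = data.GetData(DataFormats.Text).ToString();

// Close takes three optional ref arguments, like Open.
doc.Close(ref nullobj, ref nullobj, ref nullobj);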

NOTE: .doc is a binary format (an OLE compound file); it is the newer .docx format that is a ZIP-compressed package of XML files. Either way, getting it right manually would be very hard because there are possibly hundreds of special formatting commands. So even though you could technically parse the document files manually, you would spend months getting it to work right. Better to just target machines with MS Word.

I had to do something similar about a week ago and did something very similar to what Diamonddrake suggested. However, this method loses all the formatting of the text that was in the .doc file.

Is there any way to preserve that formatting when it's copied to the RichTextBox? If you manually select text, copy, and paste, the text pasted into the RichTextBox usually retains the formatting...
I think it has something to do with GetData(DataFormats.???), but I'm not sure.
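One possible answer, as a sketch only: ask the clipboard for DataFormats.Rtf instead of DataFormats.Text and assign the result to the Rtf property of the RichTextBox. This assumes Word puts an RTF rendering on the clipboard when it copies (it normally does, but the code checks first). ClipboardRtf.PasteFormatted is a hypothetical helper name, not something from the posts above.

```csharp
using System.Windows.Forms;

static class ClipboardRtf
{
    // Hypothetical helper: call it on the UI (STA) thread after Selection.Copy().
    // Returns true if formatted text was applied, false otherwise.
    public static bool PasteFormatted(RichTextBox box)
    {
        IDataObject data = Clipboard.GetDataObject();
        if (data == null)
        {
            return false;
        }
        if (data.GetDataPresent(DataFormats.Rtf))
        {
            // The Rtf property keeps fonts, colors and styles.
            box.Rtf = (string)data.GetData(DataFormats.Rtf);
            return true;
        }
        if (data.GetDataPresent(DataFormats.Text))
        {
            // Fall back to plain text when no RTF rendering is available.
            box.Text = (string)data.GetData(DataFormats.Text);
        }
        return false;
    }
}
```

Calling richTextBox1.Paste() right after the copy should have a similar effect, since it pastes whatever rich format the clipboard holds.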

Well, you can use this in a button click event:

OpenFileDialog f = new OpenFileDialog();
f.Title = "open file as..";
f.Filter = "Doc Files|*.doc|Java Files|*.java|C# Files|*.cs|All Files|*.*"; // add further formats in the same way
DialogResult dr = f.ShowDialog();
if (dr == DialogResult.OK)
{
    s1 = f.FileName; // s1 is assumed to be a string field of the form
    // LoadFile(path) alone accepts only RTF, so request plain text explicitly.
    richTextBox1.LoadFile(s1, RichTextBoxStreamType.PlainText);
    open = true; // open is assumed to be a bool field of the form
}

With this you can load other document formats as well.

This will work for .cs files and other plain-text files (provided they are loaded as plain text rather than RTF), but .doc is a proprietary format. It WILL NOT WORK for Microsoft Word documents. (Naming a text file file.doc does not make it a .doc file; it just appears that way. True .doc files are binary Word documents, and the RichTextBox class will not parse them.)
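If you want to tell a real Word document from a renamed text file before loading it, one rough approach is to look at the first bytes of the file. The sketch below relies on two well-known signatures: D0 CF 11 E0 A1 B1 1A E1 for OLE compound files (which binary .doc files are) and 50 4B 03 04 for ZIP packages (which .docx files are). FileSniffer.Sniff and its return values are hypothetical names chosen for illustration.

```csharp
using System;
using System.IO;

static class FileSniffer
{
    // Returns "ole" for an OLE compound file (e.g. a binary .doc),
    // "zip" for a ZIP package (e.g. a .docx), and "other" for anything else.
    public static string Sniff(string path)
    {
        byte[] head = new byte[8];
        int read;
        using (FileStream fs = File.OpenRead(path))
        {
            read = fs.Read(head, 0, head.Length);
        }

        byte[] ole = { 0xD0, 0xCF, 0x11, 0xE0, 0xA1, 0xB1, 0x1A, 0xE1 };
        byte[] zip = { 0x50, 0x4B, 0x03, 0x04 };

        if (StartsWith(head, read, ole)) return "ole";
        if (StartsWith(head, read, zip)) return "zip";
        return "other";
    }

    static bool StartsWith(byte[] data, int length, byte[] prefix)
    {
        if (length < prefix.Length) return false;
        for (int i = 0; i < prefix.Length; i++)
        {
            if (data[i] != prefix[i]) return false;
        }
        return true;
    }
}
```

A file that comes back as "other" is a reasonable candidate for loading as plain text; "ole" only tells you the container type, so other OLE-based files (old .xls, for instance) match too.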

Is it true that the StreamReader will read most formats? Look at this picture:
[img]http://www.file.si/files/g0f81hvm4fua5fiad6h1.jpg[/img]

It doesn't look like I can get any text out of a .doc file here.

Edited 7 Years Ago by Mitja Bonca: n/a

As mentioned before, .doc is a special binary format (the ZIP-compressed XML packaging belongs to the newer .docx). So if you read the data of a .doc file using the StreamReader code I posted, it will not show you the text; it will show you the raw binary data interpreted as characters.

Sorry. The point of all my posts was to explain that Word uses a special format that requires Office interop to read.

That means, as you said in one of your posts, that I need to add a reference to the "Microsoft Word Object Library", right?
http://www.vbforums.com/showpost.php?p=3114899&postcount=8 - this one?!
I was trying to add that reference, but I got an error on the Word reference: a yellow exclamation mark. From what I read, this happens if I don't have SP3 installed, or something. Right?

EDIT: I got the file which solves that problem. The Word reference no longer shows a yellow exclamation mark. So how do I get that "Word" prefix with this: "using Microsoft.Office.Interop.Word;"? With this I only get "Words".
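For what it's worth, the "Word." prefix usually comes from a using alias rather than a plain using directive. A minimal sketch, assuming the Word interop reference is already added, Word is installed, and the project uses C# 4 or later (so the optional arguments of Quit can be omitted); WordAliasDemo and GetWordVersion are hypothetical names:

```csharp
// The alias makes "Word." a prefix for every type in the interop namespace.
using Word = Microsoft.Office.Interop.Word;

static class WordAliasDemo
{
    // Starts Word, reads its version string, and shuts it down again.
    public static string GetWordVersion()
    {
        Word._Application wordApp = new Word.Application();
        try
        {
            return wordApp.Version;
        }
        finally
        {
            // Without Quit, a hidden WINWORD.EXE process keeps running.
            wordApp.Quit();
        }
    }
}
```

With a plain "using Microsoft.Office.Interop.Word;" the same types are available without any prefix, which is why ApplicationClass compiles but Word.ApplicationClass does not.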

Where do you get:

Word.ApplicationClass wordApp = new Word.ApplicationClass();

I can only have:

ApplicationClass wordApp = new ApplicationClass();

object file = myFullPath;
object nullobj = System.Reflection.Missing.Value;
ApplicationClass wordApp = new ApplicationClass();
Document doc = wordApp.Documents.Open(
    ref file, ref nullobj, ref nullobj,
    ref nullobj, ref nullobj, ref nullobj,
    ref nullobj, ref nullobj, ref nullobj,
    ref nullobj, ref nullobj, ref nullobj);

doc.ActiveWindow.Selection.WholeStory();
doc.ActiveWindow.Selection.Copy();
IDataObject data = Clipboard.GetDataObject();
String myGetString = data.GetData(DataFormats.Text).ToString();
//doc.Close();

I would like to know what those ref file and ref nullobj arguments are for.

Essentially, the method called in that library expects a set of object references, most of them optional settings of the MS Word application. To use the method you have to match its parameter list, and since you don't have values for the optional ones, you pass a placeholder instead.

The first parameter needs to be a reference to an object holding the file path string. nullobj holds System.Reflection.Missing.Value, which tells the method to use its default for that parameter. Because the parameters are declared ref, you cannot pass a literal such as null directly; the method needs a variable, so you create one object and pass it wherever you have no value.

ApplicationClass creates an instance of the Word application itself, the same object Word uses to host its editing window, so now you have an object that you can grab the text from. That's how it works.

If you are wondering how we know what values to pass, that comes from the Microsoft Word documentation. Since it's not a managed DLL, it doesn't share that information itself. When Microsoft created the COM object responsible, they also wrote documentation for it. The original author of the code above got it from there, but that wasn't me.
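As a side note, with C# 4 or later you can usually leave the placeholders out, because the compiler fills in Missing.Value for omitted optional COM parameters and lets you drop ref on COM interface calls. The sketch below assumes that compiler version, a reference to the Word interop library, and Word installed on the machine. WordTextReader.ReadAllText is a hypothetical helper, and it reads the text through doc.Content.Text instead of going through the clipboard.

```csharp
using Word = Microsoft.Office.Interop.Word;

static class WordTextReader
{
    // Hypothetical helper: returns the plain text of a Word document.
    public static string ReadAllText(string fullPath)
    {
        Word._Application wordApp = new Word.Application();
        try
        {
            // Parameters not named here receive Missing.Value automatically.
            Word._Document doc = wordApp.Documents.Open(fullPath, ReadOnly: true);
            try
            {
                // Content is the main text range of the document.
                return doc.Content.Text;
            }
            finally
            {
                doc.Close(SaveChanges: false);
            }
        }
        finally
        {
            // Shut Word down so no hidden WINWORD.EXE process is left behind.
            wordApp.Quit();
        }
    }
}
```

The variables are typed as Word._Application and Word._Document on purpose: on the Application and Document interfaces, Quit and Close also exist as events, which makes the compiler warn about ambiguity.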

If nothing else, and if the document is a .docx (the ZIP-based format; this does not work for binary .doc files), rename it to .zip, open it, extract the XML files, and you can extract the information with the XML classes. However, using the MS Word references should be much easier...
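A rough sketch of that ZIP route, without the renaming step. It assumes .NET 4.5 or later (references to System.IO.Compression and System.IO.Compression.FileSystem) and a .docx file, where the body text lives in word/document.xml. DocxTextExtractor is a hypothetical name, and the code keeps only paragraph text and drops all formatting.

```csharp
using System.IO;
using System.IO.Compression;
using System.Text;
using System.Xml.Linq;

static class DocxTextExtractor
{
    // WordprocessingML main namespace used by the w:p and w:t elements.
    static readonly XNamespace W =
        "http://schemas.openxmlformats.org/wordprocessingml/2006/main";

    // Returns the text of each paragraph, one paragraph per line.
    public static string ExtractText(string docxPath)
    {
        using (ZipArchive zip = ZipFile.OpenRead(docxPath))
        {
            ZipArchiveEntry entry = zip.GetEntry("word/document.xml");
            if (entry == null)
            {
                throw new InvalidDataException("word/document.xml not found; not a .docx package.");
            }
            using (Stream stream = entry.Open())
            {
                XDocument xml = XDocument.Load(stream);
                StringBuilder sb = new StringBuilder();
                foreach (XElement paragraph in xml.Descendants(W + "p"))
                {
                    // A paragraph's text is split across one or more w:t runs.
                    foreach (XElement text in paragraph.Descendants(W + "t"))
                    {
                        sb.Append(text.Value);
                    }
                    sb.AppendLine();
                }
                return sb.ToString();
            }
        }
    }
}
```

Something like richTextBox1.Text = DocxTextExtractor.ExtractText(path) would then show the text. Tables, headers, footnotes and text boxes need more work, which is why the Word route stays easier for anything beyond plain text.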

I did manage to work it out. It works great. Thx guys.
