Hi,

I am using the following code to extract text from a .doc file.

Code:

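string text;
// The using statements close the stream and reader even if an exception is thrown.
using (FileStream fileStream = new FileStream("F:\\Resume_Rajib_Ghosal.doc", FileMode.Open, FileAccess.Read, FileShare.None))
using (StreamReader srd = new StreamReader(fileStream))
{
    // ReadToEnd reads the whole stream in one call. The earlier Read() loop
    // consumed and discarded the first character before ReadToEnd ran.
    // Note: this still returns the raw bytes of the binary .doc file decoded as text.
    text = srd.ReadToEnd();
}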

But after extracting, when I search the text for keywords such as xml, hidden, control, form, html and so on, it does not work properly. In the original file, a search for the keyword xml finds nothing, but in the extracted text the same search returns multiple matches. The extracted text is also longer than the content of the .doc file, and it contains invalid characters such as � � 8��i �i �BN�� .

Please suggest a better solution.

Thanks in advance.

Pankaj

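You need to use the Office Interop API to read a Word document. A .doc file is a binary format, so StreamReader returns its raw bytes, including formatting and metadata, rather than just the visible text.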
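
As a rough illustration of the Interop suggestion, here is a minimal sketch. It assumes Microsoft Word is installed on the machine, the project references the Microsoft.Office.Interop.Word assembly, and the compiler is C# 4 or later (for named and optional COM arguments). The class name DocTextExtractor and the helper ReadDocText are made up for this example; the file path is the one from the question.

```csharp
using System;
using System.Runtime.InteropServices;
using Word = Microsoft.Office.Interop.Word;

class DocTextExtractor
{
    // Hypothetical helper: opens the document in Word and returns its text.
    static string ReadDocText(string path)
    {
        // Typed as _Application / _Document so that Quit and Close resolve to
        // the methods rather than the events of the same name.
        Word._Application app = new Word.Application();
        Word._Document doc = null;
        try
        {
            // Open read-only and without showing a window.
            doc = app.Documents.Open(path, ReadOnly: true, Visible: false);

            // Content is the main document body; Text is its plain text.
            return doc.Content.Text;
        }
        finally
        {
            // Always close the document and quit Word, otherwise a
            // WINWORD.EXE process stays running in the background.
            if (doc != null)
            {
                doc.Close(SaveChanges: false);
            }
            app.Quit(SaveChanges: false);
            Marshal.ReleaseComObject(app);
        }
    }

    static void Main()
    {
        string text = ReadDocText(@"F:\Resume_Rajib_Ghosal.doc");

        // Case-insensitive keyword search on the extracted text.
        bool hasXml = text.IndexOf("xml", StringComparison.OrdinalIgnoreCase) >= 0;
        Console.WriteLine("Contains 'xml': " + hasXml);
    }
}
```

Because the text comes from Word itself rather than from the raw file bytes, keyword searches should only match what is actually in the document body. Interop automates a real Word instance, so it is generally not a good fit for unattended server-side code.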
