Split a table from a Word document into an array

jbisono

Hi, Guys.

I have an issue reading a Word document. I can read the document itself, but I need to split some of its data into an array. The data I want is inside a table in Word. Is there an easy way to split that table data into an array? My big problem is that the structure of what I read is not consistent: sometimes I receive every row as separate lines, and other times I get the data laid out as it is in the columns. In short, the result I get from Word is not always the same, so I cannot work with it. All I know is that the data I want is in a table.

I know Office 2003 saves documents in a binary format while 2007 saves them as OOXML. Does that make it any easier to work with? I have no problem converting all the files to 2007 and working with that.

I hope you understand what my issue is. Thanks.
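
On the OOXML question: a .docx file is a ZIP package whose main text lives in word/document.xml, where a table is a w:tbl element holding w:tr rows and w:tc cells. A minimal sketch of reading the first table that way, without starting Word, might look like the following. The names DocxTables and ReadFirstTable are made up for illustration, and the sketch assumes .NET 4.5 or later with references to System.IO.Compression and System.IO.Compression.FileSystem. It only sees rows that are direct children of the table, so rows wrapped in content controls would be missed.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.IO.Compression;
using System.Linq;
using System.Xml.Linq;

static class DocxTables
{
    // WordprocessingML main namespace used by .docx files.
    static readonly XNamespace W =
        "http://schemas.openxmlformats.org/wordprocessingml/2006/main";

    // Returns the first table in the document as rows of cell text,
    // or an empty list if the document contains no table.
    public static List<string[]> ReadFirstTable(string docxPath)
    {
        using (ZipArchive zip = ZipFile.OpenRead(docxPath))
        {
            ZipArchiveEntry entry = zip.GetEntry("word/document.xml");
            if (entry == null) return new List<string[]>();

            XDocument xml;
            using (Stream s = entry.Open())
                xml = XDocument.Load(s);

            XElement table = xml.Descendants(W + "tbl").FirstOrDefault();
            if (table == null) return new List<string[]>();

            // Elements() takes direct children only, so the rows of a
            // nested table are not mixed into the outer table's rows.
            return table.Elements(W + "tr")
                .Select(tr => tr.Elements(W + "tc").Select(CellText).ToArray())
                .ToList();
        }
    }

    // Joins the paragraphs of one cell with newlines.
    static string CellText(XElement cell)
    {
        return string.Join("\n", cell.Elements(W + "p")
            .Select(p => string.Concat(p.Descendants(W + "t").Select(t => t.Value))));
    }
}
```

A call such as DocxTables.ReadFirstTable(@"C:\docs\sample.docx") (the path is a placeholder) would return one string[] per row, so the result has the same shape for every file regardless of how Word renders the table as text.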

serkan sendur

I think you need to provide some code; we need to know what library you use to work with the Word document.
If you are able to create an MS Word table using C#, you can read from a table too.
http://msdn.microsoft.com/en-us/library/aa192483.aspx

jbisono

Thanks, Serkan.
Based on your link, I came up with this code, which reads the table the way I want. Thanks again.

// Requires a reference to Microsoft.Office.Interop.Word.
Microsoft.Office.Interop.Word.Application app = new Microsoft.Office.Interop.Word.ApplicationClass();
app.Visible = false;
object nullobj = System.Reflection.Missing.Value;
object file = path;
Microsoft.Office.Interop.Word.Document doc = app.Documents.Open(
    ref file, ref nullobj, ref nullobj,
    ref nullobj, ref nullobj, ref nullobj,
    ref nullobj, ref nullobj, ref nullobj,
    ref nullobj, ref nullobj, ref nullobj,
    ref nullobj, ref nullobj, ref nullobj, ref nullobj);
// Word collections are 1-based, so Tables[1] is the first table.
Microsoft.Office.Interop.Word.Table tbl = doc.Tables[1];
// With this I go through every single row; in my case the tables always have 5 columns.
for (int i = 1; i <= tbl.Rows.Count; i++)
{
    for (int a = 1; a <= 5; a++)
    {
        // Cell text ends with an end-of-cell marker ("\r\a"), so trim it off.
        textBox1.Text += tbl.Cell(i, a).Range.Text.TrimEnd('\r', '\a') + Environment.NewLine;
    }
}
// Close the document without saving and shut down the hidden Word instance.
object noSave = Microsoft.Office.Interop.Word.WdSaveOptions.wdDoNotSaveChanges;
((Microsoft.Office.Interop.Word._Document)doc).Close(ref noSave, ref nullobj, ref nullobj);
((Microsoft.Office.Interop.Word._Application)app).Quit(ref noSave, ref nullobj, ref nullobj);

But now I do not know why, when I open a document in Microsoft Word, I can see there is a table, but the program does not recognize it. This only happens with some Word documents, not all. The thing is that I have thousands of Word documents to go through to get the information I want.

Anyway, you did answer my first question, but I would like to hear any opinions about my second issue.

Regards, and thanks again.

serkan sendur

If you are searching for a string in some Word documents, I think it is better to treat them as text files first instead of Word documents. Open them as text files and search for the text you are looking for; if you find it, then open the document as a Word document to get the information you want. This way your search will be a lot faster.

jbisono

Yes, I understand your point, but the problem is that I do not have any specific parameter to search for in the documents; the data could be anything. Like I said before, the only thing I know is that the data I want is in a table, but the documents have different structures. What I am thinking of doing is going through every Word document, checking whether it has a table, getting my data, and moving that document to another location. Then I can see how many Word documents are left and enter their data into the database manually.
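
The plan described above could be sketched roughly as follows. This is only an outline under assumptions: BatchSorter, readTable, and save are hypothetical names, readTable stands for whatever code extracts the table rows from one document (returning an empty list when no table is found), and save stands for the database insert.

```csharp
using System;
using System.Collections.Generic;
using System.IO;

static class BatchSorter
{
    // Processes every Word file in sourceDir. Files whose table could be
    // read are saved and moved to doneDir; everything else stays behind
    // for manual entry. Returns the number of files moved.
    public static int MoveDocumentsWithTables(
        string sourceDir,
        string doneDir,
        Func<string, List<string[]>> readTable,   // hypothetical table reader
        Action<string, List<string[]>> save)      // hypothetical database writer
    {
        Directory.CreateDirectory(doneDir);
        int moved = 0;

        // GetFiles returns an array, so moving files inside the loop is safe.
        foreach (string path in Directory.GetFiles(sourceDir, "*.doc*"))
        {
            List<string[]> rows;
            try
            {
                rows = readTable(path);
            }
            catch (Exception ex)
            {
                // A document that cannot be read is left for manual handling.
                Console.Error.WriteLine("Skipped {0}: {1}", path, ex.Message);
                continue;
            }

            if (rows == null || rows.Count == 0)
                continue; // no table found, leave the file where it is

            save(path, rows);
            File.Move(path, Path.Combine(doneDir, Path.GetFileName(path)));
            moved++;
        }
        return moved;
    }
}
```

Whatever remains in the source folder after a run is the set of documents that still need manual attention. File.Move throws if a file with the same name already exists in the target folder, so the target should start out empty.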

Thanks again for your help.

serkan sendur

So you don't know what you are searching for? How are you going to search, then?

jbisono

Well, what I mean is that I do not have a specific string to search for. All the documents have data, or at least the header. What I do know is that in the table, column 1 = Drawing Revision, for example, column 2 = Paper Revision, and so on. That is why I am interested in going over every single row and column, because that makes it easier for me to retrieve the values. Sorry if I am not explaining this issue correctly, but that is it. Unluckily for me, there is no consistency across the Word documents, which makes reading them automatically kind of difficult.
Thanks.
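
Since the header names are the one known thing, one option is to look values up by header text instead of by fixed column position. The sketch below assumes the table has already been read into string arrays with the header as the first row; HeaderMap and its methods are illustrative names, not part of any library.

```csharp
using System;
using System.Collections.Generic;

static class HeaderMap
{
    // Maps each header name in the header row to its zero-based column index.
    // Names are compared case-insensitively and surrounding spaces are ignored.
    public static Dictionary<string, int> FromHeaderRow(string[] header)
    {
        var map = new Dictionary<string, int>(StringComparer.OrdinalIgnoreCase);
        for (int i = 0; i < header.Length; i++)
        {
            string name = header[i].Trim();
            // Keep the first occurrence if a header name is repeated.
            if (name.Length > 0 && !map.ContainsKey(name))
                map[name] = i;
        }
        return map;
    }

    // Returns the value under columnName for one data row,
    // or null if the column is unknown or the row is too short.
    public static string Get(string[] row, Dictionary<string, int> map, string columnName)
    {
        int index;
        if (!map.TryGetValue(columnName, out index) || index >= row.Length)
            return null;
        return row[index].Trim();
    }
}
```

With a map built from the header row, HeaderMap.Get(row, map, "Drawing Revision") returns the right cell even in a document where the columns appear in a different order, and returns null where the column is missing so that file can be set aside.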

serkan sendur

As for your second question, all I can think of is that there could be some hidden tables in the document, so when you index into the table collection you may get the wrong one. That may be why, although you see the table visually, your code accesses an empty one. Make sure that your document contains only that particular table, with no empty or invisible ones.
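
One way to check that guess is to list every table Word reports for a problem document before picking one by index. The following is a diagnostic sketch only, assuming a reference to Microsoft.Office.Interop.Word and an already opened document; TableDiagnostics and ListTables are made-up names. It also walks the other story ranges, because as far as I know doc.Tables covers only the main text, so a table sitting in a header, footer, or text box would not show up there.

```csharp
using System;
using Word = Microsoft.Office.Interop.Word;

static class TableDiagnostics
{
    // Prints one line per table found in any story of the document
    // (main text, headers, footers, text boxes, and so on).
    public static void ListTables(Word.Document doc)
    {
        foreach (Word.Range story in doc.StoryRanges)
        {
            // Stories of the same type are chained through NextStoryRange.
            for (Word.Range r = story; r != null; r = r.NextStoryRange)
            {
                // Word collections are 1-based.
                for (int i = 1; i <= r.Tables.Count; i++)
                {
                    Word.Table tbl = r.Tables[i];

                    // Strip paragraph and end-of-cell markers to see
                    // whether the table holds any real text.
                    string text = tbl.Range.Text
                        .Replace("\r", " ").Replace("\a", "").Trim();

                    Console.WriteLine(
                        "{0} table {1}: {2} rows x {3} columns, nested tables: {4}, empty: {5}",
                        r.StoryType, i, tbl.Rows.Count, tbl.Columns.Count,
                        tbl.Tables.Count, text.Length == 0);
                }
            }
        }
    }
}
```

If a document shows a table on screen but this prints nothing for it, the "table" may not be a Word table at all (for example tab-aligned text or an embedded object), which would need a different extraction approach.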

jbisono

Ok, thanks for your help.

Question Answered as of 4 Years Ago by serkan sendur