Split a table from a Word document into an array

Hi, guys.

I'm having an issue reading a Word document. I can read the document itself, but I need to split some of its data into an array. The data I want is in a table, so is there an easy way to get that table's data into an array? My big problem is that the structure of what I read isn't consistent: sometimes every row comes back as separate lines, and other times it comes back in columns, as it appears in the document. In short, the result I get from Word isn't always the same, so I can't work with it reliably. All I know is that the data I want is in a table.

I know Office 2003 saves in a binary format while Office 2007 saves as OOXML. If that makes things any easier, I have no problem converting all the files to 2007 and working with that.
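For the OOXML route, here is a minimal sketch (not anything posted in this thread) that reads the tables of a .docx without starting Word. It assumes .NET 4.5 or later, that the main document part sits at `word/document.xml` (where Word normally writes it), and that the tables are direct children of the document body. The class and method names are made up for illustration.

```csharp
using System;
using System.IO;
using System.IO.Compression; // reference System.IO.Compression and System.IO.Compression.FileSystem
using System.Linq;
using System.Xml.Linq;

static class DocxTables // hypothetical helper, not part of any library
{
    // Main WordprocessingML namespace used inside word/document.xml.
    static readonly XNamespace W =
        "http://schemas.openxmlformats.org/wordprocessingml/2006/main";

    // Returns tables[t][r][c]: table t, row r, cell c (all zero-based).
    public static string[][][] ReadTables(string docxPath)
    {
        using (ZipArchive zip = ZipFile.OpenRead(docxPath))
        {
            // Assumption: the main part is stored at the conventional location.
            ZipArchiveEntry entry = zip.GetEntry("word/document.xml");
            if (entry == null)
                throw new InvalidDataException("word/document.xml not found in " + docxPath);

            using (Stream stream = entry.Open())
            {
                XElement body = XDocument.Load(stream).Root.Element(W + "body");

                // Only top-level tables (w:tbl directly under w:body) are returned.
                return body.Elements(W + "tbl")
                    .Select(tbl => tbl.Elements(W + "tr")
                        .Select(tr => tr.Elements(W + "tc")
                            .Select(tc => CellText(tc))
                            .ToArray())
                        .ToArray())
                    .ToArray();
            }
        }
    }

    // Joins the paragraphs of a cell with newlines; each paragraph is the
    // concatenation of its w:t text runs.
    static string CellText(XElement cell)
    {
        return string.Join("\n", cell.Elements(W + "p")
            .Select(p => string.Concat(p.Descendants(W + "t").Select(t => t.Value))));
    }
}
```

With this, `DocxTables.ReadTables(path)[0][1][0]` would be the first cell of the second row of the first table. Tables wrapped in content controls or sitting in text boxes are not direct children of the body, so this sketch would miss them.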

I hope you understand what my issue is. Thanks.

jbisono

I think you need to post some code; we need to know which library you are using to work with the Word document.
If you are able to create an MS Word table using C#, you can read from a table too.
http://msdn.microsoft.com/en-us/library/aa192483.aspx

serkan sendur

Thanks, Serkan.
Based on your link, I came up with this code, which reads the table the way I want. Thanks again.

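// Needs a reference to Microsoft.Office.Interop.Word; Clipboard and IDataObject come from System.Windows.Forms.
Microsoft.Office.Interop.Word.Application app = new Microsoft.Office.Interop.Word.ApplicationClass();
app.Visible = false;
object nullobj = System.Reflection.Missing.Value;
object file = path;
Microsoft.Office.Interop.Word.Document doc = app.Documents.Open(
    ref file, ref nullobj, ref nullobj,
    ref nullobj, ref nullobj, ref nullobj,
    ref nullobj, ref nullobj, ref nullobj,
    ref nullobj, ref nullobj, ref nullobj,
    ref nullobj, ref nullobj, ref nullobj, ref nullobj);
Microsoft.Office.Interop.Word.Table tbl = doc.Tables[1]; // Word collections are 1-based
// Copy the whole document text through the clipboard (not needed for reading the table itself).
doc.ActiveWindow.Selection.WholeStory();
doc.ActiveWindow.Selection.Copy();
IDataObject data = Clipboard.GetDataObject();
string text = data.GetData(DataFormats.Text).ToString();
// Visit every cell of every row; in my case the tables always have 5 columns.
// Indexes start at 1, so the loops use <= to include the last row and the fifth column.
for (int i = 1; i <= tbl.Rows.Count; i++)
{
    for (int a = 1; a <= 5; a++)
    {
        // Cell text ends with "\r\a" (paragraph mark plus end-of-cell mark); trim it off.
        textBox1.Text += tbl.Cell(i, a).Range.Text.TrimEnd('\r', '\a') + Environment.NewLine;
    }
}
// Close the document and quit Word so no hidden WINWORD.EXE is left running.
doc.Close(ref nullobj, ref nullobj, ref nullobj);
app.Quit(ref nullobj, ref nullobj, ref nullobj);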
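
Since the original question asked for an array rather than text in a text box, here is a rough sketch of copying one table into a `string[,]`. The helper names are invented for illustration. It assumes that enumerating `tbl.Range.Cells` is acceptable for these documents; I have not checked how it behaves with nested tables. Merged cells simply leave `null` entries in the array.

```csharp
using Word = Microsoft.Office.Interop.Word;

static class WordTableReader // hypothetical helper
{
    // Copies a Word table into a zero-based string[rows, cols] array.
    public static string[,] ToArray(Word.Table tbl)
    {
        int rows = tbl.Rows.Count;
        int cols = tbl.Columns.Count;
        var result = new string[rows, cols];

        // Enumerating the cells that actually exist avoids the error that
        // tbl.Cell(row, col) raises when a merged cell makes a position missing.
        foreach (Word.Cell cell in tbl.Range.Cells)
        {
            int r = cell.RowIndex - 1;     // Word indexes are 1-based
            int c = cell.ColumnIndex - 1;
            if (r < rows && c < cols)
                result[r, c] = Clean(cell.Range.Text);
        }
        return result;
    }

    // Word ends each cell's text with "\r\a" (paragraph mark + end-of-cell mark).
    public static string Clean(string cellText)
    {
        return cellText.TrimEnd('\r', '\a');
    }
}
```

Usage would be something like `string[,] cells = WordTableReader.ToArray(doc.Tables[1]);`, after which `cells[0, 0]` holds the first header cell.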

But now there is something I don't understand: when I open a document in Microsoft Word I can see there is a table, but the program doesn't recognize it. This happens only with some Word documents, not all of them. The thing is, I have thousands of Word documents to go through to get the info I want.

Anyway, you did answer my first question, but I'd like to hear any opinions on this second issue.

Regards, and thanks again.

jbisono

If you are searching for a string in some Word documents, I think it is better to treat them as text files first. Open them as text files and search for the text you are looking for; if you find it, then open the file as a Word document to get the info you want. That way your search will be a lot faster.

serkan sendur

Yes, I understand your point, but the problem is that I don't have any specific term to search for in the documents; the data could be anything. As I said before, the only thing I know is that the data I want is in a table, and the documents have different structures. What I'm thinking of doing is going through every Word document, checking whether it has a table, extracting my data if it does, and moving that document to another location. Then I'll see how many Word documents are left and enter their data into the database manually.
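
A rough sketch of that triage pass, assuming C# 4 or later (so the COM calls can use named arguments) and one shared Word instance for the whole batch. The class, method, and folder parameter names are hypothetical, and the actual data extraction is left as a comment.

```csharp
using System;
using System.IO;
using Word = Microsoft.Office.Interop.Word;

static class TableTriage // hypothetical helper
{
    // Moves every Word file in sourceDir that contains at least one table
    // into doneDir; whatever stays behind needs manual handling.
    public static void MoveDocumentsWithTables(string sourceDir, string doneDir)
    {
        Directory.CreateDirectory(doneDir);
        var app = new Word.Application();
        try
        {
            foreach (string path in Directory.GetFiles(sourceDir, "*.doc*"))
            {
                string name = Path.GetFileName(path);
                if (name.StartsWith("~$")) continue; // skip Word's lock files

                Word.Document doc = app.Documents.Open(
                    FileName: path, ReadOnly: true, AddToRecentFiles: false);
                bool hasTable;
                try
                {
                    hasTable = doc.Tables.Count > 0;
                    // ... read the table data here while the document is open ...
                }
                finally
                {
                    // Cast to _Document to call the Close method rather than the Close event.
                    ((Word._Document)doc).Close(
                        SaveChanges: Word.WdSaveOptions.wdDoNotSaveChanges);
                }

                // Move only after the document is closed, or the file is still locked.
                if (hasTable)
                    File.Move(path, Path.Combine(doneDir, name));
            }
        }
        finally
        {
            ((Word._Application)app).Quit(
                SaveChanges: Word.WdSaveOptions.wdDoNotSaveChanges);
        }
    }
}
```

`File.Move` throws if a file with the same name already exists in the target folder, so the target should start out empty or the call needs its own handling.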

Thanks again for your help.

jbisono

So you don't know what you are searching for? How are you going to search, then?

serkan sendur

Well, what I mean is that I don't have a specific string to search for. All the documents have data, or at least the header. What I do know is the table layout: for example, column 1 = Drawing Revision, column 2 = Paper Revision, and so on. That is why I want to go over every single row and column, because that makes it easier for me to retrieve the values. Sorry if I'm not explaining this issue well, but that's it. Unluckily for me, there is no consistency across the Word documents, which makes reading them automatically kind of difficult.
Thanks.

jbisono

As for your second question, all I can think of is that there could be some hidden tables in the document, so when you index into the table collection you may get the wrong one. That may be why your code ends up on an empty table even though you can see the real one visually. Make sure the document contains only that particular table, with no empty or invisible ones.
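
Instead of relying on `doc.Tables[1]`, one option is to look for the table by its known header. The sketch below checks the first cell of every table, including nested ones, for the "Drawing Revision" header mentioned earlier in the thread. The helper name is made up, and it rests on my understanding that `Document.Tables` lists only top-level tables in the main body while nested tables are reached through `Table.Tables`; tables inside text boxes or headers and footers would still be missed.

```csharp
using System;
using Word = Microsoft.Office.Interop.Word;

static class TableFinder // hypothetical helper
{
    // Returns the first table (top-level or nested) whose first cell starts
    // with headerText, or null when no table matches.
    public static Word.Table FindByHeader(Word.Tables tables, string headerText)
    {
        foreach (Word.Table tbl in tables)
        {
            // Strip the "\r\a" end-of-cell marker before comparing.
            string first = tbl.Cell(1, 1).Range.Text.TrimEnd('\r', '\a').Trim();
            if (first.StartsWith(headerText, StringComparison.OrdinalIgnoreCase))
                return tbl;

            // Look inside this table for nested tables as well.
            Word.Table nested = FindByHeader(tbl.Tables, headerText);
            if (nested != null)
                return nested;
        }
        return null;
    }
}
```

A call would look like `Word.Table tbl = TableFinder.FindByHeader(doc.Tables, "Drawing Revision");`, with a `null` result marking the document for manual review.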

serkan sendur

OK, thanks for your help.

jbisono
Question Answered as of 3 Years Ago by serkan sendur

© 2013 DaniWeb® LLC