Hello all,

I'm currently using the code below to extract the contents of a table from an HTML page. It extracts the markup that comes after the <table> tag, but what I need is the result as it appears on the rendered HTML page. Is that possible? If so, how can I do it? I need help.

The code I've used is:

private void button1_Click(object sender, EventArgs e)
{
    webBrowser1.Navigate(@"d:\samtable.html");
}

private void webBrowser1_DocumentCompleted(object sender, WebBrowserDocumentCompletedEventArgs e)
{
    string textResult;
    foreach (HtmlElement pageElement in webBrowser1.Document.GetElementsByTagName("TABLE"))
    {
        textResult = pageElement.Children[0].InnerHtml;
        this.textBox1.Text = textResult;
    }
}

I'm getting the result in the textbox like this:

<TR>
<TD width=210>A</TD>
<TD width=45>B</TD>
<TD width=45>C</TD></TR>
<TR>
<TD width=210>D</TD>
<TD width=45>E</TD>
<TD width=45>F</TD></TR>

Use "InnerText" if you don't want those tags to show up:

textResult = pageElement.Children[0].InnerText;
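
Note that the loop in the question overwrites the textbox on every pass, so only the last table survives. A minimal sketch that collects the visible text of every table instead is shown here; the class and method names (`TableText`, `GetAllTableText`) are made up for illustration and are not part of the original code.

```csharp
using System.Text;
using System.Windows.Forms;

static class TableText
{
    // Hypothetical helper: concatenates the rendered text of every <table>
    // in the document, one table per block, without any HTML tags.
    public static string GetAllTableText(HtmlDocument document)
    {
        var builder = new StringBuilder();
        foreach (HtmlElement table in document.GetElementsByTagName("table"))
        {
            // InnerText returns the text as displayed, with the markup stripped.
            builder.AppendLine(table.InnerText);
        }
        return builder.ToString();
    }
}
```

Inside the DocumentCompleted handler this could be used as `textBox1.Text = TableText.GetAllTableText(webBrowser1.Document);`.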

Thanks

foreach (HtmlElement pageElement in webBrowser1.Document.GetElementsByTagName("TABLE"))
{
    textResult = pageElement.OuterHtml;
}

Thanks for replying,

Your code works fine. I'd also like to know how to extract a particular table when there are multiple tables in an HTML page.

I would be grateful if you could help.
Thanks

Repeat the foreach loop until the desired result is found.

How can we extract a particular table if there are multiple tables in an HTML page?

That can be done the same way as with the URLs, but the table must have something unique, such as a unique width or id.

Let's say we've got lots of tables and one of them looks like this: <table width="123" height="345" id="tbID">. To get that table, use:

HtmlElement theTable = null;

// Iterate over the <table> elements (not the <td> cells) and match on the id
foreach (HtmlElement element in webBrowser1.Document.GetElementsByTagName("table"))
{
    if (element.Id == "tbID")
    {
        // this is the table you need
        theTable = element;
        break;
    }
}

If the table has a unique width or height, use this condition inside the loop instead:

if (element.GetAttribute("width") == "123" && element.GetAttribute("height") == "345")
    theTable = element;
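
Put together, the attribute check might look like the sketch below. The helper name `FindBySize` and its parameters are hypothetical, and it assumes the page has already finished loading in the WebBrowser control.

```csharp
using System.Windows.Forms;

static class TableFinder
{
    // Hypothetical helper: returns the first <table> whose width and height
    // attributes match the given values, or null when no table matches.
    public static HtmlElement FindBySize(HtmlDocument document, string width, string height)
    {
        foreach (HtmlElement element in document.GetElementsByTagName("table"))
        {
            if (element.GetAttribute("width") == width &&
                element.GetAttribute("height") == height)
            {
                return element;
            }
        }
        return null;
    }
}
```

For example: `HtmlElement theTable = TableFinder.FindBySize(webBrowser1.Document, "123", "345");`. When the table has an id, `webBrowser1.Document.GetElementById("tbID")` should return it directly without a loop.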

Thanks

Thank you all for replying.

I can get the table by its id, but the problem now is that one column in every row of the table I need contains a link. The page doesn't show the link as text; it is displayed as an image. I need to get that link along with the other row elements and store them in a file, and I don't know how to do it. This is what a row looks like on the website:

<tr>

<td style="width:100px;">aaaaa</td>
<td>bbbbb</td>
<td>ccccc</td>
<td style="width:75px;">dddddd</td>
<td align="center">&nbsp;</td>
<td align="center">&nbsp;</td>
<td align="center">&nbsp;</td>

<td align="center">
<a id="samid" class="samclass" href="samfile.aspx?id=11111"><img src="images\samimg.png" style="border-width:0px;" /></a>
</td>
</tr>

This is how the table row looks on the website. I can get the other elements in the row; now I want to get the link from the href attribute. My extracted value should look like
"aaaa bbbb cccc dddd samfile.aspx?id=11111"

Need help, please.
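
One possible approach, offered as a sketch rather than a tested answer: walk the rows of the table, take the text of ordinary cells, and take the href of the anchor in cells that contain one. The names `RowExtractor` and `ExtractRows` are hypothetical. It assumes the table has no nested tables and that .NET 4 or later is available (for `string.IsNullOrWhiteSpace`). The browser may also return the href as a resolved absolute URL rather than the relative `samfile.aspx?id=11111`.

```csharp
using System.Collections.Generic;
using System.Windows.Forms;

static class RowExtractor
{
    // Hypothetical helper: builds one space-separated line per table row.
    public static List<string> ExtractRows(HtmlElement table)
    {
        var lines = new List<string>();
        foreach (HtmlElement row in table.GetElementsByTagName("tr"))
        {
            var parts = new List<string>();
            foreach (HtmlElement cell in row.GetElementsByTagName("td"))
            {
                HtmlElementCollection links = cell.GetElementsByTagName("a");
                if (links.Count > 0)
                {
                    // The cell holds a link (shown as an image): keep its target.
                    parts.Add(links[0].GetAttribute("href"));
                }
                else
                {
                    // Ordinary cell: keep its text, skipping empty or &nbsp; cells.
                    string text = cell.InnerText;
                    if (!string.IsNullOrWhiteSpace(text))
                    {
                        parts.Add(text.Trim());
                    }
                }
            }
            if (parts.Count > 0)
            {
                lines.Add(string.Join(" ", parts));
            }
        }
        return lines;
    }
}
```

The resulting lines could then be saved with `System.IO.File.WriteAllLines(path, lines)`, where `path` is whatever output file you choose.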
