I have an HTML file:

<table>
<tr>
<td>ok
<strong>Sep 10</strong>
| <a href="ttt">Oct 10</a> 
| <a href="kkk">Dec 10</a> 
<table>
<tr>
<td>
123
</td>
<td>
567
</td>
</tr>
</table>
</td>
</tr>
</table>

When I open it in Firefox, the output is:
ok Sep 10 | Oct 10 | Dec 10
123 567
What I want to get is:
ok Sep 10 | Oct 10 | Dec 10

Here is my XPath expression:

xpath('/html/body/table/tr/td')

I get:

<td>ok
<strong>Sep 10</strong>
| <a href="ttt">Oct 10</a> 
| <a href="kkk">Dec 10</a> 
<table><tr>
<td>
123
</td>
<td>
567
</td>
</tr></table>
</td>

How can I get only this:

ok
<strong>Sep 10</strong>
| <a href="ttt">Oct 10</a> 
| <a href="kkk">Dec 10</a>


Can you post the complete code you are using for the XPath query?

If you don't want "123 567" in the output, why did you keep it inside the <td> </td>? Simply remove those values from the HTML.

/html/table[1]

/html/body/table/tr/td/node()[local-name() != 'table'][string-length(normalize-space(.)) != 0]

Used on this input tree:

<html>
    <body>
        <table>
            <tr>
                <td>ok
                    <strong>Sep 10</strong>| <a href="ttt">Oct 10</a> 
                | <a href="kkk">Dec 10</a>
                    <table>
                        <tr>
                            <td>123</td>
                            <td>567</td>
                        </tr>
                    </table>
                </td>
            </tr>
        </table>
    </body>
</html>
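
For anyone who wants to try the idea without a full XPath 1.0 engine, here is a minimal sketch using only Python's standard library. Note that `xml.etree.ElementTree` does not support `node()`, `local-name()` or `normalize-space()`, so the filter from the expression above is rewritten as plain Python: keep every child node of the outer td except the nested table and whitespace-only text. The function name `outer_cell_parts` and the `HTML` constant are made up for this example, and it assumes the markup is well-formed XML, as the input tree above is.

```python
import xml.etree.ElementTree as ET

# The input tree from the reply above (assumed to be well-formed XML).
HTML = """<html>
    <body>
        <table>
            <tr>
                <td>ok
                    <strong>Sep 10</strong>| <a href="ttt">Oct 10</a>
                | <a href="kkk">Dec 10</a>
                    <table>
                        <tr>
                            <td>123</td>
                            <td>567</td>
                        </tr>
                    </table>
                </td>
            </tr>
        </table>
    </body>
</html>"""


def outer_cell_parts(markup):
    """Return (serialized node, plain text) pairs for the outer <td>,
    skipping the nested <table> and whitespace-only text nodes."""
    root = ET.fromstring(markup)
    td = root.find("./body/table/tr/td")
    parts = []

    def add_text(value):
        # Mirrors the string-length(normalize-space(.)) != 0 test.
        if value and value.strip():
            parts.append((value.strip(), value.strip()))

    add_text(td.text)  # text before the first child element
    for child in td:
        if child.tag != "table":  # mirrors local-name() != 'table'
            text = "".join(child.itertext())
            # Serialize the element without its trailing text.
            tail, child.tail = child.tail, None
            parts.append((ET.tostring(child, encoding="unicode"), text))
            child.tail = tail
        add_text(child.tail)  # text that follows this child element
    return parts


if __name__ == "__main__":
    parts = outer_cell_parts(HTML)
    for serialized, _ in parts:
        print(serialized)
    print(" ".join(text for _, text in parts))
```

The first loop prints the nodes with their markup, and the last line joins the text of the same nodes into a single string. With a library that implements full XPath 1.0, the expression above could be passed to its query function directly instead.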