For example, the HTML file is as follows:
<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=gb2312" />
<title>HTML form </title>
</head>
</html>

After running it through htmlparser, the XML file should be as follows:

<?xml version="1.0" encoding="utf-8" ?>
<pagestructure>
<pageNodeList>
<pageNode id="0" name="" tagName="html" parentNodeId="-1">
<nodeValue />
</pageNode>
<pageNode id="1" name="" tagName="head" parentNodeId="0">
<nodeValue />
</pageNode>
<pageNode id="2" name="" tagName="meta" parentNodeId="1">
<nodeAttributeList>
<nodeAttribute id="0" attributeName="http-equiv" attributeValue="Content-Type" />
<nodeAttribute id="1" attributeName="content" attributeValue="text/html; charset=gb2312" />
</nodeAttributeList>
<nodeValue />
</pageNode>
<pageNode id="3" name="" tagName="title" parentNodeId="1">
<nodeValue>HTML form</nodeValue>
</pageNode>
</pageNodeList>
</pagestructure>

That is to say, I want to give each pageNode an id and record it, together with the id of its parent node.

If any of you has sample code, please share it with me. Your suggestions are greatly appreciated.
Thanks
luoyi2008061424
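
One possible way to get this numbering is sketched below. It is only an illustration, not the htmlparser library mentioned above: it uses Python's standard `html.parser` and `xml.etree.ElementTree` instead. The names `PageStructureBuilder` and `html_to_page_xml` are made up for this example, and filling `name` from the element's own `name` attribute is an assumption, since the sample output only ever shows it empty. The idea is to keep a stack of open elements: each start tag receives the next counter value as its id, and the id on top of the stack becomes its `parentNodeId`.

```python
from html.parser import HTMLParser
import xml.etree.ElementTree as ET

# Elements that never get a closing tag, so they must not stay on the stack.
VOID_TAGS = {"area", "base", "br", "col", "embed", "hr", "img",
             "input", "link", "meta", "source", "track", "wbr"}


class PageStructureBuilder(HTMLParser):
    """Hypothetical helper: numbers every element and records its parent id."""

    def __init__(self):
        super().__init__()
        self.root = ET.Element("pagestructure")
        self.node_list = ET.SubElement(self.root, "pageNodeList")
        self.stack = []    # (tagName, id, nodeValue element) of open elements
        self.next_id = 0   # ids are handed out in document order

    def _add_node(self, tag, attrs, push):
        parent_id = self.stack[-1][1] if self.stack else -1
        node = ET.SubElement(self.node_list, "pageNode", {
            "id": str(self.next_id),
            # Assumption: "name" mirrors the element's name attribute, if any.
            "name": dict(attrs).get("name") or "",
            "tagName": tag,
            "parentNodeId": str(parent_id),
        })
        if attrs:
            attr_list = ET.SubElement(node, "nodeAttributeList")
            for index, (key, value) in enumerate(attrs):
                ET.SubElement(attr_list, "nodeAttribute", {
                    "id": str(index),
                    "attributeName": key,
                    "attributeValue": value or "",
                })
        value_element = ET.SubElement(node, "nodeValue")
        if push:
            self.stack.append((tag, self.next_id, value_element))
        self.next_id += 1

    def handle_starttag(self, tag, attrs):
        self._add_node(tag, attrs, push=tag not in VOID_TAGS)

    def handle_startendtag(self, tag, attrs):
        # Self-closing form such as <meta ... />: it can never be a parent.
        self._add_node(tag, attrs, push=False)

    def handle_endtag(self, tag):
        # Close the most recent matching element; ignore stray end tags.
        for i in range(len(self.stack) - 1, -1, -1):
            if self.stack[i][0] == tag:
                del self.stack[i:]
                break

    def handle_data(self, data):
        text = data.strip()
        if text and self.stack:
            value_element = self.stack[-1][2]
            value_element.text = (value_element.text or "") + text


def html_to_page_xml(html):
    """Return the pagestructure XML for an HTML string."""
    builder = PageStructureBuilder()
    builder.feed(html)
    builder.close()
    return ET.tostring(builder.root, encoding="unicode")


if __name__ == "__main__":
    sample = ('<html><head><meta http-equiv="Content-Type" '
              'content="text/html; charset=gb2312" />'
              '<title>HTML form </title></head></html>')
    print(html_to_page_xml(sample))
```

The same stack-and-counter approach should carry over to other parsers that report start and end tags in document order. The XML declaration is not written here; it could be added when saving the result to a file.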
