Hi to all.

How can I easily convert HTML to well-formed XML? Does anyone have an idea how I can do that? Thank you for your time.

I used HTMLTidy to do this, converting HTML to XHTML. It worked really well. It's normally used as a command-line tool, but as it's open source you could probably convert it into a library. I think it's written in C.
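
To give a rough idea of what "making HTML well formed" involves, here is a minimal Python sketch using only the standard library. It is not HTMLTidy and does far less than Tidy does: it only closes unclosed tags, self-closes void elements such as `<br>`, and quotes attributes. The names `WellFormedBuilder`, `html_to_xml` and `VOID_TAGS` are made up for this example, and it assumes the input has a single root element such as `<html>`.

```python
from html.parser import HTMLParser
import xml.etree.ElementTree as ET

# HTML elements that never take a closing tag.
VOID_TAGS = {"area", "base", "br", "col", "embed", "hr", "img", "input",
             "link", "meta", "source", "track", "wbr"}


class WellFormedBuilder(HTMLParser):
    """Hypothetical helper: builds an ElementTree from loosely written HTML."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.builder = ET.TreeBuilder()
        self.open_tags = []  # stack of elements still waiting for a close

    def handle_starttag(self, tag, attrs):
        # Valueless attributes (<input disabled>) become disabled="disabled".
        attrib = {name: (value if value is not None else name)
                  for name, value in attrs}
        self.builder.start(tag, attrib)
        if tag in VOID_TAGS:
            self.builder.end(tag)  # close void elements immediately
        else:
            self.open_tags.append(tag)

    def handle_endtag(self, tag):
        if tag not in self.open_tags:
            return  # stray closing tag: ignore it
        # Close everything left open inside the element being closed.
        while self.open_tags:
            current = self.open_tags.pop()
            self.builder.end(current)
            if current == tag:
                break

    def handle_data(self, data):
        if self.open_tags:  # drop text outside the root element
            self.builder.data(data)

    def finish(self):
        self.close()  # flush any buffered input
        while self.open_tags:  # close whatever is still open
            self.builder.end(self.open_tags.pop())
        return self.builder.close()


def html_to_xml(html):
    """Return a well-formed XML string (assumes a single root element)."""
    parser = WellFormedBuilder()
    parser.feed(html)
    return ET.tostring(parser.finish(), encoding="unicode")


messy = "<html><body><p>Hello<br><b>world<img src=logo.png></body></html>"
print(html_to_xml(messy))
```

The output can be loaded by any XML parser. For real pages a dedicated tool like HTMLTidy is still the safer choice, since it also handles entities, doctypes, and HTML's implicit nesting rules, which this sketch ignores.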

I also found this which shows how HTMLTidy can be integrated with ASP.Net

Hope that helps.

HTML is already XML. Anyway, what are you trying to do?

I used HTMLTidy to do this converting HTML to XHTML....

I don't think he is talking about XHTML. He's talking about XML.

Huh? I'm not sure I get what you mean. XHTML is XML, HTML is not XML, and HTMLTidy converts from HTML to XHTML. My requirement was to parse HTML using XPath, which needs well-formed XML to work, and you just can't do that any other way.
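
To illustrate the XPath point, here is a small Python sketch using the standard library's `xml.etree.ElementTree`, which supports a limited subset of XPath. The two sample strings and the `try_parse` helper are invented for the example; the second string stands in for the kind of XHTML a tool like HTMLTidy produces, including the XHTML namespace on the root element.

```python
import xml.etree.ElementTree as ET

# Namespace prefix map for queries against XHTML.
XHTML_NS = {"x": "http://www.w3.org/1999/xhtml"}

# Typical hand-written HTML: unclosed <br> and <p>, unquoted attribute.
raw_html = "<html><body><p>First<br><a href=/docs>Docs</a></body></html>"

# The same content as well-formed XHTML.
xhtml = (
    '<html xmlns="http://www.w3.org/1999/xhtml"><body>'
    '<p>First<br/><a href="/docs">Docs</a></p>'
    '</body></html>'
)


def try_parse(markup):
    """Return the root element, or None if the markup is not well formed."""
    try:
        return ET.fromstring(markup)
    except ET.ParseError:
        return None


# The XML parser rejects the raw HTML outright.
print(try_parse(raw_html) is None)  # True

# Once the markup is well formed, XPath-style queries work.
root = try_parse(xhtml)
hrefs = [a.get("href") for a in root.findall(".//x:a[@href]", XHTML_NS)]
print(hrefs)  # ['/docs']
```

Note the namespace map: because XHTML elements live in the XHTML namespace, a query for a plain `a` finds nothing, and the path has to use a prefix bound to that namespace.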

If the question genuinely is how to go from HTML to some other form of XML, how about this approach?

HTML + HTMLTidy -> XHTML, then apply XSL -> different XML.

Not the fastest way of doing this, but highly standards-based. Of course you could use an XmlReader/XmlWriter, but even in that case you can't start until the HTML is well formed as XHTML.
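
As a rough sketch of the second step, here is what the reader/writer style could look like in Python. Python's standard library has no XSLT processor, so this does the transformation by hand with `xml.etree.ElementTree` instead of XSL. The target format (`<links>` containing `<link url="...">` elements) and the function name `xhtml_to_link_list` are invented for illustration, and the input is assumed to be XHTML that has already been made well formed.

```python
import xml.etree.ElementTree as ET

NS = {"x": "http://www.w3.org/1999/xhtml"}


def xhtml_to_link_list(xhtml):
    """Turn XHTML into <links><link url="...">text</link></links>."""
    source = ET.fromstring(xhtml)  # fails unless the input is well formed
    links = ET.Element("links")
    for anchor in source.iterfind(".//x:a[@href]", NS):
        link = ET.SubElement(links, "link", url=anchor.get("href"))
        # itertext() also collects text inside nested tags such as <b>.
        link.text = "".join(anchor.itertext()).strip()
    return ET.tostring(links, encoding="unicode")


xhtml = (
    '<html xmlns="http://www.w3.org/1999/xhtml"><body>'
    '<p><a href="/docs">Read the <b>docs</b></a></p>'
    '<p><a href="/faq">FAQ</a> and <a name="top">no link</a></p>'
    '</body></html>'
)
print(xhtml_to_link_list(xhtml))
```

Only anchors with an `href` attribute are copied, so the `<a name="top">` element is skipped. With an XSLT processor available, the same mapping could be written as a stylesheet instead.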
