I used HTMLTidy to do this, converting HTML to XHTML. It worked really well. It's normally used as a command-line tool, but as it's open source you could probably convert it into a library. I think it's written in C.
I also found this, which shows how HTMLTidy can be integrated with ASP.NET.
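For anyone who just wants to drive the command-line tool from a script, here is a minimal sketch in Python. It assumes a `tidy` executable is installed and on the PATH; the function names `tidy_command` and `html_to_xhtml` are made up for illustration, and the choice of options is only one reasonable setup, not the only way to call it.

```python
import shutil
import subprocess


def tidy_command():
    """Argument list for tidy: XHTML output, numeric entities, UTF-8, quiet."""
    return [
        "tidy",
        "-asxhtml",   # convert HTML to well-formed XHTML
        "-numeric",   # emit numeric entities instead of named ones like &nbsp;
        "-utf8",      # treat input and output as UTF-8
        "-quiet",     # suppress non-essential output
        "--show-warnings", "no",
    ]


def html_to_xhtml(html):
    """Pipe an HTML string through tidy and return the XHTML it produces.

    Assumption: the tidy executable is available on PATH.
    """
    if shutil.which("tidy") is None:
        raise RuntimeError("tidy executable not found on PATH")
    result = subprocess.run(
        tidy_command(),
        input=html,
        capture_output=True,
        text=True,
        encoding="utf-8",
    )
    # tidy exits with 0 (clean), 1 (warnings) or 2 (errors).
    if result.returncode > 1:
        raise RuntimeError(result.stderr)
    return result.stdout
```

Calling `html_to_xhtml("<p>Hello<br>world")` should hand back a complete XHTML document with the missing tags filled in. Numeric entities are requested so that a plain XML parser does not trip over names like `&nbsp;` later on.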
Hope that helps.
Hi to all.
How can I easily convert HTML to well-formed XML? Does anyone have an idea how I can do that? Thank you for your time.
Huh? I'm not sure I get what you mean. XHTML is XML, HTML is not XML, and HTMLTidy converts from HTML to XHTML. My requirement was to parse HTML using XPath, which needs well-formed XML to work, and you just can't do that any other way.
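To show why the well-formedness matters for XPath, here is a small sketch using Python's standard `xml.etree.ElementTree`, which supports a limited XPath subset. The two sample documents and the helper names `is_well_formed` and `link_targets` are invented for this example; the XHTML string is what a tidied version of the raw HTML might roughly look like, not actual HTMLTidy output.

```python
import xml.etree.ElementTree as ET

# XHTML elements live in this namespace, so XPath queries need a prefix for it.
XHTML_NS = {"x": "http://www.w3.org/1999/xhtml"}

# Typical "tag soup": unclosed <p> and <br>, unquoted attribute value.
RAW_HTML = "<html><body><p>One<br><a href=a.html>A</a></body></html>"

# Hand-written well-formed equivalent (illustrative, not real tidy output).
XHTML = (
    '<html xmlns="http://www.w3.org/1999/xhtml"><body>'
    '<p>One<br /><a href="a.html">A</a></p>'
    "</body></html>"
)


def is_well_formed(text):
    """Return True if an XML parser accepts the text."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False


def link_targets(xml_text):
    """Return the href of every <a> element that has one, via an XPath query."""
    root = ET.fromstring(xml_text)
    return [a.get("href") for a in root.findall(".//x:a[@href]", XHTML_NS)]


print(is_well_formed(RAW_HTML))  # False: the XML parser rejects the raw HTML
print(link_targets(XHTML))       # ['a.html']
```

The raw HTML never reaches the XPath stage because the parser refuses it, which is the whole reason for tidying it first.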
If the question genuinely is how to go from HTML -> some other form of XML, how about this approach?
HTML + HTMLTidy -> XHTML, then apply XSL -> different XML.
Not the fastest way of doing this, but highly standards-based. Of course you could use an XmlReader/XmlWriter, but even in that case you can't start until the HTML is well formed as XHTML.
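As a rough illustration of the reader/writer alternative, here is a sketch in Python using the standard `xml.etree.ElementTree` in place of XmlReader/XmlWriter (this is not XSL; the standard library has no XSLT processor). It assumes the input is already well-formed XHTML, and the target vocabulary (`links`, `link`, `url`) and the function name `xhtml_to_link_list` are invented for the example.

```python
import xml.etree.ElementTree as ET

XHTML_NS = {"x": "http://www.w3.org/1999/xhtml"}


def xhtml_to_link_list(xhtml_text):
    """Read well-formed XHTML and write a different XML document listing its links.

    The <links>/<link url="..."> output format is hypothetical.
    """
    source = ET.fromstring(xhtml_text)  # fails unless the input is well formed
    target = ET.Element("links")
    for anchor in source.findall(".//x:a[@href]", XHTML_NS):
        link = ET.SubElement(target, "link", {"url": anchor.get("href")})
        # Use the anchor's visible text, including text of nested elements.
        link.text = "".join(anchor.itertext()).strip()
    return ET.tostring(target, encoding="unicode")


sample = (
    '<html xmlns="http://www.w3.org/1999/xhtml"><body>'
    '<p>One<br /><a href="a.html">A</a></p>'
    "</body></html>"
)
print(xhtml_to_link_list(sample))
# <links><link url="a.html">A</link></links>
```

An XSL stylesheet would express the same mapping declaratively; the code version is just easier to show in a few lines.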
HTML is already XML. Anyway, what are you trying to do?
I don't think he is talking about XHTML. He's talking about XML.