
XHTML Compliant parser?

Is there a program (such as Dreamweaver, Visual Studio .NET, etc.) that will parse my HTML code and make it XHTML compliant?

For example, I'd like to throw at it all of the pages of TechTalk forums and have it automatically convert, for example

<input type=submit name=submit value=Go>


to

<input type="submit" name="submit" value="Go" />
cscgal
The Queen of DaniWeb
Administrator
19,433 posts since Feb 2002
Reputation Points: 1,474
Solved Threads: 230
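
The conversion asked about in the first post (quote every attribute, lowercase the names, close empty elements) can be sketched with nothing but Python's standard html.parser module. This is a rough illustration, not a full converter: the names XhtmlRewriter, to_xhtml and VOID_ELEMENTS are made up for this example, and it does not repair mis-nested or unclosed tags.

```python
from html import escape
from html.parser import HTMLParser

# Elements that XHTML 1.0 expects to be written as empty, self-closed tags.
VOID_ELEMENTS = {"area", "base", "br", "col", "hr", "img",
                 "input", "link", "meta", "param"}


class XhtmlRewriter(HTMLParser):
    """Re-emit markup with lowercase names, quoted attributes, closed empty tags."""

    def __init__(self):
        # Keep entity and character references as written in the input text.
        super().__init__(convert_charrefs=False)
        self.out = []

    def _open_tag(self, tag, attrs, closer):
        parts = [tag]  # html.parser has already lowercased tag and attribute names
        for name, value in attrs:
            # XHTML forbids minimized attributes: checked -> checked="checked".
            value = name if value is None else value
            parts.append('%s="%s"' % (name, escape(value, quote=True)))
        self.out.append("<" + " ".join(parts) + closer)

    def handle_starttag(self, tag, attrs):
        self._open_tag(tag, attrs, " />" if tag in VOID_ELEMENTS else ">")

    def handle_startendtag(self, tag, attrs):
        self._open_tag(tag, attrs, " />")

    def handle_endtag(self, tag):
        if tag not in VOID_ELEMENTS:  # void elements were already closed
            self.out.append("</%s>" % tag)

    def handle_data(self, data):
        self.out.append(data)

    def handle_entityref(self, name):
        self.out.append("&%s;" % name)

    def handle_charref(self, name):
        self.out.append("&#%s;" % name)

    def handle_comment(self, data):
        self.out.append("<!--%s-->" % data)

    def handle_decl(self, decl):
        self.out.append("<!%s>" % decl)


def to_xhtml(fragment):
    rewriter = XhtmlRewriter()
    rewriter.feed(fragment)
    rewriter.close()
    return "".join(rewriter.out)


print(to_xhtml("<INPUT type=submit name=submit value=Go>"))
# <input type="submit" name="submit" value="Go" />
```

Because it works on fragments, something like this could be run over template pieces without adding a doctype or html/body wrapper around them.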
 

I figured out that Dreamweaver MX will, indeed, do this. :) But it only works for XHTML 1.0, and XHTML 2.0 was just announced a lil while ago. Maybe I should wait until a Dreamweaver plugin/update is available??

cscgal
 

Never mind, this makes no sense. When I take an HTML document and click "Convert to XHTML", all it does is add the following to the top of the document:

<?xml version="1.0" encoding="iso-8859-1"?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html xmlns="http://www.w3.org/1999/xhtml">


It doesn't make any other changes?

cscgal
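
Adding an XHTML doctype and namespace does not by itself make the markup XHTML, since the body still has to be well-formed XML. A quick way to see that, sketched here with Python's standard xml.etree.ElementTree (the helper name is_well_formed and the sample pages are made up for illustration), is to feed the page to an XML parser. Note this only checks well-formedness; it does not validate against the XHTML DTD.

```python
import xml.etree.ElementTree as ET


def is_well_formed(markup):
    """Return True if the markup parses as XML, which XHTML must."""
    try:
        ET.fromstring(markup)
    except ET.ParseError:
        return False
    return True


# The same prolog in front of two different bodies (sample data for this sketch).
PROLOG = (
    '<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" '
    '"http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">\n'
    '<html xmlns="http://www.w3.org/1999/xhtml">'
)
OLD_PAGE = PROLOG + (
    "<head><title>t</title></head>"
    "<body><input type=submit name=submit value=Go></body></html>"
)
NEW_PAGE = PROLOG + (
    "<head><title>t</title></head>"
    '<body><input type="submit" name="submit" value="Go" /></body></html>'
)

print(is_well_formed(OLD_PAGE))  # False: unquoted attributes, unclosed input
print(is_well_formed(NEW_PAGE))  # True
```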
 

I don't know about converting your code to XHTML, but I know that all code in VS.NET (with regard to ASP.NET pages) is in XHTML format.

Tekmaven
Software Architect
Moderator
1,274 posts since Feb 2002
Reputation Points: 322
Solved Threads: 28
 

You can give XML Spy a try. It's definitely an awesome program for any XML development; I used it for a JSP shopping cart. You can also give HTML Tidy a try, but I'm not sure of its capabilities (I have not tried it, though I hear it's good). Unlike XML Spy, it's a free program hosted on SourceForge. Here are the web sites:

http://www.altova.com/products_ide.html
http://tidy.sourceforge.net/

If you need to convert to XHTML you can try this PHP function. It has some quirkiness at times. This is what I used when I was redesigning the Hofstra CSC web site for the CSC club.

<?

if (!empty($type)) {

 if ($type == "path") {
  if (!empty($path)) {
   if (file_exists($path) && is_file($path)) {
    $file = file($path);
    if (substr($file[0],0,9) != "<!DOCTYPE") $doctype=0;
    $file = join('', $file);
   } else {
    die ("No such file.");
   }
  } else {
   die ("No file specified.");
  }
 } elseif ($type == "file") {
  if (!empty($file)) {

  } else {
   die ("No file specified.");
  }
 } else {
  die ("No file specified.");
 }

 # specify html file, check for doctype
 //$file = file("file.html");
 //if (substr($file[0],0,9) != "<!DOCTYPE") $doctype=1;
 //$file = join('', $file);

 # make tags and properties lower case, close empty elements, quote all properties
 $search  = array ("'(<\/?)(\w+)([^>]*>)'e",
                   "'(<\/?)(br|input|meta|link|img)([^>]*)( />)'ie",
                   "'(<\/?)(br|input|meta|link|img)([^>]*)(/>)'ie",
                   "'(<\/?)(br|input|meta|link|img)([^>]*)(>)'ie",
                   "'(\w+=)(\w+)'ie",
                   "'(\w+=)(.+?)'ie");
 $replace = array ("'\\1'.strtolower('\\2').'\\3'",
                   "'\\1\\2\\3>'",
                   "'\\1\\2\\3>'",
                   "'\\1\\2\\3 /\\4'",
                   "strtolower('\\1').'\"\\2\"'",
                   "strtolower('\\1').'\\2'");
 $file = preg_replace($search, $replace, $file);

 # return xhtml-compliant document
 echo "<textarea cols=\"100\" rows=\"20\">";
 if (isset($doctype)) echo '<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "DTD/xhtml1-transitional.dtd">'."\n";
 echo stripslashes(stripslashes(stripslashes($file)));
 echo "</textarea>";

} else {
?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "DTD/xhtml1-transitional.dtd">
<head><title>HTML -> XHTML Convertor</title></head>

<body>

<!-- WARNING: this input method is a security risk on open servers //-->
<form action="<?=$PHP_SELF?>" method="get">
<input type="hidden" name="type" value="path" />
<font face="verdana">File path:</font> <input type="text" name="path" size="50" />
<input type="submit" value="Submit" />
</form>

<b><font face="verdana">OR</font></b>

<form action="<?=$PHP_SELF?>" method="get">
<input type="hidden" name="type" value="file" />
<font face="verdana">File contents:</font>
<textarea name="file" rows="10" cols="50"></textarea>
<input type="submit" value="Submit" />
</form>

</body>

</html>
<?
}
?>
samaru
a.k.a inscissor
Team Colleague
1,256 posts since Feb 2002
Reputation Points: 262
Solved Threads: 18
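
For anyone who wants to try HTML Tidy from a script rather than by hand, here is a small sketch in Python. It assumes a tidy executable is installed and on the PATH; the function names tidy_command and convert_in_place are made up for this example. The flags used (-quiet, -asxhtml, -modify) are standard Tidy command-line options, but it is worth checking tidy -help for the version you have.

```python
import shutil
import subprocess


def tidy_command(path):
    """Build the argument list that asks Tidy to rewrite one file as XHTML."""
    return [
        "tidy",
        "-quiet",    # suppress the non-essential report output
        "-asxhtml",  # convert HTML to well-formed XHTML
        "-modify",   # overwrite the input file in place
        path,
    ]


def convert_in_place(path):
    """Run Tidy on one file; return True unless Tidy reported errors."""
    if shutil.which("tidy") is None:
        raise RuntimeError("tidy executable not found on PATH")
    result = subprocess.run(tidy_command(path))
    # Tidy exits with 0 when clean, 1 when there were warnings, 2 on errors.
    return result.returncode < 2


print(tidy_command("index.html"))
```

Since -modify overwrites files, it would be safer to run this against a copy of the pages first.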
 

I found a Dreamweaver MX extension which does the trick perfectly. The only problem is that it automatically adds the wrapper tags (<html>, <body>, etc.) to the top and bottom of my code, to make it "complete".

The problem with this is that this forum uses a templating system, in which the top and bottom are shared borders. Therefore, I'd have to manually remove the top and bottom code from each page (and there are a LOT of pages)!

I'm procrastinating doing it for now. Debating whether it's worth my time or if next month I'll have a whole new design going (at which time I'll make it XHTML compliant right from the start).

cscgal
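
Removing the added top and bottom from a lot of converted pages does not have to be manual. A minimal sketch in Python, assuming each converted page has exactly one body element and that the templates are UTF-8 files (both are assumptions, as are the names body_fragment and strip_wrappers):

```python
import re
from pathlib import Path

# Matches everything between the opening and closing body tags.
BODY_RE = re.compile(r"<body\b[^>]*>(.*?)</body\s*>", re.IGNORECASE | re.DOTALL)


def body_fragment(document):
    """Return the markup inside <body>...</body>, or the input if there is no body."""
    match = BODY_RE.search(document)
    return match.group(1).strip() if match else document


def strip_wrappers(directory, pattern="*.html", encoding="utf-8"):
    """Rewrite every matching file so only its body content remains."""
    changed = []
    for path in sorted(Path(directory).glob(pattern)):
        text = path.read_text(encoding=encoding)
        fragment = body_fragment(text)
        if fragment != text:
            path.write_text(fragment, encoding=encoding)
            changed.append(path.name)
    return changed


page = '<html><head><title>x</title></head>\n<body class="a">\n<p>hi</p>\n</body></html>'
print(body_fragment(page))  # <p>hi</p>
```

HTML Tidy also has a show-body-only option that aims at the same result during conversion, which may avoid the extra pass.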
 

Hi, the PHP code is fine, but it would be great if someone could provide startup source code for an XHTML parser and XHTML generator in C, or point me to an open-source XHTML generator and XHTML parser written in C.

Thanks & Regards,
karthik

Open-source XHTML generator and XHTML parser in C.

linladen
Newbie Poster
4 posts since May 2004
Reputation Points: 10
Solved Threads: 0
 

If you're still looking, google "HTML Tidy". If that fails, it's on the W3C site somewhere.

Innocent
Newbie Poster
5 posts since Jul 2004
Reputation Points: 10
Solved Threads: 1
 

Hi,
Over the past few days I have done a fair amount of research in this area. Has anyone come across or used an open-source XHTML parser/generator tool written in C? Kindly post the link or tool name.

I looked at Amaya, but it is big, and it would take time to extract just the XHTML parser/generator from it.

1) GENX is an XML library, and no XHTML parser/generator tool has been built with it so far.

2) Expat does not provide an open-source XHTML parser/generator to date.

3) X-Smiles is simple and good, but Java-based.

4) libxml is also just a library, and there is no open-source XHTML parser/generator tool from them to date.

Has someone got a simple open-source setup with just the XHTML parser and generator, written in C, or a tool developed using the libraries above?

Kindly let me know and give me the link.

Thanks & Regards,
karthik bala guru

linladen
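
One point that may help with the question above: XHTML is XML, so a general XML parser such as Expat can already read it without a separate XHTML parser. The sketch below uses Python's standard xml.parsers.expat module, which wraps the C Expat library, just to show the shape of the approach. In C the corresponding Expat calls are XML_ParserCreate, XML_SetElementHandler, XML_SetCharacterDataHandler and XML_Parse. The function name parse_xhtml and the sample markup are made up for this example.

```python
import xml.parsers.expat


def parse_xhtml(markup):
    """Parse well-formed XHTML with Expat and return a flat list of events."""
    events = []

    def on_start(name, attrs):
        events.append(("start", name, dict(attrs)))

    def on_end(name):
        events.append(("end", name))

    def on_text(data):
        if data.strip():  # skip whitespace-only runs between tags
            events.append(("text", data))

    parser = xml.parsers.expat.ParserCreate()
    parser.StartElementHandler = on_start
    parser.EndElementHandler = on_end
    parser.CharacterDataHandler = on_text
    parser.Parse(markup, True)  # True marks this as the final chunk of input
    return events


SAMPLE = (
    '<html xmlns="http://www.w3.org/1999/xhtml"><body>'
    '<input type="submit" value="Go" />'
    "</body></html>"
)

for event in parse_xhtml(SAMPLE):
    print(event)
```

Markup that is not well-formed (for example old HTML with unquoted attributes) makes Expat raise an error instead, so this only works for pages that have already been converted.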
 
