How do I read the output of an HTML file?

Actually, I am writing a small piece of text as output whenever I invoke the HTML file,

but when I use urllib.read() or webbrowser.read(), I get the source of the HTML rather than its output.


I am a beginner, so please help me read the output of the HTML.

# for python 2.6
import urllib2
html = urllib2.urlopen('http://google.com').read()
print html

Edit: OR

filename = 'path\\to\\the\\html\\fil.html'
with open(filename, "r") as f:  # the file is closed when the block ends
    print f.read()

I haven't understood what you are trying to do. Make an HTML editor? Read HTML? Or what? Can you please elaborate?

Even the two snippets you gave read the source of the HTML rather than giving me the output.

Suppose I have an HTML file, a.html:
<H1>
Hello World
</H1>

When I invoke this HTML file, it gives me the output Hello World.

So I want a script that gives the output of the HTML file.

It is not about removing the tags and giving me the remaining text;
it is about giving the output of the HTML file.

So please give me a script that would give the output of the HTML file.
Thanks in advance.

An html file has no "output", it's just an html file. There are programs to convert an html file to a text file or a pdf file, you should google for that.
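For the simple a.html case above, such a conversion can be sketched with the standard library alone. This is only a rough illustration written for Python 3 (the thread itself uses Python 2.6), and the names TextExtractor and html_to_text are made up for this example. It keeps the text nodes and skips the contents of script and style elements; it does not run any JavaScript.

```python
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Collect text nodes, ignoring the contents of <script> and <style>."""

    SKIPPED = {"script", "style"}

    def __init__(self):
        super().__init__()
        self.chunks = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        # HTMLParser passes tag names in lower case.
        if tag in self.SKIPPED:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIPPED and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        # Keep only non-blank text found outside skipped elements.
        if not self._skip_depth and data.strip():
            self.chunks.append(data.strip())


def html_to_text(html):
    """Return the text content of an HTML string, one chunk per line."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    return "\n".join(parser.chunks)


print(html_to_text("<H1>\nHello World\n</H1>"))  # Hello World
```

Note that this also returns text a browser would not display as page content, such as the TITLE.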

Please convert the attached file to .html;

it actually contains a JavaScript script.

The content in place of the word Hello will vary.
So, when I invoke the HTML file, it gives me either an exception or the word "OK".

Is there any Python script that would invoke the HTML file, read its status (either an exception or OK),
and write it into a file?

So you want to strip out non-HTML tags and leave only HTML?

import BeautifulSoup as bs

html = """\
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN">
<HTML>
 <HEAD>
  <TITLE> Test JSON </TITLE>
  
   <script language="JavaScript">
      function checkJSON()
	  {
	  
	    try
	    {
			var v = eval(TextJSON.value);
			var OK='OK'
			document.write("OK");
			
	    }
	    catch (ex)
	    {
		  document.write("Error:"+ex);
          
	    }

		TextJSON.focus();
    }



	</script>
 </HEAD>
 <BODY >
 <table width="100%" border="0">
	 <tr>
	    <td>
           <textarea rows="15" cols="70" id="TextJSON" Style = "visibility:hidden">Hello</textarea> 
		  
        </td>
     </tr>
</table><script language="JavaScript">
checkJSON();
  </script>
</BODY>
</HTML>
"""

soup = bs.BeautifulSoup(html)
textareas = soup.findAll('textarea')
print textareas[0].string  # Hello

This finds Hello in test12js.txt.
HTML does not ever output anything, as Gribouillis pointed out.
You parse the HTML and find the text, like I did here.
You are better off learning more of the basics of Python and HTML.
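If BeautifulSoup is not installed, the same lookup can be sketched with the standard library. This is a minimal Python 3 illustration, not the code from the reply above; TextareaReader and textarea_values are names invented for it.

```python
from html.parser import HTMLParser


class TextareaReader(HTMLParser):
    """Record the text inside every <textarea> element."""

    def __init__(self):
        super().__init__()
        self.values = []
        self._buffer = None  # list of text pieces while inside a textarea

    def handle_starttag(self, tag, attrs):
        if tag == "textarea":
            self._buffer = []

    def handle_data(self, data):
        if self._buffer is not None:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag == "textarea" and self._buffer is not None:
            self.values.append("".join(self._buffer))
            self._buffer = None


def textarea_values(html):
    """Return the contents of all textareas in the given HTML string."""
    reader = TextareaReader()
    reader.feed(html)
    reader.close()
    return reader.values


sample = '<textarea id="TextJSON">Hello</textarea>'
print(textarea_values(sample))  # ['Hello']
```

Like the BeautifulSoup version, this only reads what is written in the file; the JavaScript is never executed.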

I am sorry that I was not able to put my question to you properly.
HTML 1:

<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN">
<HTML>
 <HEAD>
  <TITLE> Test JSON </TITLE>

   <script language="JavaScript">
      function checkJSON()
      {

        try
        {
            var v = eval(TextJSON.value);
            var OK='OK'
            document.write("OK");

        }
        catch (ex)
        {
          document.write("Error:"+ex);

        }

        TextJSON.focus();
    }



    </script>
 </HEAD>
 <BODY >
 <table width="100%" border="0">
     <tr>
        <td>
           <textarea rows="15" cols="70" id="TextJSON" Style = "visibility:hidden">Hello</textarea> 

        </td>
     </tr>
</table><script language="JavaScript">
checkJSON();
  </script>
</BODY>
</HTML>

HTML 2:
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN">
<HTML>
 <HEAD>
  <TITLE> Test JSON </TITLE>

   <script language="JavaScript">
      function checkJSON()
      {

        try
        {
            var v = eval(TextJSON.value);
            var OK='OK'
            document.write("OK");

        }
        catch (ex)
        {
          document.write("Error:"+ex);

        }

        TextJSON.focus();
    }



    </script>
 </HEAD>
 <BODY >
 <table width="100%" border="0">
     <tr>
        <td>
           <textarea rows="15" cols="70" id="TextJSON" Style = "visibility:hidden">({type:'AAU', msgid:1265033798233, sel:0, gadinfo:[{
adid:316,adprt:40.0,dur:10,ef:'2009/06/03 10:00:00',et:'2012/06/03 12:00:00',imgfurl:'vaccumcleaner_h.jpg'
}]})</textarea> 

        </td>
     </tr>
</table><script language="JavaScript">
checkJSON();
  </script>
</BODY>
</HTML>

If you invoke these two HTML files, you will observe an exception in the first one and an OK message in the second.

Please save these files and try.
What I meant is that the content of the textarea will vary.

So I would like to know whether there is any script that would give me the exception if I call HTML 1 and the OK status if I call HTML 2, and save the exception or the OK, respectively, to a text file.

I hope what I want is clear now.

It's not at all clear. What do you mean when you say that you invoke an HTML file?

By "invoke" I mean opening the HTML file in the browser.

For example, when I use webbrowser.open(), the HTML file is opened in the browser. After it opens, I see either the exception or the OK, depending on the HTML file.

So I want the script to also capture that result and either write it to a text file or print it in the shell.
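The OK or Error text is written by JavaScript while a browser loads the page, and webbrowser.open() only launches the browser without handing anything back to Python. One possible approach, not suggested anywhere in this thread, is to drive a real browser from the script and read the page text afterwards. The sketch below is for Python 3 and assumes the third-party selenium package and Firefox are installed; the function names and the file names in the usage comment are hypothetical.

```python
from pathlib import Path


def rendered_body_text(html_path):
    """Open html_path in a real browser and return the visible page text.

    Assumption: the third-party selenium package and Firefox are installed.
    """
    # Imported here so save_status() below works without selenium.
    from selenium import webdriver
    from selenium.webdriver.common.by import By

    driver = webdriver.Firefox()
    try:
        # Browsers need a file:// URL, not a bare filesystem path.
        driver.get(Path(html_path).resolve().as_uri())
        # The page's scripts have run by now, so the body holds their output.
        return driver.find_element(By.TAG_NAME, "body").text
    finally:
        driver.quit()


def save_status(status, out_path):
    """Write the captured status text to out_path."""
    Path(out_path).write_text(status + "\n", encoding="utf-8")


# Hypothetical usage, with HTML 1 saved as html1.html next to the script:
# save_status(rendered_body_text("html1.html"), "result.txt")
```

The hidden textarea should not appear in the result, since the text property is meant to return only what is visible on the page.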
