Feb 8th, 2010

HOW to read the html file

How do I read the output of an HTML file?

Actually, my HTML file writes a small piece of text as output whenever I invoke it,

but when I use urllib.read() or webbrowser.read(), I get the source of the HTML rather than its output.


I am a beginner, so please help me read the output of the HTML.
Last edited by vamsicoolman; Feb 8th, 2010 at 1:31 am.
Posted by vamsicoolman (Newbie Poster)

Feb 8th, 2010
Re: HOW to read the html file
    # For Python 2.6
    import urllib2

    # Fetch the page and read its raw HTML source as a string
    html = urllib2.urlopen('http://google.com').read()
    print html

Edit: OR

    # Read a local HTML file; the path is a placeholder
    filename = 'path\\to\\the\\html\\fil.html'
    with open(filename, "r") as f:
        html = f.read()
    print html
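
For readers on Python 3, the two snippets above might look roughly like the sketch below. `urllib2` became `urllib.request` in Python 3; the helper names `read_url_source` and `read_file_source` and the UTF-8 default are assumptions made for illustration and are not part of the original reply. Either way, the result is the HTML source text, not anything a browser would render from it.

```python
import urllib.request


def read_url_source(url, encoding="utf-8"):
    """Return the HTML source served at `url` as text (hypothetical helper)."""
    with urllib.request.urlopen(url) as response:
        return response.read().decode(encoding, errors="replace")


def read_file_source(path, encoding="utf-8"):
    """Return the HTML source stored in a local file (hypothetical helper)."""
    with open(path, "r", encoding=encoding) as f:
        return f.read()
```

Calling `read_file_source` on a file that contains `<H1>Hello World</H1>` returns that markup unchanged, tags included.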
Last edited by Krstevski; Feb 8th, 2010 at 10:49 am.
Posted by Krstevski (Junior Poster)

Feb 8th, 2010
Re: HOW to read the html file
I haven't understood what you are trying to do. Make an HTML editor? Read HTML? Or something else? Can you please elaborate?
Posted by evstevemd (Senior Poster)

Feb 9th, 2010

This does the same thing as well.

Even the two snippets you gave read the source of the HTML rather than giving me the output.

Suppose I have an HTML file, a.html:
<H1>
Hello World
</H1>

When I invoke this HTML file, it gives me the output Hello World.

So I want a script that gives the output of the HTML file.

It is not about removing the tags and giving me the text in the HTML;
it is about giving the output of the HTML file.

So please give me a script that gives the output of the HTML.
Thanks in advance.
Last edited by vamsicoolman; Feb 9th, 2010 at 3:29 am.
Posted by vamsicoolman (Newbie Poster)

Feb 9th, 2010
Re: HOW to read the html file
An HTML file has no "output"; it's just an HTML file. There are programs that convert an HTML file to a text file or a PDF file; you should google for that.
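
To illustrate the kind of HTML-to-text conversion mentioned here, the following is a minimal sketch in Python 3 that uses only the standard library's `html.parser`. The names `TextExtractor` and `html_to_text` are hypothetical. It collects text nodes and ignores the contents of script and style elements. It does not run any JavaScript, so it only covers static pages such as the a.html example earlier in the thread.

```python
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Collect the text nodes of an HTML document, skipping scripts and styles."""

    def __init__(self):
        super().__init__()
        self.skip_depth = 0  # greater than 0 while inside <script> or <style>
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self.skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self.skip_depth > 0:
            self.skip_depth -= 1

    def handle_data(self, data):
        # Keep only non-blank text that is outside script/style elements
        if self.skip_depth == 0 and data.strip():
            self.chunks.append(data.strip())


def html_to_text(html):
    """Return the text content of `html`, one chunk per line."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    return "\n".join(parser.chunks)


print(html_to_text("<H1>\nHello World\n</H1>"))  # prints: Hello World
```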
Posted by Gribouillis (Posting Maven)

Feb 9th, 2010
Re: HOW to read the html file
Please save the attached file with an .html extension.

It actually contains a JavaScript script.

The content in place of the word Hello will vary.
So, when I invoke the HTML file, it gives me either an exception or the word "OK".

Is there a Python script that would invoke the HTML file, read its status (either an exception or OK),
and write it to a file?
Attached file: test12js.txt (714 Bytes)
Posted by vamsicoolman (Newbie Poster)

Feb 10th, 2010
Re: HOW to read the html file
So you want to strip out the non-HTML tags and leave only the HTML?
Posted by evstevemd (Senior Poster)

Feb 10th, 2010
Re: HOW to read the html file
    import BeautifulSoup as bs  # BeautifulSoup 3, Python 2

    # Contents of test12js.txt
    html = """\
    <!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN">
    <HTML>
    <HEAD>
    <TITLE> Test JSON </TITLE>

    <script language="JavaScript">
    function checkJSON()
    {

    try
    {
    var v = eval(TextJSON.value);
    var OK='OK'
    document.write("OK");

    }
    catch (ex)
    {
    document.write("Error:"+ex);

    }

    TextJSON.focus();
    }



    </script>
    </HEAD>
    <BODY >
    <table width="100%" border="0">
    <tr>
    <td>
    <textarea rows="15" cols="70" id="TextJSON" Style = "visibility:hidden">Hello</textarea>

    </td>
    </tr>
    </table><script language="JavaScript">
    checkJSON();
    </script>
    </BODY>
    </HTML>
    """

    soup = bs.BeautifulSoup(html)
    # Find every <textarea> and print the text of the first one
    textareas = soup.findAll('textarea')
    print textareas[0].string  # Hello
This finds Hello in test12js.txt.
HTML never outputs anything, as Gribouillis pointed out.
You parse the HTML and find the text, as I did here.
You would be better off learning more of the basics of Python and HTML.
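
The same lookup can be sketched in Python 3 without BeautifulSoup, using the standard library's `html.parser`. This is an illustration under assumptions: the class name `TextareaExtractor` and the helper `first_textarea_text` are hypothetical, and only the first textarea is read. Like the snippet above, it reads the page's markup and never executes the script in it.

```python
from html.parser import HTMLParser


class TextareaExtractor(HTMLParser):
    """Collect the text inside the first <textarea> of a document."""

    def __init__(self):
        super().__init__()
        self.inside = False    # True while between <textarea> and </textarea>
        self.finished = False  # True once the first textarea has closed
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        # Tag names arrive lowercased, so <TEXTAREA> matches too
        if tag == "textarea" and not self.finished:
            self.inside = True

    def handle_endtag(self, tag):
        if tag == "textarea" and self.inside:
            self.inside = False
            self.finished = True

    def handle_data(self, data):
        if self.inside:
            self.chunks.append(data)


def first_textarea_text(html):
    """Return the text of the first <textarea>, or None if there is none."""
    parser = TextareaExtractor()
    parser.feed(html)
    parser.close()
    return "".join(parser.chunks) if parser.finished else None


page = '<BODY><textarea id="TextJSON" Style="visibility:hidden">Hello</textarea></BODY>'
print(first_textarea_text(page))  # prints: Hello
```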
Posted by snippsat (Master Poster)

Feb 10th, 2010
Re: HOW to read the html file
I am sorry that I was not able to put my question to you properly.
HTML 1:
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN">
<HTML>
<HEAD>
<TITLE> Test JSON </TITLE>

<script language="JavaScript">
function checkJSON()
{

try
{
var v = eval(TextJSON.value);
var OK='OK'
document.write("OK");

}
catch (ex)
{
document.write("Error:"+ex);

}

TextJSON.focus();
}



</script>
</HEAD>
<BODY >
<table width="100%" border="0">
<tr>
<td>
<textarea rows="15" cols="70" id="TextJSON" Style = "visibility:hidden">Hello</textarea>

</td>
</tr>
</table><script language="JavaScript">
checkJSON();
</script>
</BODY>
</HTML>

HTML 2:
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN">
<HTML>
<HEAD>
<TITLE> Test JSON </TITLE>

<script language="JavaScript">
function checkJSON()
{

try
{
var v = eval(TextJSON.value);
var OK='OK'
document.write("OK");

}
catch (ex)
{
document.write("Error:"+ex);

}

TextJSON.focus();
}



</script>
</HEAD>
<BODY >
<table width="100%" border="0">
<tr>
<td>
<textarea rows="15" cols="70" id="TextJSON" Style = "visibility:hidden">({type:'AAU', msgid:1265033798233, sel:0, gadinfo:[{
adid:316,adprt:40.0,dur:10,ef:'2009/06/03 10:00:00',et:'2012/06/03 12:00:00',imgfurl:'vaccumcleaner_h.jpg'
}]})</textarea>

</td>
</tr>
</table><script language="JavaScript">
checkJSON();
</script>
</BODY>
</HTML>

If you open these two HTML files, you will see an exception for the first one and an OK message for the second one.

Please save these files and try them.
What I meant is that the content of the textarea will vary.

So I would like to know whether there is a script that gives me the exception when I call HTML 1 and the OK status when I call HTML 2, and saves the exception or the OK, respectively, to a text file.

I hope it is now clear what I want.
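
One way to approach this is to hand the textarea content to a JavaScript engine from Python and record what happens. The sketch below is an illustration under assumptions, not a tested solution for these exact pages: it assumes Python 3.7 or later, that Node.js is installed and available on the PATH as `node`, and that the textarea content is already available as a string. All function names and the `JS_EXPR` environment variable are hypothetical. Node's `eval` is not a browser, so error messages may differ from what a browser shows.

```python
import os
import subprocess

# JavaScript wrapper that mirrors the page's checkJSON(): eval the text,
# then print "OK" on success or "Error:<exception>" on failure.
NODE_SCRIPT = (
    "try { eval(process.env.JS_EXPR); console.log('OK'); } "
    "catch (ex) { console.log('Error:' + ex); }"
)


def evaluate_with_node(expression):
    """Evaluate `expression` in Node.js (assumed installed); return the status line."""
    # The expression is passed through an environment variable (hypothetical name)
    env = dict(os.environ, JS_EXPR=expression)
    result = subprocess.run(
        ["node", "-e", NODE_SCRIPT],
        env=env, capture_output=True, text=True, check=True,
    )
    return result.stdout.strip()


def check_and_save(expression, out_path, evaluate=evaluate_with_node):
    """Evaluate the textarea content and write the resulting status to `out_path`."""
    status = evaluate(expression)
    with open(out_path, "w", encoding="utf-8") as f:
        f.write(status + "\n")
    return status
```

Under these assumptions, `check_and_save("Hello", "status.txt")` would be expected to write a line starting with `Error:`, and the object literal from HTML 2 a line reading `OK`. The `evaluate` parameter lets a different engine, such as a headless browser, be swapped in.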
Last edited by vamsicoolman; Feb 10th, 2010 at 5:15 am.
Posted by vamsicoolman (Newbie Poster)

Feb 10th, 2010
Re: HOW to read the html file
It's not at all clear. What do you mean when you say that you invoke an HTML file?
Posted by Gribouillis (Posting Maven)
