XML reading
I have this script, and it was working a couple of weeks ago.
Now when I run it, nothing is printed other than white space. Why?
#!/usr/bin/env python
from xml.etree import ElementTree as ET
import os
import urllib

def find_text(element):
    if element.text is None:
        for subelement in element:
            for txt in find_text(subelement):
                yield txt
    else:
        yield element.text

feed = urllib.urlopen("http://server-up.theatticnetwork.net/demo/")
try:
    tree = ET.parse(feed)
except Exception, inst:
    print "Unexpected error opening %s: %s" % (tree, inst)
root = tree.getroot()
for txt in find_text(root):
    print txt
adam291086
Junior Poster in Training
61 posts since Nov 2008
Reputation Points: 10
Solved Threads: 0
Your implementation of find_text would only attempt to iterate to lower elements if the current element had no text.
Having text does not preclude having child nodes.
I rewrote your find_text as follows and I'm seeing output now:
def find_text(element):
    if element.text:
        yield element.text
    for subelement in element:
        for txt in find_text(subelement):
            yield txt
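For anyone wanting to see why the original printed only white space, here is a minimal sketch. The sample document is made up (the real feed's structure is not shown in this thread), and find_text_original and find_text_fixed are just labels for the two versions above. It assumes the root's text is the newline and indentation before its first child, which is what ElementTree reports for pretty-printed XML.

```python
from xml.etree import ElementTree as ET

# Hypothetical sample document; the real feed is not shown in the thread.
SAMPLE = "<server>\n  <Memory>512</Memory>\n  <Disk>80</Disk>\n</server>"

def find_text_original(element):
    # Original logic: descend only when the element has no text at all.
    if element.text is None:
        for subelement in element:
            for txt in find_text_original(subelement):
                yield txt
    else:
        yield element.text

def find_text_fixed(element):
    # Revised logic: yield this element's text, then always visit children.
    if element.text:
        yield element.text
    for subelement in element:
        for txt in find_text_fixed(subelement):
            yield txt

root = ET.fromstring(SAMPLE)
original = list(find_text_original(root))
fixed = list(find_text_fixed(root))
print(repr(original))
print(repr(fixed))
```

With this sample the original version stops at the root's white space, while the revised version also reaches the text of the child elements.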
Murtan
Practically a Master Poster
671 posts since May 2008
Reputation Points: 344
Solved Threads: 116
That works, thanks.
Just one problem: it works in my Python shell, but it won't work in the cgi-bin on my website. What have I missed?
#!/usr/bin/env python
import elementtree.ElementTree as ET
import os
import urllib

print "Content-type: text/html\n"

def find_text(element):
    if element.text:
        yield element
        for subelement in element:
            for txt in find_text(subelement):
                yield txt

feed = urllib.urlopen("http://server-up.theatticnetwork.net/demo/")
try:
    tree = ET.parse(feed)
except Exception, inst:
    print "Unexpected error opening %s: %s" % (tree, inst)
root = tree.getroot()
text = root.getchildren()
for item in text:
    if item.tag == "Memory":
        extra = "Memory"
    else:
        extra = "Other"
    for element in find_text(item):
        print extra + element.tag
adam291086
You changed the indentation of the for subelement in element: line.
Gribouillis
Posting Maven
2,786 posts since Jul 2008
Reputation Points: 1,044
Solved Threads: 691
??? Should that make a difference?
adam291086
Yes, because with your new code the for loop runs only if the element has text, which is not what you want.
Gribouillis
You should rewrite find_text as
def find_text(element):
    if element.text:
        yield element
    for subelement in element:
        for txt in find_text(subelement):
            yield txt
If you wrote it like this, I can't see how it could fail.
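To make the effect of the indentation concrete, here is a small sketch. The compact sample document is invented for illustration, and find_text_nested and find_text_flat are hypothetical names for the two indentations discussed above. It assumes an element with no white space before its first child, so its text is None.

```python
from xml.etree import ElementTree as ET

# Made-up compact document: no white space, so the root's text is None.
SAMPLE = "<server><Memory>512</Memory><Disk>80</Disk></server>"

def find_text_nested(element):
    # Loop indented under the if: children are visited only when text exists.
    if element.text:
        yield element
        for subelement in element:
            for txt in find_text_nested(subelement):
                yield txt

def find_text_flat(element):
    # Loop at function level: children are always visited.
    if element.text:
        yield element
    for subelement in element:
        for txt in find_text_flat(subelement):
            yield txt

root = ET.fromstring(SAMPLE)
nested_tags = [element.tag for element in find_text_nested(root)]
flat_tags = [element.tag for element in find_text_flat(root)]
print(nested_tags)
print(flat_tags)
```

Here the nested version finds nothing, because the root has no text and its children are never examined, while the flat version reaches both child elements.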
Gribouillis
I have done that and it is still the sodding same.
adam291086
I have contacted the hosting company, and they said they got the following:
=================================
Content-type: text/html
adam
Traceback (most recent call last):
  File "adam.py", line 20, in ?
    print "Unexpected error opening %s: %s" % (tree, inst)
NameError: name 'tree' is not defined
=================================
But I don't get that error. What is causing this problem?
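One reading of that traceback: ET.parse raised on the server, so tree was never assigned, and the except branch then failed with a NameError while trying to print tree, which hid the real error. A minimal sketch of a handler that avoids this follows. The helper parse_or_report and its label argument are hypothetical, the in-memory samples stand in for the real feed, and the "except ... as" syntax assumes Python 2.6 or later.

```python
import io
from xml.etree import ElementTree as ET

def parse_or_report(source, label):
    """Return the parsed tree, or None after printing why parsing failed.

    `label` is a hypothetical description of the source (for example the URL);
    the message never refers to the tree, which does not exist on failure.
    """
    try:
        return ET.parse(source)
    except Exception as inst:
        print("Unexpected error opening %s: %s" % (label, inst))
        return None

# In-memory stand-ins for the feed: one well-formed, one truncated.
good = parse_or_report(io.BytesIO(b"<server><Memory>512</Memory></server>"), "good sample")
bad = parse_or_report(io.BytesIO(b"<server><Memory>"), "broken sample")
```

Reporting the label rather than tree means the underlying parse error is what gets printed, and the caller can check for None before calling getroot().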
adam291086
It turned out the hosting company had changed the version of the ElementTree module, so I had to change the import to import cElementTree as ET.
Thanks for all the help
Adam
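Since the module name depends on what the host has installed, one common guard is a chain of fallback imports. This is only a sketch: which of these modules exists on a given server is an assumption, and the order simply prefers the standard-library location (Python 2.5 and later) before the older standalone packages mentioned in this thread.

```python
# Try the standard-library location first, then the standalone packages.
try:
    import xml.etree.ElementTree as ET
except ImportError:
    try:
        import cElementTree as ET
    except ImportError:
        import elementtree.ElementTree as ET

# Quick check that whichever module was found can parse a document.
sample_root = ET.fromstring("<server><Memory>512</Memory></server>")
print(sample_root.tag)
```

The rest of the script can keep using ET.parse and getroot() unchanged, whichever import succeeds.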
adam291086