
XML reading

I have this script, and it was working a couple of weeks ago.

Now when I run it, nothing is printed other than whitespace. Why?

#!/usr/bin/env python

from xml.etree import ElementTree as ET
import os
import urllib

def find_text(element):
        if element.text is None:
                for subelement in element:
                        for txt in find_text(subelement):
                                yield txt

        else:
                yield element.text

feed = urllib.urlopen("http://server-up.theatticnetwork.net/demo/")
try:
    tree = ET.parse(feed)
		
except Exception, inst:
    print "Unexpected error opening %s: %s" % (tree, inst)
    
root= tree.getroot()

for txt in find_text(root):
        print txt
adam291086
Junior Poster in Training
61 posts since Nov 2008
Reputation Points: 10
Solved Threads: 0
 

Your implementation of find_text would only attempt to iterate to lower elements if the current element had no text.

Having text does not preclude having child nodes.

I rewrote your find_text as follows and I'm seeing output now:

def find_text(element):
        if element.text:
                yield element.text
                
        for subelement in element:
                for txt in find_text(subelement):
                        yield txt
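
To illustrate the difference, here is a minimal sketch. It assumes Python 3 (the thread's code is Python 2) and a made-up, pretty-printed XML snippet, since the real feed's contents are not shown in this thread; the names SAMPLE, find_text_original and find_text_fixed are hypothetical. In indented XML the root's text is the whitespace before its first child rather than None, which would explain output consisting only of whitespace.

```python
from xml.etree import ElementTree as ET

# Hypothetical sample: indented XML, so root.text is "\n  " rather than None.
SAMPLE = "<root>\n  <memory>512</memory>\n</root>"

def find_text_original(element):
    # Logic from the question: descend only when the element has no text.
    if element.text is None:
        for subelement in element:
            for txt in find_text_original(subelement):
                yield txt
    else:
        yield element.text

def find_text_fixed(element):
    # Rewritten logic: yield the text if present, then always descend.
    if element.text:
        yield element.text
    for subelement in element:
        for txt in find_text_fixed(subelement):
            yield txt

root = ET.fromstring(SAMPLE)
original = list(find_text_original(root))
fixed = list(find_text_fixed(root))
print(repr(original))
print(repr(fixed))
```

With this sample, the original version stops at the root's whitespace, while the rewritten version also reaches the text of the memory element.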
Murtan
Practically a Master Poster
671 posts since May 2008
Reputation Points: 344
Solved Threads: 116
 

That works, thanks.

Just one problem: it works in my Python shell, but it won't work in the cgi-bin on my website. What have I missed?

#!/usr/bin/env python
import elementtree.ElementTree as ET
import os
import urllib
print "Content-type: text/html\n"

def find_text(element):
      if element.text:
          yield element
          for subelement in element:
              for txt in find_text(subelement):
                  yield txt

feed = urllib.urlopen("http://server-up.theatticnetwork.net/demo/")
try:
    tree = ET.parse(feed)
		
except Exception, inst:
    print "Unexpected error opening %s: %s" % (tree, inst)
    
root= tree.getroot()
text = root.getchildren()

for item in text:
    if item.tag =="Memory":
        extra = "Memory"
    else:
        extra = "Other"

    for element in find_text(item):
        print extra+element.tag
adam291086

anyone?

adam291086

You changed the indentation of the line for subelement in element: in find_text.

Gribouillis
Posting Maven
Moderator
2,786 posts since Jul 2008
Reputation Points: 1,044
Solved Threads: 691
 

Should that make a difference?

adam291086

Yes, because with your new code the for loop is executed only if the element has text, which is not what you want.
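
A small sketch of the effect of that indentation, assuming Python 3 and a made-up XML string in which the root has no text of its own. The names SAMPLE, find_text_nested and find_text_dedented are hypothetical and exist only for this comparison.

```python
from xml.etree import ElementTree as ET

# Hypothetical sample: the root has no text, only a child that does.
SAMPLE = "<root><memory>512</memory></root>"

def find_text_nested(element):
    # The for loop sits inside the if, so children are visited
    # only when the parent itself has text.
    if element.text:
        yield element
        for subelement in element:
            for found in find_text_nested(subelement):
                yield found

def find_text_dedented(element):
    # The for loop is at function level, so children are always visited.
    if element.text:
        yield element
    for subelement in element:
        for found in find_text_dedented(subelement):
            yield found

root = ET.fromstring(SAMPLE)
nested_tags = [e.tag for e in find_text_nested(root)]
dedented_tags = [e.tag for e in find_text_dedented(root)]
print(nested_tags)
print(dedented_tags)
```

Here the nested version finds nothing, because the textless root blocks the descent, while the dedented version reaches the memory element.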

Gribouillis

Even if I remove the if statement altogether, nothing is printed. Anything below these lines
feed = urllib.urlopen("http://server-up.theatticnetwork.net/demo/")
try:

won't print.

adam291086

You should rewrite find_text as

def find_text(element):
    if element.text:
        yield element
    for subelement in element:
        for txt in find_text(subelement):
            yield txt

If you wrote it like this, I can't see how it could fail.

Gribouillis

I have done that, and it is still the sodding same.

adam291086

I have contacted the hosting company, and they said they got the following:

=================================
Content-type: text/html

adam
Traceback (most recent call last):
  File "adam.py", line 20, in ?
    print "Unexpected error opening %s: %s" % (tree, inst)
NameError: name 'tree' is not defined
=================================

But I don't get that error. What is causing this problem?
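
On the NameError itself: when ET.parse raises, the assignment to tree never happens, so the except block's reference to tree fails and hides the original error. A minimal sketch of one way around that, assuming Python 3; the helper parse_feed and its label argument are hypothetical, and an in-memory byte stream stands in for the real feed.

```python
import io
from xml.etree import ElementTree as ET

def parse_feed(source, label):
    """Parse XML from a file-like object; return (tree, error_message)."""
    try:
        tree = ET.parse(source)
    except ET.ParseError as inst:
        # Report the label, which is always defined, instead of `tree`,
        # which is unbound whenever ET.parse raises.
        return None, "Unexpected error opening %s: %s" % (label, inst)
    return tree, None

# Malformed input: the parse error is reported, with no NameError.
bad_tree, bad_error = parse_feed(io.BytesIO(b"<root><unclosed></root>"), "demo feed")
print(bad_error)

# Well-formed input: a tree comes back and no error message.
good_tree, good_error = parse_feed(io.BytesIO(b"<root/>"), "demo feed")
print(good_tree.getroot().tag)
```

Returning early on failure also avoids reaching tree.getroot() with no tree, which the scripts in this thread would do after printing the error.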

adam291086

It turned out the hosting company had changed the installed version of the ElementTree module, so I had to change the import to import cElementTree as ET.
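
For a script that has to survive this kind of host change, one option is a chain of fallback imports. This is only a sketch, and the order is an assumption: it tries the standalone cElementTree package, then the standalone elementtree package, then the xml.etree.ElementTree module bundled with Python since 2.5.

```python
# Try each known location of the ElementTree API in turn.
try:
    import cElementTree as ET  # standalone C implementation
except ImportError:
    try:
        import elementtree.ElementTree as ET  # standalone pure-Python package
    except ImportError:
        from xml.etree import ElementTree as ET  # standard library

# Quick check that whichever module was found can parse XML.
element = ET.fromstring("<a>x</a>")
print(element.text)
```

Whichever import succeeds, the rest of the script can keep using the name ET unchanged.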

Thanks for all the help

Adam

adam291086

This question has already been solved
