I know I can accomplish this with regex; however, since I'm not familiar with regex (I have yet to find a good how-to or reference), I was wondering if there are other simple ways to parse XML files, or if there's a regex tutorial somewhere.

For example, I want to get the Hello World! from <test>Hello World!</test>. Also, how would I get the attributes from the tags, like <test name="hello">?

Thanks

Lol, I got the exact same thing after a quick Google search. I still have some questions, though: I'm not exactly sure what the difference between SAX and PyXML is.

PyXML or SAX should be better than using regex, since they are dedicated to parsing XML.

For simple lookups, I gave this solution in another thread. Dedicated modules are probably more powerful, of course, but they take some time to learn.

Anyway, here it is again:

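MyStr = "<test>some text here</test> <other> more text </other> <test> even more text</test>"

tlist = []
# Skip everything up to and including the first opening tag
_, found, t = MyStr.partition('<test>')

while found:
    # Split off the text before the closing tag
    t, found, more = t.partition('</test>')
    if found:
        tlist.append(t)  # append in place; no assignment needed
    else:
        raise ValueError("Missing end tag: " + t)

    # Look for the next opening tag in the rest of the string
    _, found, t = more.partition('<test>')

print(tlist)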
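Note that the partition loop above only matches the literal <test> tag, so it would miss a tag with attributes such as <test name="hello">. For comparison, here is a small sketch with the standard library's xml.etree.ElementTree, which is a different technique from the one above. The <root> wrapper and the sample string are assumptions added for illustration, since a well-formed XML document needs a single top-level element.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample; the <root> wrapper is added so the string is well-formed XML
doc = '<root><test name="hello">Hello World!</test><test>more text</test></root>'

root = ET.fromstring(doc)

# Text content of every <test> element, in document order
texts = [elem.text for elem in root.iter("test")]

# Value of the name attribute, or None where the attribute is absent
names = [elem.get("name") for elem in root.iter("test")]

print(texts)
print(names)
```

Here elem.text gives the text directly inside the tag, and elem.get("name") reads one attribute; elem.attrib would give all attributes as a dictionary.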

What I'm doing is sort of like a map, so I'll probably HAVE to learn a dedicated parser.

Try following chapter 9 of Dive Into Python: http://diveintopython.org/xml_processing/index.html

It's funny because I just got to chapter 9 today. I wouldn't have asked if I had been there yesterday, lol.
