Hello friends, I have a small problem with parsing XML documents.
My program works great, but if an element does not exist in the XML I get an exception. How can I check whether an element is present in the XML?

Here is my code:

from xml.dom.minidom import parseString
...
        doc = parseString( data )
        items = doc.getElementsByTagName('item')
        
        titles = []
        links = []
        
        for i in items:
            titles.append( i.getElementsByTagName('title')[0].firstChild.data )
            links.append( i.getElementsByTagName('link')[0].firstChild.data )

        return dict( {'title' : titles, 'link' : links } )

XML code should be:

<item>
    <title></title>
    <link></link>
</item>

...and now, if title or link is missing from the item, I get an IndexError (index out of range).
So, how can I check whether an element is present in the XML?

Thanks

You can check the number of child nodes of each element and the node type of its first child:

data = """<item>
    <title>mytitle</title>
    <link></link>
</item>"""

from xml.dom.minidom import parseString, Element

def genSimpleData(element, tag):
    """yield the string data for all subelements with the given tag
    which have a single text node child"""
    for node in element.getElementsByTagName(tag):
        if len(node.childNodes) == 1 and node.firstChild.nodeType == Element.TEXT_NODE:
            yield node.firstChild.data

def main():
    doc = parseString( data )
    items = doc.getElementsByTagName('item')

    titles = []
    links = []

    for i in items:
        titles.extend(genSimpleData(i, 'title'))
        links.extend(genSimpleData(i, 'link'))

    return dict( {'title' : titles, 'link' : links } )

print(main())

""" my output (Python 2) --->

{'link': [], 'title': [u'mytitle']}
"""
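
One thing to watch with the generator approach: an item with a missing or empty tag contributes nothing, so titles and links can end up with different lengths and no longer line up by position. If that matters, here is a minimal sketch that appends a default value instead. The names first_text and parse_items and the sample feed are made up for illustration; they are not part of the code above.

```python
from xml.dom import Node
from xml.dom.minidom import parseString

def first_text(element, tag, default=''):
    """Return the text of the first <tag> descendant of element, or
    default if the tag is missing or has no text child.
    (Hypothetical helper, for illustration only.)"""
    found = element.getElementsByTagName(tag)
    if not found:
        # Tag is not present at all: indexing [0] here would raise IndexError.
        return default
    child = found[0].firstChild
    if child is None or child.nodeType != Node.TEXT_NODE:
        # Empty element such as <title></title>, or a non-text first child.
        return default
    return child.data

def parse_items(data):
    """Collect one title and one link per <item>, keeping both lists aligned."""
    doc = parseString(data)
    titles, links = [], []
    for item in doc.getElementsByTagName('item'):
        titles.append(first_text(item, 'title'))
        links.append(first_text(item, 'link'))
    return {'title': titles, 'link': links}

# Made-up sample: the second item has no <link>, the third has an empty <title>.
sample = """<rss>
<item><title>first</title><link>http://example.com/1</link></item>
<item><title>second</title></item>
<item><title></title><link>http://example.com/3</link></item>
</rss>"""

result = parse_items(sample)
print(result)
```

With this sample, result['title'] is ['first', 'second', ''] and result['link'] is ['http://example.com/1', '', 'http://example.com/3'], so index i in both lists refers to the same item.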

@Gribouillis, thanks for your answer. I solved my problem by checking the number of matching nodes:

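found = node.getElementsByTagName(tag)
if len(found) != 1:
    values.append('')  # values stands for whichever list is being filled (titles or links)
else:
    # firstChild is None when the element is empty, e.g. <link></link>
    values.append(found[0].firstChild.data)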

Thank you again :)
