
Iterating and removing an element from ElementTree

Hi. I have an xml file like this:

<tree>
    <category title="Item 1">item 1 text
        <subitem title="subitem1">subitem1 text</subitem>
        <subitem title="subitem2">subitem2 text</subitem>
    </category>

    <category title="Item 2">item 2 text
        <subitem title="subitem21">subitem21 text</subitem>
        <subitem title="subitem22">subitem22 text</subitem>
    </category>
</tree>


I use ElementTree to parse this file, and that works okay. However, I now need to remove an item based on its title.

I can loop through the data, but for the life of me I can't work out how to remove the item once I have matched it:

import xml.etree.cElementTree as ET
    ...
    ...
    ...
    self.xml = ET.parse(self.fpath)
    ...
    ...
    ...
    titem = "subitem21"
    xml = self.xml.getroot()
    iterator = xml.getiterator("subitem")
    for item in iterator:
        f = item
        text = item.attrib["title"]
        if text == titem:
            xml.remove(f)


If I use print statements I can see that I am actually finding the right item, but I just can't work out how to remove it. It would seem that I should use either xml.remove() or item.remove(), but in either case I get an error stating that "x is not in the list", presumably saying that it can't find and delete the item that it just found!

Can anyone point me in the right direction please?

cheers

Max

MaxVK
Light Poster
46 posts since Nov 2008
Reputation Points: 10
Solved Threads: 1
 

According to the documentation for remove():
"Unlike the findXYZ methods this method compares elements based on the instance identity, not on tag value or contents."
Which means that your iterator is likely returning the contents and not the instance itself.

jlm699
Veteran Poster
1,112 posts since Jul 2008
Reputation Points: 355
Solved Threads: 292
 

Hmm, okay, I can run with that - Now how do I get a reference to the instance rather than the contents?

MaxVK
 

Hmm, after taking another look at your code, you must be getting the instance and not just the contents, since you're able to read its attrib dictionary... Have you tried using remove(item) instead of remove(f)?

jlm699
 

Yes. I get the same error, which is simply that the item is not in the list. I've tried getting the element half a dozen ways, and so far that error is the closest that I've come to removing an item.

If I just try to "print item" it shows that it is an element, but whichever way I treat it, I get the same error. Kind of driving me nuts now!

[edit]
More specifically I get "ValueError: list.remove(x): x not in list" as the error.

regards

Max

MaxVK
 

Okay, the problem has been solved with some help from others. The problem was that the item I wanted to delete was a child item, and I needed to use the parent to perform the remove:

parent.remove(child)
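
To illustrate, here is a minimal sketch using a trimmed, well-formed version of the sample XML. The variable names (SAMPLE, category, target, failed, remaining) are only for illustration. As far as I can tell, remove() only searches an element's direct children, which is why calling it on the root with a grandchild fails:

```python
import xml.etree.ElementTree as ET

# Trimmed, well-formed version of the sample file (illustrative only).
SAMPLE = """<tree>
    <category title="Item 2">item 2 text
        <subitem title="subitem21">subitem21 text</subitem>
        <subitem title="subitem22">subitem22 text</subitem>
    </category>
</tree>"""

root = ET.fromstring(SAMPLE)
category = root[0]     # direct child of the root
target = category[0]   # grandchild of the root (title="subitem21")

# The root does not hold the grandchild directly, so this raises ValueError.
try:
    root.remove(target)
    failed = False
except ValueError:
    failed = True

# The element's actual parent can remove it.
category.remove(target)
remaining = [s.get("title") for s in category]
```

After this runs, failed is True and only subitem22 is left under the category.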

I ended up with much simpler code that works nicely, provided you keep in mind where in the structure the child to be removed actually is. This code works to remove direct grandchildren of the root item! (Of course it's easy enough to expand it to remove any items, but I'll leave that to you!)

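for x in xml:                      # x: each child of the root (a category)
    for y in x:                    # y: each grandchild of the root (a subitem)
        if y.attrib["title"] == titem:
            x.remove(y)            # remove via the direct parent, not the root
            self.xml.write("output.xml")
            return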
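
For anyone who wants the "remove at any depth" version, one possible sketch follows. ElementTree elements do not keep a reference to their parent, so this builds a child-to-parent map first. The helper name remove_by_title and the SAMPLE string are my own assumptions for illustration, not part of ElementTree or the original code:

```python
import xml.etree.ElementTree as ET


def remove_by_title(root, tag, title):
    """Remove every <tag> element under root whose title attribute equals title.

    Hypothetical helper for illustration. Returns the number of elements removed.
    """
    # Elements have no parent pointer, so map each child to its parent.
    parent_of = {child: parent for parent in root.iter() for child in parent}
    # Collect matches first so the tree is not modified while iterating over it.
    matches = [el for el in root.iter(tag) if el.get("title") == title]
    removed = 0
    for el in matches:
        parent = parent_of.get(el)
        if parent is not None:  # the root has no parent and cannot be removed
            parent.remove(el)
            removed += 1
    return removed


# Well-formed sample in the same shape as the file from the question.
SAMPLE = """<tree>
    <category title="Item 1">item 1 text
        <subitem title="subitem1">subitem1 text</subitem>
    </category>
    <category title="Item 2">item 2 text
        <subitem title="subitem21">subitem21 text</subitem>
        <subitem title="subitem22">subitem22 text</subitem>
    </category>
</tree>"""

root = ET.fromstring(SAMPLE)
removed = remove_by_title(root, "subitem", "subitem21")
titles = [s.get("title") for s in root.iter("subitem")]
```

This uses iter() rather than getiterator(), since getiterator() was removed in Python 3.9, and get("title") rather than attrib["title"] so that elements without a title are skipped instead of raising KeyError.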


I hope this helps someone.

regards

Max

MaxVK
 

This question has already been solved
