I currently have an XML (pasted below) that I want to place into a dictonary. With each dictonary key being the content of the XML`s <TITLE> markup and the values containing the content of the XML`s <ARTISTS*> markup.

<EXAMPLE>
  <CD>
    <TITLE>TITLE1</TITLE>
    <ARTIST1>ARTIST-ABC</ARTIST1>
    <ARTIST2>ARTIST-DEF</ARTIST2>
  </CD>
  <CD>
    <TITLE>TITLE2</TITLE>
    <ARTIST1>ARTIST-XYZ</ARTIST1>
  </CD>
  <CD>
    <TITLE>TITLE3</TITLE>
    <ARTIST1>ARTIST-GHJ</ARTIST1>
    <ARTIST1>ARTIST-123</ARTIST1>
    <ARTIST1>ARTIST-345</ARTIST1>
    <ARTIST1>ARTIST-987</ARTIST1>
  </CD>
</EXAMPLE>

Thanks in advance....

Recommended Answers

All 5 Replies

I already do that in my text game engine. you can check out some earlier code from the repo. the link is in my signature. It's quite simple actually. especially y our format. It's almost identical to the tag engine format.

(shameless plug... :p)

Thanks. Though still being exteremly new to python it still all looks pretty daunting.......

If you are new to Python and programming, then I think your task is a pretty big initial step. If just Python, then welcome to the light side: Python is way cool once you get into it.

Assuming the latter, then have a look at the expat example which shows you that, as each xml entity is seen, the appropriate callback is passed the appropriate part of the xml document. This allows you to easily handle the interesting bits, skip the bits that are not interesting in this context, and not keep the whole DOM in memory at the same time. Sax library gives you approximately the same thing. I don't know about Beautiful Soup. The example code is actually already useful (a little bit) because it can be used as is, or slightly extended to look at your document one chunk at a time, and seeing those chunks, in order, can help you understand how to write the rest of the code. I usually find it useful to leave the print lines in place, but slightly modified:

verbose = False
def dribble(*args):
  """Drool on screen in verbose mode only"""
  if verbose:
     print(*args)
# ...
# print something #commented out and replaced with
dribble(something)
# ...

Unless you have existing XML, or programs providing you with the XML, may I encourage you to use something else? See this amusing blog (disclaimer: With the exception of the 'American' one, I've been there and done that... and if you allow me to replace 'American' with some offshore programming shop or other, I've done that one too. It is a joke because sometimes it is indeed the right thing to do, and jokes help you cope.). Have you considered json?

Final thought, a warning: If the input XML is very regular, you will be tempted to use a simpler 'home grown' parser, perhaps based on regex. That would be a mistake for at least three reasons:

  1. You have to write it, debug it, document it and maintain it yourself
  2. Whoever supplies the XML could change the output without changing the 'meaning' (white space is optional in many places in XML, and order is mostly irrelevant)
  3. The next programmer who has to cope with your code (might even be you, a year or two later) will be better served if the API is standard, not home grown.
Be a part of the DaniWeb community

We're a friendly, industry-focused community of developers, IT pros, digital marketers, and technology enthusiasts meeting, networking, learning, and sharing knowledge.