I need to read and write an XML file which may contain '€' symbols. So, to start I run this code:

from xml.etree import ElementTree as ET
optionstree = ET.parse("test.conf")

where "test.conf" is the following XML file:

<?xml version="1.0" encoding="UTF-8"?>
<test>12€34</test>

and I get this error:

File "<pyshell#7>", line 1, in <module>
  optionstree = ET.parse("test.conf")
File "C:\Python27\lib\xml\etree\ElementTree.py", line 1177, in parse
  tree.parse(source, parser)
File "C:\Python27\lib\xml\etree\ElementTree.py", line 653, in parse
  parser.feed(data)
File "C:\Python27\lib\xml\etree\ElementTree.py", line 1624, in feed
  self._raiseerror(v)
File "C:\Python27\lib\xml\etree\ElementTree.py", line 1488, in _raiseerror
  raise err
ParseError: not well-formed (invalid token): line 2, column 8

How can I parse an XML file containing a '€' symbol using xml.etree?

It works for me with Python 2.7.2 on Linux. You could try passing an explicit parser:

from xml.etree import ElementTree as ET
from xml.etree.ElementTree import XMLParser
parser = XMLParser(encoding="utf-8")
optionstree = ET.parse("test.conf", parser=parser)