I need to read and write an XML file that may contain '€' symbols. To start, I run this code:

from xml.etree import ElementTree as ET
optionstree = ET.parse("test.conf")

where "test.conf" is the following XML file:

<?xml version="1.0" encoding="UTF-8"?>
<test>12€34</test>

and I get this error:

File "<pyshell#7>", line 1, in <module>
  optionstree = ET.parse("test.conf")
File "C:\Python27\lib\xml\etree\ElementTree.py", line 1177, in parse
  tree.parse(source, parser)
File "C:\Python27\lib\xml\etree\ElementTree.py", line 653, in parse
  parser.feed(data)
File "C:\Python27\lib\xml\etree\ElementTree.py", line 1624, in feed
  self._raiseerror(v)
File "C:\Python27\lib\xml\etree\ElementTree.py", line 1488, in _raiseerror
  raise err
ParseError: not well-formed (invalid token): line 2, column 8

How can I parse an XML file containing a '€' symbol using xml.etree?

Last Post by Gribouillis

It works for me with Python 2.7.2 on Linux. You could try passing an explicit parser with the encoding set:

from xml.etree import ElementTree as ET
from xml.etree.ElementTree import XMLParser
parser = XMLParser(encoding="utf-8")
optionstree = ET.parse("test.conf", parser=parser)
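Since the same file parses on Linux, one possible cause (an assumption, not something confirmed in this thread) is that the Windows editor saved test.conf in a legacy code page such as cp1252 while the XML declaration says UTF-8. In cp1252 the '€' is the single byte 0x80, which is not valid UTF-8, and the reported position (line 2, column 8) is exactly where the '€' sits. The sketch below writes two temporary sample files to show the difference; the helper names first_invalid_utf8_offset and write_sample and the file names good.conf and bad.conf are made up for illustration.

```python
# -*- coding: utf-8 -*-
import io
import os
import tempfile
from xml.etree import ElementTree as ET

# Same content as test.conf in the question; \u20ac is the euro sign.
XML_TEXT = u'<?xml version="1.0" encoding="UTF-8"?>\n<test>12\u20ac34</test>\n'


def first_invalid_utf8_offset(path):
    """Return the byte offset of the first invalid UTF-8 sequence, or None."""
    with io.open(path, "rb") as f:
        data = f.read()
    try:
        data.decode("utf-8")
    except UnicodeDecodeError as exc:
        return exc.start
    return None


def write_sample(path, encoding):
    """Write the sample XML to path using the given byte encoding."""
    with io.open(path, "wb") as f:
        f.write(XML_TEXT.encode(encoding))


tmpdir = tempfile.mkdtemp()
good = os.path.join(tmpdir, "good.conf")  # really UTF-8, as declared
bad = os.path.join(tmpdir, "bad.conf")    # cp1252 bytes, but declares UTF-8
write_sample(good, "utf-8")
write_sample(bad, "cp1252")

print(first_invalid_utf8_offset(good))  # None means the bytes are valid UTF-8
print(first_invalid_utf8_offset(bad))   # an integer offset points at the bad byte

# The genuinely UTF-8 file parses and keeps the euro sign.
print(ET.parse(good).getroot().text == u"12\u20ac34")

# The mislabelled file fails in the parser.
try:
    ET.parse(bad)
except ET.ParseError as err:
    print("ParseError: %s" % err)
```

If first_invalid_utf8_offset returns a number for your real test.conf, re-saving the file as UTF-8 in your editor is probably the fix. If it returns None, the file is fine and the problem lies elsewhere.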

Edited by Gribouillis
