Dear web gods:

After much, much, much struggle with Unicode, many an hour reading all the examples online, coding them, testing them, ripping them apart and putting them back together, I am humbled. Therefore, I humble myself before you to seek guidance on a simple Python Unicode cgi-bin scripting problem.

My problem is more complex than this, but let me boil it down to one sticking point for starters. I have a file containing the Spanish word "año", which I want to read and print with this script:

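#!C:/Program Files/Python23/python.exe

# CGI header, a blank line, then the opening markup.
STARTHTML = u'''Content-Type: text/html

<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "">
<html xmlns="" lang="en" xml:lang="en">
'''
ENDHTML = u'''
</html>
'''
print STARTHTML
# Reads the raw bytes of the file and prints them without any decoding.
print open('c:/test/spanish.txt','r').read()
print ENDHTML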

Instead of seeing "año" I see "a�o". BAD BAD BAD
Yet, if I open the file with the browser (IE/Mozilla), I see "año." THIS IS WHAT I WANT
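
One plausible reading of this symptom, offered as an assumption rather than a diagnosis: spanish.txt is saved in a single-byte encoding such as Latin-1, the script passes those bytes through untouched, and the browser decodes the CGI page as UTF-8. When the file is opened directly, the browser is free to guess the encoding, which would explain why that case looks right. The sketch below uses Python 3 syntax (not the Python 2.3 named in the shebang), and its variable names are only illustrative.

```python
# Assumption: the file on disk holds "año" encoded as Latin-1.
word = "a\u00f1o"                       # the text "año"
latin1_bytes = word.encode("latin-1")   # what the file would contain
utf8_bytes = word.encode("utf-8")       # what a UTF-8 page would need

# Assumption: the browser decodes the page as UTF-8. The lone 0xF1 byte is
# not valid UTF-8, so it is shown as the replacement character U+FFFD.
seen_in_browser = latin1_bytes.decode("utf-8", errors="replace")

print(latin1_bytes)            # b'a\xf1o'
print(utf8_bytes)              # b'a\xc3\xb1o'
print(ascii(seen_in_browser))  # 'a\ufffdo'
```

If this is what is happening, the bytes sent and the charset the browser assumes have to agree, either by declaring the file's real encoding in the Content-Type header or by re-encoding the text before printing it.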


Next, I'll get into codecs and stuff, but how about starting with this?

The general question is, does anybody have a complete working example of a cgi-bin script that does the above properly that they'd be willing to share? I've tried various examples online but haven't been able to get any to work. I end up seeing hex escapes in place of the non-ASCII characters, first u'a\xf1o' and later on 'a\xc3\xb1o', which are also BAD BAD BAD.
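
For reference, u'a\xf1o' is how Python 2 displays the repr of the decoded text, and 'a\xc3\xb1o' is the repr of that same text encoded as UTF-8, so seeing either usually means a repr was printed instead of the string itself. A minimal sketch of a script that decodes the file, declares a charset, and writes matching bytes might look like the following. It assumes Python 3 rather than Python 2.3, assumes the file is saved as Latin-1, and the names build_page, SOURCE_PATH and FILE_ENCODING are hypothetical.

```python
import html
import os
import sys

SOURCE_PATH = "c:/test/spanish.txt"  # path taken from the post
FILE_ENCODING = "latin-1"            # assumption: how the file was saved


def build_page(raw_bytes, file_encoding=FILE_ENCODING):
    """Decode the file bytes, wrap them in HTML, and return UTF-8 bytes."""
    text = raw_bytes.decode(file_encoding)
    # The charset in the header must match the encoding used below.
    header = "Content-Type: text/html; charset=utf-8\r\n\r\n"
    body = (
        "<html><head><meta charset=\"utf-8\"></head>"
        "<body><p>" + html.escape(text) + "</p></body></html>\n"
    )
    return (header + body).encode("utf-8")


def main():
    # Read raw bytes so that decoding happens explicitly in build_page.
    with open(SOURCE_PATH, "rb") as handle:
        raw = handle.read()
    # Write bytes directly so no implicit console encoding gets in the way.
    sys.stdout.buffer.write(build_page(raw))


# Only emit a page when the source file is actually present.
if __name__ == "__main__" and os.path.exists(SOURCE_PATH):
    main()
```

The design choice here is to keep everything as text between one explicit decode on input and one explicit encode on output, so the header and the bytes cannot drift apart. If the file turns out to be UTF-8 or cp1252, only FILE_ENCODING would need to change.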

Thanks -- your humble supplicant.

Hi Kath, thanks for the reply. Yes, I've read that already ... and many others, besides. I may have made a little breakthrough in identifying the root of my problem. But I still need a solution. Please see the thread listed above, which details the problem as I see it now. Thx again.


Read this. It might help you.

