Part of my script reads a large text file (about 120 lines) line by line and needs to pick out the title. It does this by selecting everything between the two quotes. This works for most of the file, and then it stops working. It seems to be giving me Unicode.

Here is my open statement:

f = codecs.open(f_str, encoding='utf-8')

This is what I get when I inspect the line:

line
u'\u201cGlobalHUB: A Virtual Community For Global Engineering Education, Research, And Collaboration,"'

But when I print the line:
print line
“GlobalHUB: A Virtual Community For Global Engineering Education, Research, And Collaboration,"

it shows up correctly.

I am not sure what the problem is. I have tried encoding and decoding till the cows come home. I have also tried a plain open call, but then I was left with hex codes (I think).

Any suggestions?

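\u201c is the Unicode escape for the left smart (curly) double quote, U+201C, which is why it slants when printed. Inspecting line in the interpreter shows the repr of the string, which escapes non-ASCII characters, while print shows the character itself, so the data is the same in both cases. The smart quote is there because of how the file was originally written and won't change no matter how you read it. You can replace it with a regular quote using replace.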
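A minimal sketch of that replace approach, assuming the lines have already been decoded to unicode (as codecs.open does) and that the title sits between the first pair of double quotes. The helper names normalize_quotes and extract_title are made up for illustration and are not from the original script. Note that the sample line opens with a curly quote but closes with a straight one, which may be why matching "between the 2 quotes" fails on that line.

```python
# -*- coding: utf-8 -*-
import re

# Curly double quotes mapped to the plain ASCII double quote.
SMART_QUOTES = {
    u'\u201c': u'"',  # LEFT DOUBLE QUOTATION MARK
    u'\u201d': u'"',  # RIGHT DOUBLE QUOTATION MARK
}


def normalize_quotes(line):
    """Return line with curly double quotes replaced by straight ones."""
    for smart, plain in SMART_QUOTES.items():
        line = line.replace(smart, plain)
    return line


def extract_title(line):
    """Return the text between the first pair of double quotes, or None."""
    match = re.search(u'"(.*?)"', normalize_quotes(line))
    return match.group(1) if match else None


# The sample line from the question: curly opening quote, straight closing quote.
line = (u'\u201cGlobalHUB: A Virtual Community For Global Engineering '
        u'Education, Research, And Collaboration,"')
print(extract_title(line))
```

Normalizing first means the extraction step only ever has to deal with one kind of quote, whichever mix of curly and straight quotes the file contains.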
