I have written a simple piece of code in Python:

s = b'B1=A\xF1adir+al+carrito\n'.decode('latin-1')
print(s)
with open('lat.txt', 'wb') as f:
    f.write(s.encode('latin-1'))

The output is B1=Añadir+al+carrito, and the file has the same content.

But when I try to read from a file (with the content B1=A\xF1adir+al+carrito):

for lines in open('mytxt1.txt', 'rb'):
    print(lines)
    s = lines.decode('latin-1')
    print(s)
    with open('lat1.txt', 'wb') as f:
        f.write(bytes(s, 'latin-1'))

I don't get the output B1=Añadir+al+carrito; instead I get B1=A\xF1adir+al+carrito,
and the file is empty.
Any idea what I should do?

Edited 1 Year Ago by Zeinab_1: couldn't insert code at first place

Use the encoding parameter of codecs.open() when reading and writing. You can encode and decode manually yourself, but I find that this method works without problems. Also, note that how the text prints depends on the default encoding of your OS.

import codecs

s = b'B1=A\xF1adir+al+carrito\n'.decode('latin-1')
with codecs.open('lat.txt', mode="wb", encoding='latin-1') as fp:
    fp.write(s)

with codecs.open('lat.txt', "r", encoding='latin-1') as fp:
    r=fp.read()

print(s)
print(r)
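
Regarding the original question, one possible cause (an assumption on my part, since I can't see the file) is that mytxt1.txt holds the four literal characters \, x, F, 1 rather than the single byte 0xF1. In that case latin-1 decodes them unchanged, and the unicode_escape codec is one way to interpret them. A minimal sketch, where the helper name decode_escaped_line is made up for illustration:

```python
def decode_escaped_line(raw):
    """Interpret literal backslash escapes (e.g. the text \\xF1) in raw bytes."""
    return raw.decode('unicode_escape')


# Assumed file content: backslash, x, F, 1 typed as four separate characters
literal = b'B1=A\\xF1adir+al+carrito\n'
# What the first snippet in the question works with: one byte, 0xF1
real = b'B1=A\xF1adir+al+carrito\n'

# The literal version is three bytes longer than the real one
print(len(literal) - len(real))

text = decode_escaped_line(literal)
print(text)

# Encoding the result as latin-1 gives the single-byte form
print(text.encode('latin-1') == real)
```

Also note that 'wb' truncates the file every time it is opened, so opening lat1.txt once, before the loop, avoids keeping only whatever the last line produced.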

Edited 1 Year Ago by woooee

Note that in Python 3, the built-in function open() has an encoding parameter, so you don't need to use codecs.open(). Otherwise, use io.open() for cross-Python code; in Python 3 it is the same function as the built-in open() :)

Python 3.4.0 (default, Jun 19 2015, 14:20:21) 
>>> import codecs
>>> codecs.open is open
False
>>> import io
>>> io.open is open
True
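
For reference, here is a minimal sketch of the same latin-1 round trip using only the built-in open() in Python 3. The temporary directory is just an assumption so the example does not touch your real files, and newline='' is there only to keep the bytes identical on every platform.

```python
import os
import tempfile

text = 'B1=A\xf1adir+al+carrito\n'
# Hypothetical location, used only for this example
path = os.path.join(tempfile.mkdtemp(), 'lat.txt')

# Text mode plus an explicit encoding: open() encodes on write
with open(path, 'w', encoding='latin-1', newline='') as fp:
    fp.write(text)

# ... and decodes on read
with open(path, 'r', encoding='latin-1', newline='') as fp:
    read_back = fp.read()

# Binary mode shows the bytes that actually ended up on disk
with open(path, 'rb') as fp:
    raw = fp.read()

print(read_back == text)
print(raw)
```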

Edited 1 Year Ago by Gribouillis

Also, strings in Python 3 are unicode so encode and decode are not necessary.

That's only true if the text is already inside Python 3.
Here we are talking about taking text from outside into Python 3,
so we must specify an encoding such as utf-8 or latin-1,
or it will give an error or stay a byte string.

Because text must be read with the correct encoding when it is taken into Python 3,
open() also has new options such as errors='ignore' and errors='replace':

with open('some_file', 'r', encoding='utf-8', errors='ignore') as f:
    print(f.read())
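
To see what those error handlers do, here is a small sketch. It uses bytes.decode(), which accepts the same errors values as open(), and the sample bytes are my own example: latin-1 data that is deliberately decoded with the wrong codec.

```python
# latin-1 bytes for 'Añadir'; the byte 0xF1 is not valid UTF-8 here
data = 'A\xf1adir'.encode('latin-1')

# Default (errors='strict'): decoding with the wrong codec raises
try:
    data.decode('utf-8')
    strict_result = 'decoded'
except UnicodeDecodeError:
    strict_result = 'UnicodeDecodeError'

# errors='ignore' silently drops the bad byte
ignored = data.decode('utf-8', errors='ignore')

# errors='replace' substitutes U+FFFD, the replacement character
replaced = data.decode('utf-8', errors='replace')

# The right codec needs no error handler at all
correct = data.decode('latin-1')

print(strict_result)
print(repr(ignored), repr(replaced), repr(correct))
```

Both 'ignore' and 'replace' lose the original character, so they are a fallback rather than a substitute for knowing the real encoding.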

So, about the statement
"In Python 3, all strings are sequences of Unicode characters":
yes, this is true,
but only once all text taken in from outside has been correctly decoded into Python 3.

Edited 1 Year Ago by snippsat

FYI ...

# get your current locale encoding

import locale

print(locale.getpreferredencoding(False))  # eg. US-ASCII

As of Python 3.4.3, these are the parameters of open() ...
open(file, mode='r', buffering=-1, encoding=None, errors=None, newline=None, closefd=True, opener=None)

If you leave encoding at its default of None, your current locale encoding is applied. Also, the default mode is really 'r' plus text mode 't', that is, 'rt'.
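
A quick way to check this on your own machine is sketched below. The temporary file is an assumption made just for the example, and the encoding names printed will differ from system to system.

```python
import codecs
import locale
import os
import tempfile

# Hypothetical scratch file, created only for this check
path = os.path.join(tempfile.mkdtemp(), 'default.txt')
with open(path, 'w', encoding='ascii') as fp:
    fp.write('plain ascii\n')

# No mode and no encoding given: inspect what open() picked
with open(path) as fp:
    default_mode = fp.mode
    default_encoding = fp.encoding

preferred = locale.getpreferredencoding(False)

print(default_mode)
# The two names may be spelled differently, so show the normalized codec names
print(codecs.lookup(default_encoding).name, codecs.lookup(preferred).name)
```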
