I have written some simple code in Python:

s = b'B1=A\xF1adir+al+carrito\n'.decode('latin-1')
print(s)
with open ('lat.txt','wb') as f:
    f.write(bytes(s,'latin-1'))

The output is B1=Añadir+al+carrito, and the content of the file is the same.

But when I try to read the same content (B1=A\xF1adir+al+carrito) from a file:

for lines in open('mytxt1.txt','rb'):
    print(lines)
    s = lines.decode('latin-1')
    print(s)
    with open ('lat1.txt','wb') as f:
        f.write(bytes(s,'latin-1'))

I don't get the output B1=Añadir+al+carrito; instead I get B1=A\xF1adir+al+carrito,
and the file is empty.
Any idea what I should do?

All 5 Replies

Use the encoding parameter of codecs.open() when reading and writing. You can encode and decode manually, but I find that this approach works without problems. Also note that how the text prints depends on the default encoding of your OS.

import codecs

s = b'B1=A\xF1adir+al+carrito\n'.decode('latin-1')
with codecs.open('lat.txt', mode="wb", encoding='latin-1') as fp:
    fp.write(s)

with codecs.open('lat.txt', "r", encoding='latin-1') as fp:
    r=fp.read()

print(s)
print(r)

Note that in Python 3, the built-in open() function has an encoding parameter, so you don't need codecs.open(). Otherwise, use io.open() for code that must run on both Python 2 and Python 3 :)

Python 3.4.0 (default, Jun 19 2015, 14:20:21) 
>>> import codecs
>>> codecs.open is open
False
>>> import io
>>> io.open is open
True
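As a minimal sketch of the point about the built-in open(), assuming Python 3: the same latin-1 round trip can be written without codecs and without calling bytes() or decode() by hand. The temporary directory and the names path, round_trip and raw are only for illustration, and newline='' is passed so that the line ending is not translated on any platform.

```python
import os
import tempfile

text = b'B1=A\xF1adir+al+carrito\n'.decode('latin-1')

# Write into a temporary directory so the sketch does not clobber real files.
path = os.path.join(tempfile.mkdtemp(), 'lat.txt')

# Text mode plus an explicit encoding: open() encodes on write...
with open(path, 'w', encoding='latin-1', newline='') as fp:
    fp.write(text)

# ...and decodes on read.
with open(path, 'r', encoding='latin-1', newline='') as fp:
    round_trip = fp.read()

# Reading in binary mode shows the single 0xF1 byte stored for the n-tilde.
with open(path, 'rb') as fp:
    raw = fp.read()

print(round_trip == text)
print(raw)
```

Whether print() then shows the ñ correctly still depends on the encoding of the terminal.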

Also, strings in Python 3 are Unicode, so encode and decode are not necessary.

"Also, strings in Python 3 are Unicode, so encode and decode are not necessary."

That's only true if the text is already inside Python 3.
Here we are talking about taking text from outside into Python 3,
so we must specify an encoding such as utf-8 or latin-1,
or it will either raise an error or stay a byte string.

Because we must read with the correct encoding when taking text into Python 3,
open() also has an errors parameter, with values such as errors='ignore' and errors='replace':

with open('some_file', 'r', encoding='utf-8', errors='ignore') as f:
    print(f.read())
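To make the difference between the error handlers concrete, here is a small sketch, assuming Python 3. It uses bytes.decode() rather than open() so that it needs no file, but the errors values mean the same thing in both places. The variable names are only for illustration.

```python
# Latin-1 bytes for "Añadir"; the lone 0xF1 byte is not valid UTF-8.
data = b'A\xF1adir'

# The default (strict) handling raises instead of guessing.
try:
    data.decode('utf-8')
    strict_failed = False
except UnicodeDecodeError:
    strict_failed = True

ignored = data.decode('utf-8', errors='ignore')    # undecodable byte is dropped
replaced = data.decode('utf-8', errors='replace')  # byte becomes U+FFFD
correct = data.decode('latin-1')                   # right codec, nothing lost

print(strict_failed)
print(ascii(ignored), ascii(replaced), ascii(correct))
```

Note that 'ignore' and 'replace' only hide the problem: the ñ is lost either way, so choosing the correct encoding is still the real fix.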

So, about this statement:
"In Python 3, all strings are sequences of Unicode characters."
Yes, this is true,
but all text taken in from outside must first have been decoded with the correct encoding.

FYI ...

# get your current locale encoding

import locale

print(locale.getpreferredencoding(False))  # eg. US-ASCII

As of Python 3.4.3 these are the options with open() ...
open(file, mode='r', buffering=-1, encoding=None, errors=None, newline=None, closefd=True, opener=None)

If you use the default encoding of None, then your current locale encoding is applied. Also, the default mode is really 'r' combined with text 't', i.e. 'rt'.
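A quick way to check this on your own machine, sketched under the assumption of Python 3: open a file with no encoding argument and look at the attributes of the returned object. The temporary file and the names default_encoding, default_mode and is_text are only for illustration. The codec names are normalized with codecs.lookup() before comparing, because the same encoding can be spelled in different ways.

```python
import codecs
import io
import locale
import os
import tempfile

# Create an empty scratch file to open.
path = os.path.join(tempfile.mkdtemp(), 'scratch.txt')
open(path, 'w').close()

# No mode and no encoding given: rely on the defaults of open().
with open(path) as fp:
    default_encoding = fp.encoding
    default_mode = fp.mode
    is_text = isinstance(fp, io.TextIOWrapper)

locale_encoding = locale.getpreferredencoding(False)

print(default_mode, is_text)
print(codecs.lookup(default_encoding).name, codecs.lookup(locale_encoding).name)
```

Because this default differs from one machine to another, passing encoding= explicitly is the safer habit when the file's encoding is known.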
