import os
data = map(lambda c: ord(c), file(args[0]).read(os.path.getsize(args[0])))

For one file, os.path.getsize(args[0]) returns 10456 while len(data) returns 281.
After looking at different files, I realized that it always stops reading at the 0x1A character.
The documentation says that on Windows, Python uses _wfopen, which (for compatibility reasons) interprets 0x1A (CTRL-Z in DOS) as end-of-file when a file is opened in text mode.

Does anyone know how to read an entire binary file which may contain '\x1A'?

This post says to open the file in mode 'rb' or 'rU'.

I already tried 'rb'. Just tried 'rU' but it didn't work either.
Here is more of the code related to this issue:

 open(args[0], 'rb')
 import os
 filelength=os.path.getsize(args[0]) #gives correct file size
 data = map(lambda c: ord(c), file(args[0]).read())
 mdebug(5, "File is %(filelen)d, Data is %(len)d bytes" % {'filelen': filelength, 'len': len(data)})

Problem solved. Code now reads:

            f = open(args[0], mode='rb')
            import os
            data = map(lambda c: ord(c), f.read())

and works fine (not sure if whole program works on Windows yet, but this particular problem is solved).
I saw the post referred to in Gribouillis's response, but I didn't implement the suggestion correctly.
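
For anyone who wants to check the fix on their own machine, here is a minimal sketch in Python 3 syntax. The helper name read_bytes, the temporary file, and the test payload are made up for illustration; they are not from the code above. It writes a small file containing 0x1A, reads it back in 'rb' mode, and compares the result with os.path.getsize:

```python
import os
import tempfile


def read_bytes(path):
    """Return the file's contents as a list of integers (0-255)."""
    # 'rb' disables text-mode translation, so 0x1A is treated as ordinary data.
    with open(path, "rb") as f:
        return list(f.read())


# Build a small test file that contains the Ctrl-Z byte in the middle.
payload = b"abc\x1adef\r\n\x00\xff"
fd, tmp_path = tempfile.mkstemp()
with os.fdopen(fd, "wb") as f:
    f.write(payload)

data = read_bytes(tmp_path)
size_on_disk = os.path.getsize(tmp_path)
os.remove(tmp_path)

# True when nothing was cut off at the 0x1A byte.
print(len(data) == size_on_disk)
```

If the two numbers differ, the file was most likely opened in text mode somewhere along the way.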

Notice that

from struct import unpack
s = f.read()  # contents of the file opened with mode='rb'
data = list(unpack("%dB" % len(s), s))

is a much faster way to create the data.
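
A rough way to compare the two approaches yourself, written in Python 3 syntax. The function names via_unpack and via_loop and the stand-in buffer are my own, and the timings will vary by machine and Python version, so treat this as a sketch rather than a benchmark:

```python
import timeit
from struct import unpack

# Stand-in for file contents read in 'rb' mode (hypothetical data).
s = bytes(range(256)) * 40


def via_unpack(buf):
    # One call converts every byte to an unsigned integer.
    return list(unpack("%dB" % len(buf), buf))


def via_loop(buf):
    # Per-byte conversion in Python; buf[i:i + 1] is a length-1 bytes
    # object, which ord() accepts.
    return [ord(buf[i:i + 1]) for i in range(len(buf))]


# Both approaches must produce the same list of integers.
assert via_unpack(s) == via_loop(s)

t_unpack = timeit.timeit(lambda: via_unpack(s), number=20)
t_loop = timeit.timeit(lambda: via_loop(s), number=20)
print("unpack: %.4fs  loop: %.4fs" % (t_unpack, t_loop))
```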

With the advent of Python3 your life is easier:

# test binary file read 

fname = "ball.png"
with open(fname, mode='rb') as f:
    data = f.read()

print(type(data))

result with Python2 >>>
<type 'str'>
result with Python3  >>>
<class 'bytes'>

It's a very good remark. In Python 3, the bytes type is already a sequence of integers:

>>> s = bytes("hello", encoding="utf8")
>>> s
b'hello'
>>> s[0]
104
>>> s[1]
101
>>> list(s)
[104, 101, 108, 108, 111]

In Python 2, there is also an array type:

>>> import array
>>> x = array.array("B", "hello")
>>> list(x)
[104, 101, 108, 108, 111]
>>> x
array('B', [104, 101, 108, 108, 111])
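
The same idea carries over to Python 3, where the initializer has to be a bytes object rather than a str. A small sketch, assuming raw stands in for data read from a file opened with mode='rb'; it also shows the built-in bytearray, which is a mutable alternative when the bytes need to be edited in place:

```python
import array

raw = b"hello"  # in practice: contents of a file opened with mode='rb'

x = array.array("B", raw)  # typed array of unsigned bytes
y = bytearray(raw)         # built-in mutable sequence of integers

# Both support item assignment, unlike the immutable bytes type.
x[0] = 72
y[0] = 72

print(list(x))   # [72, 101, 108, 108, 111]
print(bytes(y))  # b'Hello'
```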