import os
data = map(lambda c: ord(c), file(args[0]).read(os.path.getsize(args[0])))

For one file, os.path.getsize(args[0]) returns 10456 while len(data) returns 281.
After looking at different files, I realized that it always stops reading at the 0x1A character.
The documentation says that on Windows, Python uses _wfopen, which (for compatibility reasons) interprets 0x1A (CTRL-Z in DOS) as the end-of-file marker.

Does anyone know how to read an entire binary file that may contain '\x1A'?


This post says to open the file in mode 'rb' or 'rU'.

I already tried 'rb'. Just tried 'rU' but it didn't work either.
Here is more of the code related to this issue:

 open(args[0], 'rb')
 import os
 filelength=os.path.getsize(args[0]) #gives correct file size
 data = map(lambda c: ord(c), file(args[0]).read())
 mdebug(5, "File is %(filelen)d, Data is %(len)d bytes" % {'filelen': filelength, 'len': len(data)})

Problem solved. Code now reads:

import os
f = open(args[0], mode='rb')
filelength = os.path.getsize(args[0])
data = map(lambda c: ord(c), f.read())

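and works fine (I'm not sure yet whether the whole program works on Windows, but this particular problem is solved).
I saw the post referred to in Gribouillis's response, but I didn't implement the suggestion correctly: my earlier attempt called open(args[0], 'rb') but discarded the result, then read from a second file(args[0]) handle opened in the default text mode.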
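For anyone who wants to check this on their own machine, here is a minimal sketch that writes a small file containing 0x1A and reads it back in binary mode. The helper name read_byte_values and the sample payload are made up for illustration; bytearray is used so the same lines give integers on both Python 2 and Python 3.

```python
import os
import tempfile


def read_byte_values(path):
    """Return the file's contents as a list of integers in the range 0-255."""
    # 'rb' disables text-mode translation, so 0x1A is read as ordinary data.
    with open(path, "rb") as f:
        return list(bytearray(f.read()))


# Hypothetical sample payload with a 0x1A (CTRL-Z) byte in the middle.
payload = b"abc\x1adef\r\n\x00\xff"

fd, sample_path = tempfile.mkstemp()
try:
    with os.fdopen(fd, "wb") as f:
        f.write(payload)
    data = read_byte_values(sample_path)
    file_length = os.path.getsize(sample_path)
finally:
    os.remove(sample_path)

# Both numbers should match when the file is opened in binary mode.
print(file_length, len(data))
```

If the two numbers differ, the file is most likely still being opened in text mode somewhere.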

Notice that

from struct import unpack
s = f.read()
data = list(unpack("%dB" % len(s), s))

is a much faster way to create the data.
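As a quick sanity check, the unpack approach can be wrapped in a small function and run on a literal byte string. The name bytes_to_ints and the sample bytes are only for illustration; no timing claim is made here.

```python
from struct import unpack


def bytes_to_ints(s):
    """Convert a byte string to a list of integers in the range 0-255."""
    # "%dB" % len(s) builds a format such as "4B": len(s) unsigned bytes.
    return list(unpack("%dB" % len(s), s))


sample = b"\x00\x1a\x7f\xff"  # includes 0x1A on purpose
values = bytes_to_ints(sample)
print(values)
```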

With the advent of Python 3, your life is easier:

# test binary file read 

fname = "ball.png"
with open(fname, mode='rb') as f:
    data = f.read()

print(type(data))

'''
result with Python2 >>>
<type 'str'>
result with Python3  >>>
<class 'bytes'>
'''

That's a very good remark: in Python 3, the bytes type is already a sequence of integers.

>>> s = bytes("hello", encoding="utf8")
>>> s
b'hello'
>>> s[0]
104
>>> s[1]
101
>>> list(s)
[104, 101, 108, 108, 111]

In Python 2, there is also the array type from the array module:

>>> import array
>>> x = array.array("B", "hello")
>>> list(x)
[104, 101, 108, 108, 111]
>>> x
array('B', [104, 101, 108, 108, 111])
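The array type still exists in Python 3, but there the "B" typecode has to be initialized from a bytes object rather than a str. A small sketch of the Python 3 equivalent, using the same "hello" sample:

```python
import array

# In Python 3, pass bytes (note the b prefix); a plain str raises TypeError.
x = array.array("B", b"hello")
values = list(x)
print(values)

# tobytes() converts the array back to the original byte string.
round_trip = x.tobytes()
print(round_trip)
```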