
Decoding binary file

Hi,

I'm new to Python and I was given an unknown (supposedly binary) file to read. The problem is that I am not sure how to read the 'actual' data. I have opened the file as:

f = open('/home/mmeclimate/test/test.dat', mode='rb')
f.readline()


The output from readline is:

'MO00\x14\x00\x00\x00\x00\x00\x00\x00\\\x00\x00\x00\xb8\x00\x00\x00\x14\x01\x00\x00p\x01\x00\x00\xcc\x01\x00\x00(\x02\x00\x00\x86\x02\x00\x00\xe4\x02\x00\x00B\x03\x00\x00\xa0\x03\x00\x00\xfe\x03\x00\x00\\\x04\x00\x00\xba\x04\x00\x00\x18\x05\x00\x00v\x05\x00\x00\xd2\x05\x00\x00.\x06\x00\x00\x8a\x06\x00\x00\xf2\x06\x00\x00^\x07\x00\x00\x01\x00\x00\x00\x00\x00\x00\x00\x01\x00\x00\x00\x00\x00\x00\x00\x05 \x00\x00\x02\x00\x00\x00t\x00\x00\x00\x08\x00\x0e\x00\x00\x004\x00.\x00 \x00S\x00A\x00 \x000\x008\x00.\x001\x000\x00.\x000\x001\x00\n


This is only one line of the file, but how do I decode it? And is there a way to automate that?

Thanks in advance!

Mike

mmeclimate
 

The output of
od -Ax -t x1z /home/mmeclimate/test/test.dat

is my preferred way of dumping binary data in an unambiguous textual representation.

Salem
 

Hi,

Thank you so much for the help! Is there a 'pythonic' way of doing the same thing, for example, as in the output of:

od -t cx1 test.dat

Thanks!

Michael

mmeclimate
 

There are some conversion functions in the module binascii. For example, you can convert to readable hex format with binascii.hexlify. I don't know if it's what you're looking for.
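As a rough sketch of what that could look like, here is a small od-style dump built on binascii.hexlify. The function name hexdump and its layout are my own choices for illustration, and it assumes Python 3, where indexing a bytes object gives integers. The sample bytes are the first 16 bytes of the readline() output posted above.

```python
import binascii

def hexdump(data, width=16):
    """Return an od-style dump: offset, hex bytes, printable characters."""
    lines = []
    for offset in range(0, len(data), width):
        chunk = data[offset:offset + width]
        # hexlify gives b'4d4f3030...'; split it into two-digit pairs
        hex_text = binascii.hexlify(chunk).decode("ascii")
        pairs = " ".join(hex_text[i:i + 2] for i in range(0, len(hex_text), 2))
        # show printable ASCII as-is and everything else as '.'
        text = "".join(chr(b) if 32 <= b < 127 else "." for b in chunk)
        lines.append("%06x  %-*s  %s" % (offset, width * 3 - 1, pairs, text))
    return "\n".join(lines)

# First 16 bytes of the output shown in the first post
sample = b"MO00\x14\x00\x00\x00\x00\x00\x00\x00\\\x00\x00\x00"
print(hexdump(sample))
```

To dump a real file, read a block with open(path, 'rb').read(256) and pass the result to hexdump, rather than using readline(), which splits binary data at arbitrary 0x0a bytes.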

Gribouillis
 

I'm assuming you want to do more with it than just print it.

Salem
 

I just want to be able to read the data from the file to use it in a new database.

mmeclimate
 

But how would I use the binascii module to 'decode' my data, for example, if I wanted to decode the output I provided in my first post to this thread?
Thanks,

Michael

mmeclimate
 

I thought you said the format was unknown to you?

Reading it in isn't the problem.
Working out what all the bytes mean is.

Without a nice clean dump of some of the data, figuring that bit out will be next to impossible.

Do you have any idea what the data is?

Or do we take a guess based on your chosen username and first post?
http://clusty.com/search?query=mme+climate&sourceid=Mozilla-search

Salem
 

I might be fishing here, but is this a Macintosh binhex4 format file?
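One quick way to test that guess: as far as I know, BinHex 4.0 files are plain text and carry a banner line near the top that starts with "(This file must be converted with BinHex". The helper below is a hypothetical heuristic of mine, not a full parser, and it assumes Python 3. The posted data starts with 'MO00' followed by NUL bytes, so it does not match.

```python
# Banner text that BinHex 4.0 files are expected to carry near the top
BINHEX4_BANNER = b"(This file must be converted with BinHex"

def looks_like_binhex4(data):
    """Heuristic only: look for the BinHex banner in the first 1 KiB."""
    return BINHEX4_BANNER in data[:1024]

# First 16 bytes of the output shown in the first post
first_bytes = b"MO00\x14\x00\x00\x00\x00\x00\x00\x00\\\x00\x00\x00"
print(looks_like_binhex4(first_bytes))
```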

vegaseat
 


Hi,

It is a climate dataset given as codes. The problem is that I was not given the information on how the data is arranged in the file.
But let's suppose the data is arranged like this (hypothetically):

variable value units
temp 23 C
press 1001 mb

And suppose I dump the file using 'od -c file.dat': how do I join the output to be able to extract the dataset?

Thanks,

Michael

mmeclimate
 

Well the output of od would enable you to confirm such things as
- whether 2 or 4 byte integers were used
- whether 4 or 8 byte floats were used
- whether the data is stored in big-endian or little-endian format
- whether there is a header, say perhaps containing the number of records.
and so on.
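The same checks can be tried from Python with the struct module. The sketch below takes the first 24 bytes of the readline() output from the first post and unpacks them under a few candidate layouts. It assumes Python 3, and the interpretation in the comments is only a guess from that one sample, not a confirmed description of the format.

```python
import struct

# First 24 bytes from the readline() output in the first post
header = (b"MO00\x14\x00\x00\x00\x00\x00\x00\x00"
          b"\\\x00\x00\x00\xb8\x00\x00\x00\x14\x01\x00\x00")

magic = header[:4]   # b'MO00', possibly a file signature (a guess)
body = header[4:]    # the 20 bytes that follow it

# Try the same bytes under several layouts and see which looks plausible
candidates = {
    "little-endian uint32": "<5I",
    "big-endian uint32": ">5I",
    "little-endian uint16": "<10H",
}
for label, fmt in candidates.items():
    print(label, struct.unpack(fmt, body))

# The little-endian uint32 reading gives (20, 0, 92, 184, 276): small,
# steadily increasing numbers, which might be a table of offsets.

# The tail of the same line has a NUL after every character, which is
# what UTF-16 little-endian text looks like.
text = b"4\x00.\x00 \x00S\x00A\x00 \x000\x008\x00.\x001\x000\x00.\x000\x001\x00"
print(text.decode("utf-16-le"))  # 4. SA 08.10.01
```

The big-endian reading produces values in the hundreds of millions, which is usually a sign of the wrong byte order.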

If you know (or at least have some idea) what the format is, then you can start writing code to deal with it.
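For instance, here is a minimal sketch for the variable/value/units example given earlier. The record layout (8-byte name, 4-byte little-endian float, 4-byte units) is entirely HYPOTHETICAL and invented for illustration; the real file almost certainly differs. The names RECORD and parse_records are mine, and the code assumes Python 3.

```python
import struct

# HYPOTHETICAL layout, invented for illustration only:
# 8-byte ASCII name, 4-byte little-endian float, 4-byte ASCII units
RECORD = struct.Struct("<8sf4s")

def parse_records(data):
    """Yield (name, value, units) from back-to-back fixed-size records."""
    for offset in range(0, len(data) - RECORD.size + 1, RECORD.size):
        name, value, units = RECORD.unpack_from(data, offset)
        # fixed-width text fields are padded with NUL bytes; strip them
        yield (name.rstrip(b"\x00").decode("ascii"),
               value,
               units.rstrip(b"\x00").decode("ascii"))

# Build two fake records matching the example table, then read them back
blob = RECORD.pack(b"temp", 23.0, b"C") + RECORD.pack(b"press", 1001.0, b"mb")
for record in parse_records(blob):
    print(record)
```

Once the real layout is known, only the format string and the field clean-up should need to change.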

Do you have an application which also reads this file, and can display the results?

Salem
 


Hi,

Thanks for the help! I'll contact the person who gave me the data for more info. But what you've told me here was very helpful! Thank you so much!
I have a different application to display the results; I just need to get the data out properly. I'll work on that!

Thanks!

Michael

mmeclimate
 
