I'm trying to write a parser for a game replay file.
I just need help getting started; I have docs that explain the format.

At the moment I'm just trying to get the header parsed.

replay = open(r'C:\Users\CookieMonster\Desktop\mix.w3g', 'rb')

#offset | size | description
print(replay.read(28))  # 0x0000 | 28 chars  | zero-terminated string "Warcraft III recorded game\0x1A\0"
print(replay.read(4))   # 0x001c |  1 dword  | file offset of first compressed data block (header size)
print(replay.read(4))   # 0x0020 |  1 dword  | overall size of compressed file
print(replay.read(4))   # 0x0024 |  1 dword  | replay header version:
print(replay.read(4))   # 0x0028 |  1 dword  | overall size of decompressed data (excluding header)
print(replay.read(4))   # 0x002c |  1 dword  | number of compressed data blocks in file

I get the output

b'Warcraft III recorded game\x1a\x00'

How do I parse these into something I can understand?
For example, how do I convert the second dword (size of the compressed file) from binary to an int?
I tried googling and found a suggestion to use int(..., 16), but it doesn't work: ValueError: invalid literal for int() with base 16: b'\xfby\x0b\x00'
Later on I'll have to convert hex into a string, so knowing how to do that would be useful too.
The bytes object docs didn't offer much help either.

Also, why does reading 4 bytes sometimes give back four \x.. escapes and sometimes only three?

Edited 6 Years Ago by Enders_Game: n/a

You want to look at the struct module, which lets you parse arbitrary sequences of binary types. You need to know whether the file uses a big-endian or little-endian binary layout.
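As a rough sketch of that suggestion, the header table from the question could be unpacked in one call. This assumes the dwords are unsigned and little-endian, which the thread does not confirm, so check it against the format docs. The name parse_header and the dictionary keys are made up for illustration.

```python
import struct

# Layout from the table in the question: a 28-byte string, then five dwords.
# '<' means little-endian (an assumption here; confirm it in the format docs),
# '28s' is a 28-byte bytes field, 'I' is an unsigned 32-bit integer.
HEADER = struct.Struct('<28s5I')


def parse_header(data):
    """Unpack the first 48 bytes into a dict. Field names are illustrative."""
    (magic, first_block_offset, compressed_size, version,
     decompressed_size, block_count) = HEADER.unpack(data[:HEADER.size])
    return {
        'magic': magic,
        'first_block_offset': first_block_offset,
        'compressed_size': compressed_size,
        'version': version,
        'decompressed_size': decompressed_size,
        'block_count': block_count,
    }


# The four bytes from the ValueError in the question, decoded as one dword.
raw = b'\xfby\x0b\x00'
value = struct.unpack('<I', raw)[0]         # 752123
same_value = int.from_bytes(raw, 'little')  # equivalent, without struct
hex_text = raw.hex()                        # 'fb790b00'
```

With the file opened in 'rb' mode as in the question, parse_header(replay.read(HEADER.size)) would return all six fields at once. On the "only three \x" question: every read still returns four bytes, but Python prints any byte that is a printable ASCII character as that character, so the byte 0x79 shows up as y in b'\xfby\x0b\x00'. int(..., 16) fails because it expects hex digits as text such as 'fb790b00', not raw bytes.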
