Hi, I wasn't sure about the title of my question; I hope it's okay. Anyway, the program I have is supposed to read a binary file and then search for specific bytes, like this:

from datetime import date

today = date.today()  # assumed definition; not shown in the original post

bytes1 = bytearray(b'\x41\x64\x6F\x62\x65\x20')  # the ASCII text "Adobe "
filename = "portrait1.dng"
with open(filename, "rb") as binaryfile:
    # write the report header once
    with open("foundhex.txt", "w") as found:
        found.write("File that is analysed : " + filename + "\n")
        found.write("Date of analysis : " + str(today) + "\n")
    while True:
        # read the next 1024-byte chunk and search it from its start
        read_file = bytearray(binaryfile.read(1024))
        find_bytes1 = read_file.find(bytes1, 0)
        if find_bytes1 != -1:
            with open("foundhex.txt", "a") as found1:
                found1.write("Found 41646F626520 at : " + str(find_bytes1) + "\n")
        if not read_file:
            break

Basically, it finds the bytes and then writes their positions. I checked the file being read with a hex editor, and the bytes I am looking for (bytes1) occur 12 times, but only 9 occurrences are "found". So now I'm confused. Is my program not reading the entire file, and is that why only 9 are found? Or is there something wrong with my code?

Right now I'm only using a 16.2 MB file, but later on I'll be using an 8 GB file. Does file size make a difference when reading in chunks? I ask because I tried changing the read size to random numbers and found that read(901) found 11 occurrences instead of 9. I haven't hit 12 yet, though. Please, could someone explain this to me? Thank you in advance.

Last Post by nadiam

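The statement if not read_file: is true only when read() returns no bytes, that is, at the end of the file. Because your loop tests it after searching, the empty final chunk is searched before the loop exits. Test the chunk before searching it, and open the output file once instead of on every match. Use instead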

      with open("foundhex.txt", "a") as found1:
          while True:
              read_file = bytearray(binaryfile.read(1024))
              if len(read_file):
                  # search only chunks that actually contain data
                  find_bytes1 = read_file.find(bytes1, 0)
                  if find_bytes1 != -1:
                      found1.write("Found 41646F626520 at : " + str(find_bytes1) + "\n")
              else:
                  # an empty read means end of file
                  break

Also, the statement find_bytes1 = read_file.find(bytes1, 0) starts at the beginning of the chunk every time, so you find only the first occurrence in each chunk and none of the later ones. The position it returns is also relative to the chunk, not to the file. Finally, for the statement read_file = bytearray(binaryfile.read(1024)), what happens if half of bytes1 is in one read and half is in the next read?
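
One possible way to handle these points is sketched below. It is only an illustration, not the original poster's code: the function name find_all_in_chunks and its parameters are made up for this example. It keeps the last len(pattern) - 1 bytes of each buffer and prepends them to the next chunk, so a match split across two reads is still seen, it searches each buffer repeatedly rather than stopping at the first hit, and it reports offsets relative to the start of the file.

```python
import io


def find_all_in_chunks(fileobj, pattern, chunk_size=1024):
    """Return the absolute offset of every occurrence of pattern in fileobj.

    Hypothetical helper for illustration; fileobj must be opened in binary mode.
    """
    pattern = bytes(pattern)
    if not pattern:
        raise ValueError("pattern must not be empty")
    overlap = len(pattern) - 1   # bytes carried over between reads
    offsets = []
    tail = b""                   # end of the previous buffer
    base = 0                     # absolute file offset of buffer[0]
    while True:
        chunk = fileobj.read(chunk_size)
        if not chunk:            # empty read: end of file
            break
        buffer = tail + chunk
        # collect every match in this buffer, not just the first one
        pos = buffer.find(pattern)
        while pos != -1:
            offsets.append(base + pos)
            pos = buffer.find(pattern, pos + 1)
        # the tail is shorter than the pattern, so it can never hold a
        # complete match and no offset is reported twice
        tail = buffer[-overlap:] if overlap else b""
        base += len(buffer) - len(tail)
    return offsets


if __name__ == "__main__":
    # in-memory stand-in for a real file; the second match straddles byte 1024
    sample = b"Adobe " + b"x" * 1015 + b"Adobe " + b"y" * 10 + b"Adobe "
    print(find_all_in_chunks(io.BytesIO(sample), b"Adobe "))
```

With a real file this would be called as find_all_in_chunks(binaryfile, bytes1) inside the with open(filename, "rb") block. The result should not depend on the chunk size, which is a quick way to check it against the hex editor count.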

Edited by woooee

Thank you for your reply, woooee. I've ended up with this code, using re.finditer:

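import re

# bytes1 and filename are defined as in my first post.
# re needs a bytes (not bytearray) pattern; escape it so it is matched literally.
pattern = re.compile(re.escape(bytes(bytes1)))

with open(filename, "rb") as binaryfile, open("foundhex.txt", "a") as found1:
    # read() with no size loads the whole file into memory in one go
    read_file = binaryfile.read()
    for find_bytes in pattern.finditer(read_file):
        found1.write("Found bytes at : " + str(find_bytes.start()) + " " + str(find_bytes.end()) + "\n")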

It finds all occurrences of bytes1.
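
Since read() with no size pulls the whole file into memory, this may become a problem with the 8 GB file mentioned in the first post. One possible alternative, sketched here under assumptions, is to memory-map the file and search the mapping directly. The function name find_all_mmap is hypothetical, and mapping a file that large generally needs a 64-bit Python build.

```python
import mmap
import os
import tempfile


def find_all_mmap(path, pattern):
    """Return the offset of every occurrence of pattern in the file at path.

    Hypothetical helper for illustration only.
    """
    pattern = bytes(pattern)
    if not pattern:
        raise ValueError("pattern must not be empty")
    if os.path.getsize(path) == 0:
        return []                # an empty file cannot be memory-mapped
    offsets = []
    with open(path, "rb") as f:
        # length 0 maps the whole file; the OS pages data in on demand
        with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as mm:
            pos = mm.find(pattern)
            while pos != -1:
                offsets.append(pos)
                pos = mm.find(pattern, pos + 1)
    return offsets


if __name__ == "__main__":
    # small throwaway file standing in for the real image
    with tempfile.TemporaryDirectory() as tmpdir:
        sample_path = os.path.join(tmpdir, "sample.bin")
        with open(sample_path, "wb") as fh:
            fh.write(b"Adobe xxAdobe ")
        print(find_all_mmap(sample_path, b"Adobe "))
```

Unlike re.finditer, this loop also reports overlapping matches, which makes no difference for a pattern such as bytes1 that cannot overlap itself.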
