nadiam 0 Posting Pro in Training

Hello. I found this packet sniffer code that uses the pcapy wrapper. It was originally written for live capture, but I've changed it to read a .pcap file instead. I'm trying to understand it, but there are some parts I can't comprehend. Could someone explain them to me, please?

The full code is:

import socket
from struct import pack, unpack
import pcapy
import sys

def main(argv):
    dev = input("Enter file name to sniff : ")

    print("Sniffing file " + dev)

    # Read offline
    cap = pcapy.open_offline(dev)

    #start sniffing packets
    while(1) :
        (header, packet) = cap.next()
        parse_packet(packet)

# change to string format 00:00:00:00:00:00
def eth_addr (a) :
    b = "%.2s:%.2s:%.2s:%.2s:%.2s:%.2s" % (str(a[0]) , str(a[1]) , str(a[2]), str(a[3]), str(a[4]) , str(a[5]))
    return b

def parse_packet(packet) :

    # Ethernet header
    eth_length = 14

    eth_header = packet[:eth_length]
    eth = unpack('!6s6sH' , eth_header)
    eth_protocol = socket.ntohs(eth[2])
    with open("file.txt", "a") as file1:
        file1.write("Ethernet Header : \n")
        file1.write("Destination MAC : " + str(eth_addr(packet[0:6])) + "\n" + "Source MAC : " + str(eth_addr(packet[6:12])) + "\n" + "Protocol : " + str(eth_protocol) + "\n\n")

    #Parse IPv4 packets: EtherType 0x0800 reads as 8 after the ntohs byte swap on a little-endian host
    if eth_protocol == 8 :
        #Parse IP header
        #take the first 20 bytes for the ip header
        ip_header = packet[eth_length:20+eth_length]

        #now unpack 
        iph = unpack('!BBHHHBBH4s4s' , ip_header)

        version_ihl = iph[0]
        version = version_ihl >> 4
        ihl = version_ihl & 0xF

        iph_length = ihl * 4

        ttl = iph[5]
        protocol = iph[6]
        s_addr = socket.inet_ntoa(iph[8])
        d_addr = socket.inet_ntoa(iph[9])

        with open("file.txt", "a") as file2:
            file2.write("IP Header : \n")
            file2.write("Version : " + str(version) + "\n" + "IP Header Length : " + str(ihl) + "\n" + "TTL : " + str(ttl) + "\n" + "Protocol : " + str(protocol) + "\n" + "Source Address : " + str(s_addr) + "\n" + "Destination Address : " + str(d_addr) + "\n\n")

        #TCP protocol
        if protocol == 6 :
            t = iph_length + eth_length
            tcp_header = packet[t:t+20]

            #now unpack
            tcph = unpack('!HHLLBBHHH' , tcp_header)

            source_port = tcph[0]
            dest_port = tcph[1]
            sequence = tcph[2]
            acknowledgement = tcph[3]
            doff_reserved = tcph[4]
            tcph_length = doff_reserved >> 4

            with open("file.txt", "a") as file3:
                file3.write("TCP Header : \n")
                file3.write("Source Port : " + str(source_port) + "\n" + "Dest Port : " + str(dest_port) + "\n" + "Sequence Number : " + str(sequence) + "\n" + "Acknowledgement : " + str(acknowledgement) + "\n" + "TCP header length : " + str(tcph_length) + "\n\n")

            h_size = eth_length + iph_length + tcph_length * 4
            data_size = len(packet) - h_size

            #get data from the packet
            data = packet[h_size:]

            with open("file.txt", "a") as file4:
                file4.write("Data : " + str(data) + "\n\n")

if __name__ == "__main__":
    main(sys.argv)

The line eth = unpack('!6s6sH' , eth_header) gives an error: struct.error: unpack requires a buffer of 14 bytes. According to the Python docs, '!6s6sH' is the format string: ! means network (big-endian) byte order, 6s is a 6-byte string, and H is an unsigned 2-byte integer, so the line is supposed to unpack the header according to that format. I think I sort of understand that part, but I don't know why there is an error.
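One plausible cause, offered as an assumption rather than a confirmed diagnosis: struct.unpack needs a buffer of exactly the size the format describes (14 bytes for '!6s6sH'), and the while(1) loop never checks for the end of the file. Depending on the pcapy version, cap.next() may hand back an empty packet once the capture is exhausted, and slicing an empty buffer gives fewer than 14 bytes. A minimal sketch of a length check is shown here; unpack_eth_header is a hypothetical helper, not part of pcapy or the original script.

```python
from struct import calcsize, unpack

ETH_FORMAT = '!6s6sH'
ETH_LENGTH = calcsize(ETH_FORMAT)  # 6 + 6 + 2 = 14 bytes

def unpack_eth_header(packet):
    """Return (dst_mac, src_mac, ethertype), or None if the packet is too short.

    Hypothetical helper for illustration only.
    """
    if packet is None or len(packet) < ETH_LENGTH:
        return None
    return unpack(ETH_FORMAT, packet[:ETH_LENGTH])

# An empty buffer (as might arrive at end of file) is skipped instead of raising.
print(unpack_eth_header(b''))

# A made-up 14-byte frame header: two 6-byte addresses plus EtherType 0x0800.
frame = bytes(range(12)) + b'\x08\x00'
print(unpack_eth_header(frame))
```

In the reading loop, the same idea would mean stopping (or skipping the packet) when the helper returns None instead of calling parse_packet unconditionally.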

For the addresses in "Destination MAC : " + str(eth_addr(packet[0:6])) + "\n" + "Source MAC : " + str(eth_addr(packet[6:12])), I've found something weird in the results. Either address can come out as 25:25:25:25:25:25, or even as 0:1:2:3:4:5 and 6:7:8:9:10:11 respectively, so I really don't know what's going on here. As for the IP addresses, most of them displayed correct values, but I did find a few where the Destination Address was 255.255.255.255. Is it some kind of conversion problem, an encryption thing, or something else?
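Assuming the script runs under Python 3 (it uses input() and print() as functions), indexing a bytes object yields an integer, so str(a[0]) is a decimal string and "%.2s" keeps only its first two characters. Under that assumption, a broadcast byte 0xff becomes "255" and is cut to "25", while small bytes such as 0 to 11 print as plain decimals. The sketch below reproduces that effect and shows one way to format each byte as two hex digits instead; eth_addr_hex is a hypothetical name, not from the original code.

```python
def eth_addr_hex(a):
    """Format the first 6 bytes of `a` as a MAC address like ff:ff:ff:ff:ff:ff.

    Hypothetical replacement for eth_addr, assuming `a` is a Python 3 bytes object.
    """
    return ':'.join('%02x' % byte for byte in a[:6])

broadcast = b'\xff' * 6

# What the original format does to one byte: str(255) is '255', truncated to '25'.
print("%.2s" % str(broadcast[0]))

# Two hex digits per byte.
print(eth_addr_hex(broadcast))
print(eth_addr_hex(bytes(range(6))))
```

The address 255.255.255.255 is the IPv4 limited broadcast address, so seeing it as a destination is not in itself a sign of a conversion or encryption problem.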

Why are iph_length = ihl * 4 and h_size = eth_length + iph_length + tcph_length * 4 multiplied by 4? I'm also not sure whether the calculation is printing the right thing.
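The factor of 4 comes from the header formats themselves: the IPv4 IHL field and the TCP data offset field both count 32-bit words, not bytes, so each must be multiplied by 4 to get a byte length. The sketch below builds made-up IP and TCP headers (all field values are illustrative assumptions) and repeats the script's arithmetic on them.

```python
from struct import pack, unpack

eth_length = 14  # fixed Ethernet header size, as in the script

# Made-up IPv4 header: version 4, IHL 5 (0x45), TTL 64, protocol 6 (TCP).
ip_header = pack('!BBHHHBBH4s4s', 0x45, 0, 40, 0, 0, 64, 6, 0,
                 bytes([192, 168, 0, 1]), bytes([192, 168, 0, 2]))
iph = unpack('!BBHHHBBH4s4s', ip_header)
ihl = iph[0] & 0xF        # header length in 32-bit words
iph_length = ihl * 4      # header length in bytes

# Made-up TCP header: data offset 5 in the upper nibble (0x50).
tcp_header = pack('!HHLLBBHHH', 1234, 80, 0, 0, 0x50, 0x02, 8192, 0, 0)
tcph = unpack('!HHLLBBHHH', tcp_header)
tcph_length = tcph[4] >> 4  # header length in 32-bit words

# Offset of the payload from the start of the frame, in bytes.
h_size = eth_length + iph_length + tcph_length * 4

print(ihl, iph_length, tcph_length, h_size)
```

Note that the script writes ihl and tcph_length to the file as they are, so the "IP Header Length" and "TCP header length" lines show word counts (typically 5), not byte counts (typically 20).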