Dear Histrung,
Below is the full code. I replaced some of it as per your advice. One more thing to add: I found out that the packet received is in Ethernet format, so the IP packet starts only at offset 14. Could that be a cause here?

int i=0;
    int j=0,line=0,packSize=h->len;
    while(j<packSize/16)
    {
      printf("%06x: ",line++);
        for(i=0;i<16;i++)
        {
          printf("%02x ",p[j*16+1]);
        }
        printf("   |");
        for(i=0;i<16;i++)
        {
           if(isprint(p[j*16+1]))
             printf("%c",p[j*16+1]);
           else
             printf(".");
        }
        printf("|\n");
        j++;
    }

You just need to change the 1 to i in the array index. I also fixed the line count by printing 16*line. See if that fixes it.

int i=0;
    int j=0,line=0,packSize=h->len;
    while(j<packSize/16)
    {
      printf("%06x: ",16*line++);
        for(i=0;i<16;i++)
        {
          printf("%02x ",p[j*16+i]); // <-- The 1 to i
        }
        printf("   |");
        for(i=0;i<16;i++)
        {
           if(isprint(p[j*16+i]))    // <-- The 1 to i
             printf("%c",p[j*16+i]); // <-- The 1 to i
           else
             printf(".");
        }
        printf("|\n");
        j++;
    }

Dear Histrung,
Looks like it makes better sense now. Now I want to understand what this code does. First, why do you divide by 16 in the while loop, is it because of the hex values? Then what does the next line do?

The 16 is just the number of columns you want to display on a line. I picked 16 as the number of columns because you want it to be divisible by 2 (sizeof(short)), 4 (sizeof(int)) and 8 (sizeof(long long)). It might be more readable if I changed the code below to have 16 replaced by a variable named numColumnsToPrint.
See the comments in the code. There is also one more piece missing, see if you can fill in that part.

int i=0,
        numColumnsToPrint=16;  /* Number of columns you want to see on print out line */
    int j=0,line=0,packSize=h->len;

    /* Divide the packet size into chunks of numColumnsToPrint */
    while(j<packSize/numColumnsToPrint)
    {
        /* Want to be able to look at the print out and know what the index of a
           byte is.  When looking at a packet you often know that the value you
           want is n bytes into the packet.  Take the protocol byte in the IPv4
           header: it is at offset 9 from the start of that header.  By printing
           the starting offset of each line (a multiple of the column count) you
           can quickly find that byte and see its value.
        */
        printf("%06x: ",numColumnsToPrint*line++);
        
        /* This loop prints the number of columns you specified above in hex format */
        for(i=0;i<numColumnsToPrint;i++)
        {
          /* Print hex value as 2 chars, zero-padded */
          printf("%02x ",p[j*numColumnsToPrint+i]); 
        }
        /* Print a separator between the hex values and ASCII values */ 
        printf("   |");
        /* This loop prints the number of columns you specified above in ASCII format */
        for(i=0;i<numColumnsToPrint;i++)
        {
           /* If the char is printable, print it */
           if(isprint(p[j*numColumnsToPrint+i]))    
             printf("%c",p[j*numColumnsToPrint+i]); 
           else
             /* else char is not a printable so print a '.', could be whatever you want
                I just like '.' */
             printf(".");
        }
        /* Print an end separator */ 
        printf("|\n");
        /* Increment the number of lines printed */
        j++;
    }
    /*         ----> Missing piece <-----
       You still need to add a section here that prints whatever is left over.  As 
       an example, if the packet size was 15 the while loop would not execute because
       the statement (j < packSize/16) == (0 < 15/16) == (0 < 0) == false.  The
       reason is integer division.  So, what you need to do is check to see 
       if (packSize % numColumnsToPrint) != 0.  If that is true then there is still 
       data left to print. 
    */

Dear Histrung,
So the one doing the core job is isprint, which converts the hex into a normal char, am I right? I don't follow the extra part where you said the packet size is 15. Correct me here; I am using this link to understand: http://en.wikipedia.org/wiki/Ethernet_frame. So the first 14 packets is the mac address. In this case I do not see it get printed when I run the for loop. I want to understand the packet size better.

The data in the packet is just that, data. When trying to examine it we want to look at it as hex values using printf("%02x",p[j*numColumnsToPrint+i]); and as ASCII values using printf("%c",p[j*numColumnsToPrint+i]); . The isprint is just a function used to see if the char is a printable one. We need to check this so we are not printing things like '\t' (tab), '\n' (newline) and others. As for the last comment in the code, an example would be best. In the code below I set the packet size to 15. When you compile and run it, nothing will print out. If you then change the packet size to 31, recompile and run, you will only see 16 bytes printed out. That is because 31/16 = 1, so the while loop only executes one time, leaving the other 15 bytes unprinted. The same thing happens for packet size = 47: only 32 bytes are printed. Compile and run the code below and see how it is not printing the whole packet. This is easy to fix, see if you can figure it out.

#include <stdio.h>
#include <ctype.h>
void printPacket(unsigned char *p,int len)
{
   int i,j=0,line=0,colToPrint=16;
   while (j<(len/colToPrint)){
      printf("%06x: ",colToPrint*line++);
      for ( i=0; i<colToPrint; i++){
         printf("%02x ",p[j*colToPrint+i]);
      }
      printf("  |");
      for ( i=0; i<colToPrint; i++){
        if ( isprint(p[j*colToPrint+i]))
          printf("%c",p[j*colToPrint+i]);
        else
          printf(".");
      }
      printf("|\n");
      j++;
   }
}
int main(){
   unsigned char x[256];
   int i;
   for ( i=0; i<256; i++){
      x[i] = (unsigned char)i;
   }
   printf("Packet size = 15\n");
   printPacket(x,15);      /* Nothing prints out, should be 15 */
   printf("Packet size = 31\n");
   printPacket(x,31);      /* Only 16 print out, should be 31 */
   printf("Packet size = 47\n");
   printPacket(x,47);      /* Only 32 print out, should be 47 */
   return 0;
}
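
One possible way to fill in the missing piece, shown only as a sketch: keep the loop over full rows, then use the len % colToPrint check to print whatever is left over. The names printRow and printPacketFull are made up for this example, and it assumes len is not negative. The short last row is padded with spaces so the ASCII column still lines up.

```c
#include <stdio.h>
#include <ctype.h>

/* Hypothetical helper: print one row. count is the number of real bytes
   in this row; it is smaller than colToPrint only on the last row. */
static void printRow(const unsigned char *row, int offset, int count, int colToPrint)
{
   int i;
   printf("%06x: ", offset);
   for (i = 0; i < colToPrint; i++) {
      if (i < count)
         printf("%02x ", row[i]);
      else
         printf("   ");               /* pad a short row with blanks */
   }
   printf("  |");
   for (i = 0; i < count; i++)
      putchar(isprint(row[i]) ? row[i] : '.');
   printf("|\n");
}

/* Prints the whole packet and returns the number of rows printed. */
int printPacketFull(const unsigned char *p, int len, int colToPrint)
{
   int j, rows = 0;
   int fullRows = len / colToPrint;   /* integer divide: complete rows only */
   int leftover = len % colToPrint;   /* bytes that do not fill a row */

   for (j = 0; j < fullRows; j++) {
      printRow(p + j * colToPrint, j * colToPrint, colToPrint, colToPrint);
      rows++;
   }
   if (leftover != 0) {               /* the missing piece */
      printRow(p + fullRows * colToPrint, fullRows * colToPrint, leftover, colToPrint);
      rows++;
   }
   return rows;
}
```

Calling printPacketFull(x,15,16) in place of printPacket(x,15) prints one short row instead of nothing.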

So the first 14 packets is the mac address

Remember they are just bytes, not packets. The first 14 bytes are the two MAC addresses (12 bytes) and the protocol (2 bytes). And that is all we are doing: looking at bytes. In one form they are shown as hex numbers and in the other as printable chars where possible (see the table below for hex value to ASCII value). Back to isprint: it is just checking whether the value is in the range 0x20 - 0x7e. Those are the printable chars.
So, '$' is 0x24 and 'A' is 0x41 and 'B' is 0x42 and 'z' is 0x7a and '~' is 0x7e and so on.

The hexadecimal set:

     00 nul   01 soh   02 stx   03 etx   04 eot   05 enq   06 ack   07 bel
     08 bs    09 ht    0a nl    0b vt    0c np    0d cr    0e so    0f si
     10 dle   11 dc1   12 dc2   13 dc3   14 dc4   15 nak   16 syn   17 etb
     18 can   19 em    1a sub   1b esc   1c fs    1d gs    1e rs    1f us
     20 sp    21  !    22  "    23  #    24  $    25  %    26  &    27  '
     28  (    29  )    2a  *    2b  +    2c  ,    2d  -    2e  .    2f  /
     30  0    31  1    32  2    33  3    34  4    35  5    36  6    37  7
     38  8    39  9    3a  :    3b  ;    3c  <    3d  =    3e  >    3f  ?
     40  @    41  A    42  B    43  C    44  D    45  E    46  F    47  G
     48  H    49  I    4a  J    4b  K    4c  L    4d  M    4e  N    4f  O
     50  P    51  Q    52  R    53  S    54  T    55  U    56  V    57  W
     58  X    59  Y    5a  Z    5b  [    5c  \    5d  ]    5e  ^    5f  _
     60  `    61  a    62  b    63  c    64  d    65  e    66  f    67  g
     68  h    69  i    6a  j    6b  k    6c  l    6d  m    6e  n    6f  o
     70  p    71  q    72  r    73  s    74  t    75  u    76  v    77  w
     78  x    79  y    7a  z    7b  {    7c  |    7d  }    7e  ~    7f del
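
To tie the table back to the code, here is a small sketch of the decision made for each byte in the ASCII column. The helper name displayChar is invented for this example, and it assumes the default "C" locale, where isprint is true only for 0x20 through 0x7e.

```c
#include <ctype.h>

/* Hypothetical helper: the character shown in the ASCII column for one byte.
   Printable bytes (0x20 - 0x7e in the "C" locale) are shown as themselves,
   everything else (control chars, del, values above 0x7e) becomes '.'. */
char displayChar(unsigned char b)
{
   return isprint(b) ? (char)b : '.';
}
```

So displayChar(0x41) gives 'A', while displayChar(0x09) (ht in the table) gives '.'.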

Dear Histrung,
Let me fix the fundamentals first. Each p[j*colToPrint+i] is one byte, right, because each hex digit is represented as 4 bits? So the first 14 will be the MAC and protocol. So for data I should only look 14, but then from 14 onwards the IP header comes first. So that will take another 20 bytes, right? So pure payload should be 34 onwards, am I right?

is one byte right? Each hex (digit) is 4 bits?

Yes and Yes.

first 14 will be the mac(s) and protocol

Yes, 6 bytes for source mac and 6 bytes for the destination mac and 2 bytes for the protocol.

So for data I should only look 14

If you only want to see the MACs and protocol, yes.

ip header is 20 bytes

Yes, if it is IPv4 with no optional fields (the lower nibble of the first byte of the IPv4 header tells you the header length in words (4 bytes)).
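
As a rough sketch of that nibble arithmetic: the helper names ipVersion and ipHeaderBytes are made up for illustration, and firstByte is assumed to be the byte at offset 14 of an Ethernet frame.

```c
/* Hypothetical helpers: split the first IPv4 header byte into its two nibbles. */

/* Upper nibble: the IP version (4 for IPv4) */
unsigned int ipVersion(unsigned char firstByte)
{
   return firstByte >> 4;
}

/* Lower nibble: header length in 4-byte words, returned here in bytes */
unsigned int ipHeaderBytes(unsigned char firstByte)
{
   return (firstByte & 0x0f) * 4u;
}
```

For the usual first byte 0x45 this gives version 4 and a 20-byte header; 0x46 would mean 24 bytes, that is, one word of options.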

pure payload should be 34 onwards am I right?

No, not really. You still have the header of the next protocol, such as TCP or UDP or whatever is named in the protocol field of the IPv4 header (offset 9, that is, the tenth byte of the header). Only once you get past that header do you reach the "raw data".

Dear Histrung,
So which index should be the "raw data"? It also changes according to the type of protocol, right?

This is how you would find the "raw data" of the packet. I think I got my values correct.
Let us start with some assumptions.

  1. The data link layer (layer 2) is 802.3 Ethernet http://en.wikipedia.org/wiki/Ethernet_frame
  2. There are no optional fields in the 802.3 Ethernet frame structure
  3. The Ethertype is 0x8000 which is IP http://en.wikipedia.org/wiki/Ethertype
  4. The "raw data" is the payload of the protocol contained in the IPv4 header
  5. The IP version is 4
  6. The protocol under the IP is TCP or UDP
  7. When I use the term packet it is the whole Ethernet frame

With those things true, here are the steps to get to the "raw data"

  1. Look at byte 14 (call this number ipStartOffset) of the packet, it should be 0x45
    1. The IP version is the upper nibble (4-bits) of that byte. Should be 4 for this example
    2. The lower nibble (4-bits) is the length of the IP header in increments of 4 bytes (call that ipHeaderLen). It will be 5 most of the time.
  2. Look at byte 23 (ipStartOffset + ipProtoOffset (9)) to see what protocol follows the IPv4 header http://en.wikipedia.org/wiki/List_of_IP_protocol_numbers
    1. For TCP it will be 0x06
    2. For UDP it will be 0x11
  3. TCP or UDP?
    1. TCP
      1. Go to byte 46 (ipStartOffset+ipHeaderLen*4+tcpHeadLenPos (12), with ipHeaderLen = 5) and look at the upper nibble (4-bits), which is the length of the TCP header in increments of 4 bytes (call this tcpHeaderLen)
      2. "Raw data" starts at byte 54 (ipStartOffset+ipHeaderLen*4+tcpHeaderLen*4) when both ipHeaderLen and tcpHeaderLen are 5
    2. UDP, I will leave for you to explore

Explore my offsets with your packets.
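
The TCP steps above can be sketched in C roughly as follows. The function name tcpPayloadOffset and the enum names are invented for this example; it assumes the buffer starts at the destination MAC (no preamble, no VLAN tag), and it reads the IPv4 protocol field at offset 9 of the IP header, counting from 0.

```c
enum {
   ETH_HEADER_LEN   = 14,   /* ipStartOffset: two MACs (12) + Ethertype (2) */
   IP_PROTO_OFFSET  = 9,    /* protocol byte, counted from 0 in the IPv4 header */
   TCP_DATA_OFF_POS = 12,   /* tcpHeadLenPos */
   PROTO_TCP        = 0x06
};

/* Returns the offset of the TCP "raw data" within the packet, or -1 if the
   packet is not Ethernet/IPv4/TCP or is too short to hold the headers. */
int tcpPayloadOffset(const unsigned char *p, int len)
{
   int ipHeaderLen, tcpStart, tcpHeaderLen, payload;

   if (len < ETH_HEADER_LEN + 20) return -1;          /* no room for a minimal IPv4 header */
   if (p[12] != 0x08 || p[13] != 0x00) return -1;     /* Ethertype must be 0x0800 (IPv4) */
   if ((p[ETH_HEADER_LEN] >> 4) != 4) return -1;      /* upper nibble: IP version */

   ipHeaderLen = (p[ETH_HEADER_LEN] & 0x0f) * 4;      /* lower nibble: words -> bytes */
   if (ipHeaderLen < 20) return -1;
   if (p[ETH_HEADER_LEN + IP_PROTO_OFFSET] != PROTO_TCP) return -1;

   tcpStart = ETH_HEADER_LEN + ipHeaderLen;
   if (len < tcpStart + 20) return -1;                /* no room for a minimal TCP header */

   tcpHeaderLen = (p[tcpStart + TCP_DATA_OFF_POS] >> 4) * 4;   /* upper nibble: words -> bytes */
   if (tcpHeaderLen < 20) return -1;

   payload = tcpStart + tcpHeaderLen;
   return payload <= len ? payload : -1;
}
```

With no IP or TCP options the result is 14 + 20 + 20 = 54. Reading the two length nibbles, rather than hard-coding 54, keeps the offset right when options are present.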

Dear Histrung,
First, let me thank you for the in-depth explanation. About byte 54, correct my explanation if it is wrong: first we look into the Ethernet frame's payload, inside it is the IPv4 packet, and inside the IPv4 payload is the TCP packet. That is how you derive right. What confuses me is the link between data link layer (layer 2) is 802.3 Ethernet and Ethertype? What does the Ethertype represent? Does each type have a different frame configuration? I ask because I look at the preamble of both does not tally. Another thing to confirm with you: tcpHeadLenPos (12) and look at the upper nibble (4-bits), this is part of the Sequence Number section of the TCP right?

That is how you derive right.

Yes.

link between data link layer (layer 2) is 802.3 Ethernet and Ethertype?

The Ethertype for 802.3 tells you what is inside the Ethernet packet. In this case the value of the Ethertype is going to be 0x0800, which tells you that inside the Ethernet packet is IPv4. See http://en.wikipedia.org/wiki/Ethertype for all the different Ethertype values.
Note: My first post about how to extract I made a typo I said Ethertype of 0x8000, it should have been 0x0800.
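
A small sketch of reading that field, assuming p points at the MAC destination (no preamble and no VLAN tag in front of the Ethertype). The helper names etherType and isIPv4Frame are made up for this example.

```c
/* Hypothetical helper: read the 2-byte Ethertype at offsets 12 and 13.
   The field is in network byte order, so the high byte comes first. */
unsigned int etherType(const unsigned char *p)
{
   return ((unsigned int)p[12] << 8) | p[13];
}

/* Returns 1 when the frame carries IPv4 (Ethertype 0x0800), else 0 */
int isIPv4Frame(const unsigned char *p)
{
   return etherType(p) == 0x0800;
}
```

An ARP frame, for comparison, has 0x08 0x06 in those two bytes, so etherType returns 0x0806 and the IPv4 offsets do not apply to it.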

because I look at the preamble of both does not tally

What is the start of your packet? Is it MAC destination or can you see the preamble and start of frame delimiter?
If it is MAC destination, then the offsets are correct. If it is the preamble, then we would need to add the length of the preamble to the offsets.

this is part of the Sequence Number section of the TCP right?

No, from the start of the TCP header the sequence number is at offset 4. The header length (the Data Offset field) is the upper nibble of the byte at offset 12. See section 'TCP segment structure' on page http://en.wikipedia.org/wiki/Transmission_Control_Protocol

Dear Histrung,
OK, the rest looks pretty much clear. Only one thing I need to confirm about the TCP header length: are you looking at the Data Offset? But that only has 3 bits?

It is 4 bits, using the picture from the link.

byte bits  0123 4567 89012345
  12   96 |****|Resv|........|

The **** above is the Data Offset field shown on the TCP page.

Dear Histrung,
OK, sorry, I got it already. So this tells us the TCP header length, right? And based on that you arrive at the "raw data".

Yes, that is the length of the TCP header in multiples of words (4 bytes). So the length in bytes of the TCP header is 4*(Data Offset).
Yes, once you add all of the offsets and header lengths you would get to the "raw data" in the packet.

You should download wireshark. http://www.wireshark.org/
Use it to capture and analyze packets. It makes it easy to see where all of the different parts of a packet are.
It will help you understand all of the different packet structures.

Dear Histrung,
Yes, I think I will have to do that to get the details. I do not know how to run Wireshark on Linux yet; I will work it out and update you on that too.

Dear Histrung,
I think I will try running Wireshark first. I have full access and it is just a test machine, so that is not a problem at all.

How Wireshark displays the information is done very well. I think it will help you greatly.

Dear Histrung,
I have done some simple analysis comparing Wireshark with our for loop. Some of the data interpretation is exactly the same as Wireshark's, but in other places some symbols in our for loop output differ from the Wireshark output.

For the differences, check what the protocol is in the Ethertype and in the IPv4 protocol field. The example I explained was for TCP only. You might be seeing UDP, ARP, ICMP, etc.

Dear Histrung,
You are quite right, the difference is mostly for UDP packets, which Wireshark shows as DNS. So what should be done to rectify this?

If the protocol in the IPv4 header is UDP, the UDP header is always 8 bytes long (see http://en.wikipedia.org/wiki/User_Datagram_Protocol). The 2-byte length field at position ipStartOffset+ipHeaderLen*4+udpLenPos, where udpLenPos=4, holds the length of the UDP header plus its data, not the length of the header alone. We will call the value you read from that offset udpLen.
Then the "raw data" starts at offset ipStartOffset+ipHeaderLen*4+8 and is udpLen-8 bytes long.
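
A rough sketch of the UDP case, with invented names (udpPayloadOffset, udpPayloadLength) and the same assumptions as before: the buffer starts at the MAC destination and the frame is plain Ethernet carrying IPv4. It relies on the UDP header being a fixed 8 bytes, with the 2-byte length field at offset 4 of that header covering header plus data.

```c
enum {
   UDP_ETH_HEADER_LEN = 14,   /* ipStartOffset */
   UDP_IP_PROTO_POS   = 9,    /* protocol byte, counted from 0 in the IPv4 header */
   UDP_HEADER_LEN     = 8,    /* the UDP header has a fixed size */
   UDP_LEN_POS        = 4,    /* 2-byte length field: header + data */
   PROTO_UDP          = 0x11
};

/* Returns the offset of the UDP "raw data", or -1 if the packet is not
   Ethernet/IPv4/UDP or is too short to hold the headers. */
int udpPayloadOffset(const unsigned char *p, int len)
{
   int ipHeaderLen, udpStart;

   if (len < UDP_ETH_HEADER_LEN + 20) return -1;
   if (p[12] != 0x08 || p[13] != 0x00) return -1;          /* Ethertype 0x0800 */
   if ((p[UDP_ETH_HEADER_LEN] >> 4) != 4) return -1;       /* IP version */

   ipHeaderLen = (p[UDP_ETH_HEADER_LEN] & 0x0f) * 4;
   if (ipHeaderLen < 20) return -1;
   if (p[UDP_ETH_HEADER_LEN + UDP_IP_PROTO_POS] != PROTO_UDP) return -1;

   udpStart = UDP_ETH_HEADER_LEN + ipHeaderLen;
   if (len < udpStart + UDP_HEADER_LEN) return -1;
   return udpStart + UDP_HEADER_LEN;
}

/* Returns the number of data bytes according to the UDP length field,
   or -1 if the field is smaller than the header itself.
   udpStart is the offset of the first byte of the UDP header. */
int udpPayloadLength(const unsigned char *p, int udpStart)
{
   int udpLen = (p[udpStart + UDP_LEN_POS] << 8) | p[udpStart + UDP_LEN_POS + 1];
   return udpLen >= UDP_HEADER_LEN ? udpLen - UDP_HEADER_LEN : -1;
}
```

With a 20-byte IP header the UDP data starts at 14 + 20 + 8 = 42. For a DNS packet, those bytes are the DNS message that Wireshark decodes for you.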

Dear Histrung,
Give me some time. I am doing more tests to see exactly where the difference comes from and will update you.
