I have a char all[1500]; which will store each network packet. I will convert the hex values into human-readable text. Finally, I need to extract the URL. For example, I have this: GET /mail/ HTTP/1.1\r\n. I know I can first check whether GET and HTTP/1.1 are present using the code below. But I am stuck on how to extract the path, which is "/mail/" in the given example.

char *getPointer = strstr(all, "GET");
char *httpPointer = strstr(all, "HTTP/1.");
if (getPointer != NULL && httpPointer != NULL)

The ideal would be to properly parse the HTTP protocol format, but for this specific case it's straightforward to extract the address (error handling omitted for brevity):

#include <stdio.h>
#include <string.h>

int main(void)
{
    const char *request = "GET /mail/ HTTP/1.1\r\n";
    const char *paddr_begin = request  + 4;
    const char *paddr_end = strstr(paddr_begin, "HTTP/1.") - 1;
    char addr[BUFSIZ];

    addr[0] = '\0';
    strncat(addr, paddr_begin, paddr_end - paddr_begin);

    printf(">%s<\n", addr);

    return 0;
}
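For reference, here is a rough sketch of the same extraction with the omitted checks put back in. The helper name extract_uri and its return convention are made up for illustration, and it assumes the request line is already a NUL-terminated string that starts with "GET ":

```c
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper (not from the thread): copies the address from a
 * NUL-terminated request line such as "GET /mail/ HTTP/1.1\r\n" into out.
 * Returns 0 on success, -1 if the line does not have the expected shape
 * or the address does not fit in out.
 */
int extract_uri(const char *request, char *out, size_t out_size)
{
    static const char method[] = "GET ";
    const size_t method_len = sizeof method - 1;
    const char *begin, *end;
    size_t len;

    if (request == NULL || out == NULL || out_size == 0)
        return -1;
    if (strncmp(request, method, method_len) != 0)
        return -1;                     /* not a GET request line */

    begin = request + method_len;
    end = strstr(begin, " HTTP/1.");   /* the space before the version */
    if (end == NULL || end == begin)
        return -1;                     /* no version, or empty address */

    len = (size_t)(end - begin);
    if (len >= out_size)
        return -1;                     /* would overflow the caller's buffer */

    memcpy(out, begin, len);
    out[len] = '\0';
    return 0;
}
```

Called as extract_uri(request, addr, sizeof addr), it fills addr only when it returns 0, so the caller can reject anything that is not a well-formed GET line instead of reading through a null pointer.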
Dear Deceptikon,
               Any hint on how to properly parse the HTTP protocol? First you look for port 80 to decide whether it's HTTP, right? What type of error handling must I put in? What buffer size do you suggest just to hold the URL? Will 256 be sufficient?

Any hint on how to properly parse the HTTP protocol?

Study the HTTP protocol defined in RFC 2616, and write code that meets all of the requirements. Parsing can be a deep topic, but depending on your needs it can be relatively simple. For example, here's a basic parser for the request line:

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define length(array) (sizeof (array) / sizeof *(array))

typedef struct request_line {
    char *method;
    char *uri;
    char *version;
} request_line;

char **split(const char *s, const char *delim, size_t *n)
{
    char *copy = _strdup(s), *tok;
    char **result = NULL;

    *n = 0;

    for (tok = strtok(copy, delim); tok != NULL; tok = strtok(NULL, delim)) {
        result = realloc(result, ++(*n) * sizeof *result);
        result[*n - 1] = _strdup(tok);
    }

    return result;
}

request_line parse_request_line(const char *s)
{
    request_line result = {"", "", ""};

    const char *methods[] = {
        "OPTIONS", "GET", "HEAD", "POST", "PUT", "DELETE", "TRACE", "CONNECT"
    };
    size_t n = 0, i;
    // Split on spaces; CR and LF are delimiters too, so the version has no trailing "\r\n"
    char **parts = split(s, " \r\n", &n);

    if (n != 3) {
        // Invalid format, missing one or more parts
        return result;
    }

    for (i = 0; i < length(methods); i++) {
        if (strcmp(_strupr(parts[0]), methods[i]) == 0) {
            break;
        }
    }

    if (i == length(methods)) {
        // Unrecognized method
        return result;
    }

    // More validation can be added here

    result.method = parts[0];
    result.uri = parts[1];
    result.version = parts[2];

    return result;
}

int main(void)
{
    request_line request = parse_request_line("GET /mail/ HTTP/1.1\r\n");

    printf("HTTP Version: %s\n", request.version);
    printf("Method: %s\n", request.method);
    printf("URI: %s\n", request.uri);

    return 0;
}

Once again, error handling has been omitted. I also didn't add the release of allocated memory or best practice for realloc to keep the example short.

What type of error handling must I put in?

Just your run-of-the-mill error handling: length checks, null pointer checks, checking that memory allocation succeeds, releasing memory when you're done with it, etc.
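To make those checks concrete, here is one possible sketch of a splitting function with the error handling added. The names split_checked, free_parts and dup_string are invented for this illustration, and dup_string is only a portable stand-in for the non-standard _strdup:

```c
#include <stdlib.h>
#include <string.h>

/* Portable stand-in for _strdup; returns NULL if allocation fails. */
static char *dup_string(const char *s)
{
    size_t size = strlen(s) + 1;
    char *copy = malloc(size);

    if (copy != NULL)
        memcpy(copy, s, size);
    return copy;
}

/* Releases an array produced by split_checked. Safe to call with NULL. */
void free_parts(char **parts, size_t n)
{
    size_t i;

    if (parts == NULL)
        return;
    for (i = 0; i < n; i++)
        free(parts[i]);
    free(parts);
}

/*
 * Splits s on any character in delim. Returns 0 and stores the tokens in
 * *out and their count in *n, or returns -1 on bad arguments or allocation
 * failure. *out is NULL when there are no tokens.
 */
int split_checked(const char *s, const char *delim, char ***out, size_t *n)
{
    char *copy, *tok;
    char **result = NULL;
    size_t count = 0;

    if (s == NULL || delim == NULL || out == NULL || n == NULL)
        return -1;
    *out = NULL;
    *n = 0;

    copy = dup_string(s);              /* strtok modifies its input */
    if (copy == NULL)
        return -1;

    for (tok = strtok(copy, delim); tok != NULL; tok = strtok(NULL, delim)) {
        /* Use a temporary so the old block is not leaked if realloc fails. */
        char **grown = realloc(result, (count + 1) * sizeof *grown);
        if (grown == NULL)
            goto fail;
        result = grown;

        result[count] = dup_string(tok);
        if (result[count] == NULL)
            goto fail;
        count++;
    }

    free(copy);
    *out = result;
    *n = count;
    return 0;

fail:
    free_parts(result, count);
    free(copy);
    return -1;
}
```

The caller checks the return value before touching the tokens and hands the array to free_parts once it is finished with them.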

What buffer size do you suggest just to hold the URL? Will 256 be sufficient?

The RFC doesn't specify a maximum length, so you'd be wise not to depend on any fixed size. Dynamically allocate enough space to hold the request line and the above code will handle it safely.
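As a rough illustration of sizing the buffer to the data rather than picking 256 up front, the following sketch allocates exactly as much as the address needs. The function name copy_request_target is hypothetical, and it assumes the parts of the request line are separated by single spaces:

```c
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical helper: returns a heap-allocated copy of the text between
 * the first and second space of a NUL-terminated request line, or NULL if
 * the line is malformed or allocation fails. The caller must free() it.
 */
char *copy_request_target(const char *line)
{
    const char *begin, *end;
    size_t len;
    char *uri;

    if (line == NULL)
        return NULL;

    begin = strchr(line, ' ');         /* end of the method */
    if (begin == NULL)
        return NULL;
    begin++;

    end = strchr(begin, ' ');          /* start of the version */
    if (end == NULL || end == begin)
        return NULL;

    len = (size_t)(end - begin);
    uri = malloc(len + 1);             /* as large as needed, plus the NUL */
    if (uri == NULL)
        return NULL;

    memcpy(uri, begin, len);
    uri[len] = '\0';
    return uri;
}
```

With this approach a long address costs more memory but is never truncated or written past the end of a fixed array.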

@newbie14 Is this related to our dialog from a few months ago?

Are you continuing from where you left off or starting an entirely new direction?

Dear Deceptikon,
                I think the example you gave me is quite similar to mine; it's just that you accept the full list of methods, not purely GET and POST. I will study it further. Thank you.
Dear L7Sqr,
           Yes, it is somewhat similar, but it's just a very small script to settle a small portion of another problem.

I think the example you gave me is quite similar to mine; it's just that you accept the full list of methods, not purely GET and POST.

Of course it's similar because it's doing roughly the same thing. The difference is that instead of just taking a single request line and parsing it directly, you're programming against a standard document with semantic validation as well as syntactic validation.
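One way to picture the two kinds of validation is the sketch below. The function names are invented for this illustration. The version check is purely syntactic and assumes single-digit major and minor numbers; the method check is semantic, against the methods RFC 2616 defines, and compares exactly because that RFC treats the method as case-sensitive:

```c
#include <ctype.h>
#include <string.h>

/* Syntactic check: does version look exactly like "HTTP/<digit>.<digit>"? */
int version_is_well_formed(const char *version)
{
    return version != NULL
        && strncmp(version, "HTTP/", 5) == 0
        && isdigit((unsigned char)version[5])
        && version[6] == '.'
        && isdigit((unsigned char)version[7])
        && version[8] == '\0';
}

/* Semantic check: is method one of the methods defined in RFC 2616? */
int method_is_known(const char *method)
{
    static const char *const methods[] = {
        "OPTIONS", "GET", "HEAD", "POST", "PUT", "DELETE", "TRACE", "CONNECT"
    };
    size_t i;

    if (method == NULL)
        return 0;
    for (i = 0; i < sizeof methods / sizeof *methods; i++) {
        if (strcmp(method, methods[i]) == 0)
            return 1;
    }
    return 0;
}
```

A line such as "FETCH /mail/ HTTP/1.1" has the right shape but fails the method check, while "GET /mail/ HTTP/1." has a known method but fails the version check.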

Dear Deceptikon,
                I don't follow what you mean by "programming against a standard document with semantic validation as well as syntactic validation." What is actually the big mistake in my solution? Please advise so I can avoid it.

What is actually the big mistake in my solution? Please advise so I can avoid it.

There's no big mistake, but you can do it better as per my examples. "Better" in this case meaning more flexible and easier to match the HTTP standard.

Dear Deceptikon,
               OK, thank you for the confirmation. Give me some time to study your code; I am not too good with this new stuff. Based on your code, I am going to send in the whole packet, right? Then how does this part work: if (strcmp(_strupr(parts[0]), methods[i]) == 0)? Because before this you split on spaces?

Based on your code, I am going to send in the whole packet, right?

My example parses only the request line defined here.
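Since the parser wants just the request line, one option is to cut that line out of the packet data first. The sketch below uses a hypothetical first_line helper, and it assumes the buffer passed in already holds the HTTP payload (any lower-level headers skipped) and may not be NUL-terminated:

```c
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper: copies the first line of payload (everything before
 * the first CRLF) into line as a NUL-terminated string. payload_len is the
 * number of valid bytes in payload. Returns 0 on success, or -1 if there is
 * no complete line in the payload or it does not fit in line.
 */
int first_line(const char *payload, size_t payload_len,
               char *line, size_t line_size)
{
    size_t i;

    if (payload == NULL || line == NULL || line_size == 0)
        return -1;

    for (i = 0; i + 1 < payload_len; i++) {
        if (payload[i] == '\r' && payload[i + 1] == '\n') {
            if (i >= line_size)
                return -1;             /* line does not fit in the buffer */
            memcpy(line, payload, i);
            line[i] = '\0';
            return 0;
        }
    }
    return -1;                         /* no CRLF found in this payload */
}
```

The resulting string, for example "GET /mail/ HTTP/1.1", can then be handed to a request line parser without the headers that follow it.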

Dear Deceptikon,
               Thank you for the link; I will study it further.