Read HTTP Headers from a C program
I need to write a small application that stores the HTML from any web server. I have written some code that fetches an HTML page along with its headers. Is there any way to distinguish these headers from the actual HTML content? And can I use the headers to find out the file name, its size, and so on, since that information is already in the headers?
Thanks in advance :)
shaikh_mshariq
Junior Poster in Training
71 posts since Mar 2006
Reputation Points: 12
Solved Threads: 1
You mean the same way that browsers understand the difference between the header and content?
http://www.rfc-editor.org/
Find the RFCs for HTTP and read them.
Salem
Posting Sage
11,531 posts since Dec 2005
Reputation Points: 5,862
Solved Threads: 953
Yes, the same as a browser does. I am using the same request-response method. Using a socket, I write a GET request to port 80 and wait for the reply. I am able to read the HTTP headers and the response from the socket.
A sample GET request would be:
GET /Scripts.htm HTTP 1.1\r\n\n
Is this the expected method to get the content?
Thanks again
shaikh_mshariq
Junior Poster in Training
71 posts since Mar 2006
Reputation Points: 12
Solved Threads: 1
Like I said, read the RFCs and be informed.
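For reference while reading: RFC 2616 gives the request line as method, request URI, and HTTP version (written HTTP/1.1, with a slash), every line ends in CRLF, an HTTP/1.1 request must carry a Host header, and a blank line ends the header section. A minimal sketch in C of building such a request is below. The function name build_get_request, the host www.example.com, and the Connection: close header are illustrative assumptions, not taken from the original code.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper: writes a minimal HTTP/1.1 GET request into buf.
   Returns the request length in bytes, or -1 if it does not fit in cap. */
int build_get_request(char *buf, size_t cap, const char *host, const char *path)
{
    assert(buf != NULL && host != NULL && path != NULL);

    int n = snprintf(buf, cap,
                     "GET %s HTTP/1.1\r\n"      /* request line */
                     "Host: %s\r\n"             /* required in HTTP/1.1 */
                     "Connection: close\r\n"    /* assumption: one request per socket */
                     "\r\n",                    /* blank line ends the headers */
                     path, host);

    /* snprintf reports the length it wanted to write; reject truncation. */
    if (n < 0 || (size_t)n >= cap)
        return -1;
    return n;
}
```

The returned length can be passed straight to send() or write() on the connected socket. Asking the server to close the connection keeps the reading side simple: the response ends when the socket reports end of file.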
Salem
Posting Sage
11,531 posts since Dec 2005
Reputation Points: 5,862
Solved Threads: 953
Thanks for the reply. It means I need to write a routine that handles HTTP transfers from scratch. Has anybody worked on this? Is there a header file or library that handles these HTTP transfers? Thanks again for your reply :)
shaikh_mshariq
Junior Poster in Training
71 posts since Mar 2006
Reputation Points: 12
Solved Threads: 1
Of course somebody has worked on this: somebody wrote your browser, somebody wrote the DaniWeb web-server engine. All these beasts generate and parse HTTP headers ;)
HTTP message header lines are simple text lines: what additional header file do you need to process text lines? An empty line (CRLF) terminates the header fields.
See: http://www.w3.org/Protocols/rfc2616/rfc2616-sec4.html#sec4
http://www.w3.org/Protocols/rfc2616/rfc2616-sec14.html
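Since the header section is plain text ending at the first empty line, separating it from the body comes down to finding the first CRLF CRLF sequence in the received bytes. Here is a rough sketch in C, assuming the whole header section is already in one buffer. The names find_body, same_name, and get_header are hypothetical helpers made up for this illustration. Header field names are case-insensitive, so the lookup compares them that way.

```c
#include <assert.h>
#include <ctype.h>
#include <string.h>

/* Returns a pointer to the first byte of the body, or NULL if the empty
   line that ends the header section has not been received yet. */
const char *find_body(const char *resp, size_t len)
{
    assert(resp != NULL);
    for (size_t i = 0; i + 4 <= len; i++) {
        if (memcmp(resp + i, "\r\n\r\n", 4) == 0)
            return resp + i + 4;
    }
    return NULL;
}

/* Case-insensitive comparison of the first n bytes (ASCII only). */
static int same_name(const char *a, const char *b, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        if (tolower((unsigned char)a[i]) != tolower((unsigned char)b[i]))
            return 0;
    }
    return 1;
}

/* Looks for header field `name` in the header section [hdr, hdr_end) and
   copies its value into out as a NUL-terminated string (truncated if it
   does not fit). Returns 1 if the field was found, 0 otherwise. */
int get_header(const char *hdr, const char *hdr_end, const char *name,
               char *out, size_t out_cap)
{
    assert(hdr != NULL && hdr_end != NULL && name != NULL);
    assert(out != NULL && out_cap > 0);

    size_t name_len = strlen(name);
    const char *line = hdr;

    while (line < hdr_end) {
        /* Find the end of the current line. */
        const char *eol = line;
        while (eol < hdr_end && *eol != '\r' && *eol != '\n')
            eol++;

        size_t line_len = (size_t)(eol - line);
        if (line_len > name_len && line[name_len] == ':' &&
            same_name(line, name, name_len)) {
            /* Skip the colon and any leading spaces or tabs. */
            const char *v = line + name_len + 1;
            while (v < eol && (*v == ' ' || *v == '\t'))
                v++;
            size_t vlen = (size_t)(eol - v);
            if (vlen >= out_cap)
                vlen = out_cap - 1;
            memcpy(out, v, vlen);
            out[vlen] = '\0';
            return 1;
        }

        /* Move past the line terminator to the next line. */
        line = eol;
        while (line < hdr_end && (*line == '\r' || *line == '\n'))
            line++;
    }
    return 0;
}
```

For the size, look up Content-Length and convert the value with strtol or strtoull. It can be absent, for example when the server uses chunked transfer encoding, so the code should not rely on it. For the file name, a Content-Disposition field may carry one, but many servers do not send it, and the usual fallback is the last component of the request path. This sketch does not handle header lines folded across several lines.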
You may also download the dlib C++ library:
http://dclib.sourceforge.net/
There is an example in the dlib distribution: a simple HTTP server. Read the code and run the example: it shows all the HTTP message header lines from your request.
The most interesting counter-question: are you sure you are talking about the HTTP message header? Or is it the HTML header? Feel the difference (by the way, the word HTTP does not appear anywhere in the body of your help request) ;)...
ArkM
Postaholic
2,001 posts since Jul 2008
Reputation Points: 1,234
Solved Threads: 348