I'm currently developing an app that goes through all the files on a server and checks every href to see whether it is valid. Using a WebClient or an HttpWebRequest/HttpWebResponse seems like overkill because it downloads the whole page each time, which is wasteful. I only need to check that the link does not return a 404.

What would be the most efficient way? A socket seems like a good way of doing it, but I'm not quite sure how that works.

Thanks for sharing your expertise!

All 3 Replies

Yeah, a socket can be used.

You simply create a TCP connection to port 80 and issue a request for the page.

This site describes the format of the request message.

Read the first line of the response header and check whether the status code is 404. Once you have the status code, you can discard the rest (break the connection) and move on to the next link.
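
The socket approach described above could be sketched roughly as follows. This is only an illustration under some assumptions: the class and method names (SocketLinkCheck, ParseStatusCode, GetStatusCode) are made up for this example, it handles plain HTTP on port 80 only (no HTTPS, no redirects), and it sends a GET and then drops the connection after reading the status line.

```csharp
using System;
using System.IO;
using System.Net.Sockets;
using System.Text;

// Hypothetical helper, not part of any library.
public static class SocketLinkCheck
{
    // Extracts the numeric code from a status line such as "HTTP/1.1 404 Not Found".
    // Returns -1 if the line is missing or malformed.
    public static int ParseStatusCode(string statusLine)
    {
        if (string.IsNullOrEmpty(statusLine)) return -1;
        string[] parts = statusLine.Split(' ');
        if (parts.Length < 2 || !parts[0].StartsWith("HTTP/", StringComparison.Ordinal)) return -1;
        int code;
        return int.TryParse(parts[1], out code) ? code : -1;
    }

    // Opens a TCP connection to port 80, requests the page, and reads only the status line.
    public static int GetStatusCode(string host, string path)
    {
        using (var client = new TcpClient(host, 80))
        using (NetworkStream stream = client.GetStream())
        {
            string request = "GET " + path + " HTTP/1.1\r\n" +
                             "Host: " + host + "\r\n" +
                             "Connection: close\r\n\r\n";
            byte[] bytes = Encoding.ASCII.GetBytes(request);
            stream.Write(bytes, 0, bytes.Length);

            using (var reader = new StreamReader(stream, Encoding.ASCII))
            {
                // The first response line is the status line. Everything after it
                // is discarded when the connection is disposed.
                return ParseStatusCode(reader.ReadLine());
            }
        }
    }
}
```

Calling SocketLinkCheck.GetStatusCode("example.com", "/missing-page") and comparing the result to 404 would be the check for a single link. The host and path here are placeholders.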

Do you think this would be faster than something like this (which I currently use)?

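var request = WebRequest.Create("http://google.com/");
request.Method = "HEAD"; // HEAD asks the server for headers only, so no body is downloaded
// Note: GetResponse() throws a WebException for 4xx/5xx statuses such as 404
using (var response = (HttpWebResponse)request.GetResponse())
{
    if (response.StatusCode == HttpStatusCode.OK)
    {
        // 200 OK
    }
}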

Your version is much better :)
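
One thing to watch with the HEAD version: GetResponse() throws a WebException when the server answers with an error status, so a 404 never reaches the StatusCode check. A possible way to handle that is sketched below. The names HeadLinkCheck, GetStatus, and IsBroken are invented for this example, and treating only 404 as "broken" is an assumption taken from the original question. On newer .NET versions WebRequest is marked obsolete in favor of HttpClient, so take this as a sketch for the API used in this thread.

```csharp
using System;
using System.Net;

// Hypothetical helper, not part of any library.
public static class HeadLinkCheck
{
    // Assumed policy: a link counts as broken only when the server returns 404.
    public static bool IsBroken(HttpStatusCode status)
    {
        return status == HttpStatusCode.NotFound;
    }

    // Sends a HEAD request and returns the status code,
    // or null if no HTTP response was received (DNS failure, timeout, ...).
    public static HttpStatusCode? GetStatus(string url)
    {
        var request = (HttpWebRequest)WebRequest.Create(url);
        request.Method = "HEAD";
        try
        {
            using (var response = (HttpWebResponse)request.GetResponse())
            {
                return response.StatusCode;
            }
        }
        catch (WebException ex)
        {
            // For error statuses the response is attached to the exception.
            using (var response = ex.Response as HttpWebResponse)
            {
                return response?.StatusCode;
            }
        }
    }
}
```

For each href, the caller would run GetStatus(url) and pass a non-null result to IsBroken. A null result means the server could not be reached at all, which probably deserves its own category in the report.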
