Hi Guys,

I have been trying to port a C++ console application that crawls forums
from a provided URL and stores the results in a database for further
analysis.

With very limited time, I have decided to replicate it in VB.NET instead, as I have come
across a few functions and classes there that are much easier to use.

Since I will be replicating the C++ application, I will follow the same design, outlined below:

First: Initiate forum connect
Second: Get list of forums and information from database
Third: Read individual forum URL and associated information
Fourth: Instantiate and run crawler for forum
Fifth: Forums remaining? If yes, go back to step 3; otherwise continue
Sixth: Close connection
Seventh: Exit program
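
The steps above can be sketched roughly as follows. This is only an outline in Python rather than VB.NET, under several assumptions: the `forums` table, its columns, the example URLs and the `crawl_forum` stub are all hypothetical placeholders, and an in-memory SQLite database stands in for whatever database is actually used.

```python
import sqlite3


def crawl_forum(url, name):
    """Hypothetical stand-in for the real crawler (step 4)."""
    return f"crawled {name} at {url}"


def run(conn):
    # Step 2: get the list of forums and their information from the database.
    forums = conn.execute("SELECT name, url FROM forums ORDER BY id").fetchall()
    results = []
    # Steps 3-5: read each forum's URL, run the crawler for it,
    # and repeat for as long as forums remain.
    for name, url in forums:
        results.append(crawl_forum(url, name))
    return results


def main():
    # Step 1: open the connection (assumed: an in-memory SQLite database).
    conn = sqlite3.connect(":memory:")
    try:
        # Assumed schema and sample rows, for illustration only.
        conn.execute(
            "CREATE TABLE forums (id INTEGER PRIMARY KEY, name TEXT, url TEXT)"
        )
        conn.executemany(
            "INSERT INTO forums (name, url) VALUES (?, ?)",
            [
                ("General", "http://example.com/general"),
                ("Support", "http://example.com/support"),
            ],
        )
        return run(conn)
    finally:
        # Step 6: close the connection even if a crawl fails.
        conn.close()


if __name__ == "__main__":
    # Step 7: the program exits once every forum has been processed.
    for line in main():
        print(line)
```

Closing the connection in a `finally` block means one failing forum does not leave the database connection open.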

Anyhow, I am looking for advice on how to download the contents of, say, a thread inside a forum and store them in a database, where they can be parsed for specific information.

Please advise.

Thanks


The .NET Framework has classes for making HTTP requests (in the System.Net namespace), and an open-source library, "Html Agility Pack", is available for parsing HTML.
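
To illustrate the general shape of "fetch, then parse", here is a minimal sketch. It is written in Python with only the standard library, so it is not the System.Net or Html Agility Pack API. It also assumes, purely for illustration, that each post sits in a `<div class="post">` element; real forum markup will differ, and the names `fetch`, `PostExtractor` and `extract_posts` are made up for this example.

```python
from html.parser import HTMLParser
from urllib.request import urlopen


def fetch(url, timeout=10):
    """Download a page and decode it using the charset the server reports."""
    with urlopen(url, timeout=timeout) as resp:
        charset = resp.headers.get_content_charset() or "utf-8"
        return resp.read().decode(charset, errors="replace")


class PostExtractor(HTMLParser):
    """Collects the text of every <div class="post"> (assumed markup)."""

    def __init__(self):
        super().__init__()
        self.posts = []
        self._depth = 0  # <div> nesting depth inside a post; 0 means outside
        self._chunks = []

    def handle_starttag(self, tag, attrs):
        if tag != "div":
            return
        if self._depth:
            self._depth += 1  # nested <div> inside the current post
        elif "post" in (dict(attrs).get("class") or "").split():
            self._depth = 1
            self._chunks = []

    def handle_endtag(self, tag):
        if tag == "div" and self._depth:
            self._depth -= 1
            if self._depth == 0:
                # Collapse runs of whitespace into single spaces.
                self.posts.append(" ".join("".join(self._chunks).split()))

    def handle_data(self, data):
        if self._depth:
            self._chunks.append(data)


def extract_posts(html):
    parser = PostExtractor()
    parser.feed(html)
    parser.close()
    return parser.posts
```

A call such as `extract_posts(fetch(thread_url))` would then return one string per post, ready to be written to the database.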

Thanks for your reply :)

I have no problem grabbing the source from a given website, but my main
concern is that some pages contain XML, so I am not sure how to design
the parser to work with both HTML and XML.
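
One possible approach, sketched here in Python rather than VB.NET, is to decide per document which parser to use: try a strict XML parser when the content looks like XML, and fall back to a lenient HTML parser otherwise. The function name `extract_text` and the detection rule (Content-Type header or a leading `<?xml` declaration) are assumptions for illustration, not a complete content-sniffing strategy.

```python
import xml.etree.ElementTree as ET
from html.parser import HTMLParser


class _TextCollector(HTMLParser):
    """Lenient fallback: gathers every non-blank text node."""

    def __init__(self):
        super().__init__()
        self.chunks = []

    def handle_data(self, data):
        text = data.strip()
        if text:
            self.chunks.append(text)


def extract_text(body, content_type=""):
    """Return (kind, pieces), where kind is "xml" or "html"."""
    ctype = content_type.lower()
    # Assumed heuristic: an XML media type that is not XHTML,
    # or a document that starts with an XML declaration.
    header_says_xml = "xml" in ctype and "html" not in ctype
    starts_like_xml = body.lstrip().startswith("<?xml")
    if header_says_xml or starts_like_xml:
        try:
            root = ET.fromstring(body)
            return "xml", [t.strip() for t in root.itertext() if t.strip()]
        except ET.ParseError:
            pass  # not well-formed XML, so fall through to the HTML parser
    collector = _TextCollector()
    collector.feed(body)
    collector.close()
    return "html", collector.chunks
```

Because both branches return the same shape, the code that stores results in the database does not need to know which parser handled the page.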
