Hi,
I would like to know whether data parsing can be done with a Perl script: check a specified web page, fetch the data from the page, and load it into a SQL Server table.

The script should read a value called AS.Id (for example, 20090403) from one SQL Server table, then request the web page for the next value of AS.Id (20090404) in the URL and load its data. It should keep doing this for as long as a page exists for the AS.Id in the URL.
All of the data should be copied into a text file, which then needs to be loaded into the DB so that all the needed data is in the tables.

Please let me know whether this is possible in Perl, as I am totally new to Perl. If it can be done, I will need to learn Perl.

Thanks in advance,
Sowmya

Thanks for the reply. Could you please point me to any sample code (or a website) of a similar kind? It would be of very great help to me.


Thanks in advance,
Sowmya

The page-scraping is something that Perl excels at. That part, you'll probably code from scratch, as it depends on the structure of the page you're scraping. Start out by dumping your results into plain text, or a flatfile database, until you're getting your data reliably; you'll save time by not tackling two major tasks at once. (I've not checked recently, but there might be CPAN modules that can help you with the page-scraping.)

When you start, save a copy of the target page's source to disk and write code to extract the data from it. Once that's working, you can use wget (found on most Linux distributions; for a Windows port, look around at http://www.gnu.org) to fetch live pages to be scraped.
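As a rough sketch of that first step, the script below reads a saved copy of the page and writes what it finds to a tab-separated text file. The page layout is an assumption (one record per <tr>, one field per <td>), and the names extract_rows and write_flatfile are made up for illustration; a real page will almost certainly need different patterns or a proper HTML parser from CPAN.

```perl
use strict;
use warnings;

# Assumed layout: each record is one <tr>, each field one <td>.
# Rows without any <td> (header rows using <th>, for instance) are skipped.
sub extract_rows {
    my ($html) = @_;
    my @rows;
    while ($html =~ m{<tr[^>]*>(.*?)</tr>}gis) {
        my $tr = $1;
        my @cells;
        while ($tr =~ m{<td[^>]*>(.*?)</td>}gis) {
            my $cell = $1;
            $cell =~ s/<[^>]+>//g;     # drop any tags nested inside the cell
            $cell =~ s/^\s+|\s+$//g;   # trim surrounding whitespace
            push @cells, $cell;
        }
        push @rows, \@cells if @cells;
    }
    return @rows;
}

# Write each row as one tab-separated line.
sub write_flatfile {
    my ($path, @rows) = @_;
    open my $out, '>', $path or die "Cannot write $path: $!";
    print {$out} join("\t", @$_), "\n" for @rows;
    close $out or die "Cannot close $path: $!";
}

# Runs only when called as: perl scrape.pl saved_page.html results.txt
if (@ARGV == 2) {
    my ($in, $outfile) = @ARGV;
    open my $fh, '<', $in or die "Cannot read $in: $!";
    my $html = do { local $/; <$fh> };   # slurp the whole file
    close $fh;
    write_flatfile($outfile, extract_rows($html));
}
```

Working against a saved file means you can rerun the extraction as often as you like while adjusting the patterns, without hitting the live site.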

For the database side, there are CPAN modules you can and should use. Go to http://www.cpan.org and put "SQL", and then "DBI", in the search field -- and read the O'Reilly book "Programming the Perl DBI". That problem's pretty much solved; it's a matter of hooking in the existing working library code and setting it to do what you want.

You can write up something to import your flatfile database into SQL; then, when that's working, clump the two programs into one to complete the automation of your data collection. Breaking the task down into major components like this will save you a lot of confusion and frustration; I speak from experience.
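For the import step, something along these lines might serve as a starting point. It reads a tab-separated flat file and inserts each line through a prepared DBI statement. The table name scraped_data, the columns as_id and payload, the DSN MyDsn and the credentials are all placeholders, and reaching SQL Server through DBD::ODBC is one common route rather than the only one.

```perl
use strict;
use warnings;

# Split one tab-separated line into fields; blank lines give an empty list.
sub parse_line {
    my ($line) = @_;
    $line =~ s/\r?\n\z//;            # strip Unix or Windows line ending
    return () if $line =~ /^\s*$/;
    return split /\t/, $line, -1;    # -1 keeps trailing empty fields
}

# Insert every record from the flat file with a prepared statement.
# $dbh is any connected DBI handle; table and column names are hypothetical.
sub load_flatfile {
    my ($dbh, $path) = @_;
    my $sth = $dbh->prepare(
        'INSERT INTO scraped_data (as_id, payload) VALUES (?, ?)');
    open my $in, '<', $path or die "Cannot read $path: $!";
    my $count = 0;
    while (my $line = <$in>) {
        my @fields = parse_line($line) or next;   # skip blank lines
        $sth->execute(@fields[0, 1]);             # first two fields only
        $count++;
    }
    close $in;
    return $count;
}

# Runs only when called as: perl load.pl results.txt
if (@ARGV == 1) {
    require DBI;
    # Placeholder DSN and credentials; replace with your own.
    my $dbh = DBI->connect('dbi:ODBC:MyDsn', 'user', 'password',
                           { RaiseError => 1, AutoCommit => 0 });
    my $n = load_flatfile($dbh, $ARGV[0]);
    $dbh->commit;
    $dbh->disconnect;
    print "Loaded $n rows\n";
}
```

The placeholders (?) let DBI handle quoting, so scraped text containing apostrophes does not break the INSERT. With AutoCommit off, nothing is committed unless the whole file loads without error.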

Try the LWP module; it's clean and highly efficient. You just need to make sure that you put in checks to see that you receive the proper data.
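Here is a minimal sketch of what fetching with LWP and those checks could look like for the original question. It assumes the AS.Id is a date in YYYYMMDD form (so the value after 20090430 is 20090501) and that the id is simply appended to a base URL; both are guesses about the site, and next_id and fetch_all are illustrative names.

```perl
use strict;
use warnings;
use Time::Local qw(timegm);
use POSIX qw(strftime);

# Assumption: ids are dates in YYYYMMDD form, so "next" means the next day.
sub next_id {
    my ($id) = @_;
    my ($y, $m, $d) = $id =~ /^(\d{4})(\d{2})(\d{2})$/
        or die "Unexpected id format: $id";
    my $t = timegm(0, 0, 0, $d, $m - 1, $y) + 24 * 60 * 60;
    return strftime('%Y%m%d', gmtime($t));
}

# Fetch pages for the ids after $id until one is missing or empty.
# $ua is an LWP::UserAgent (or anything with a compatible get method).
sub fetch_all {
    my ($ua, $base_url, $id, $max) = @_;
    $max = 1000 unless defined $max;     # safety cap on the number of requests
    my %pages;
    for (1 .. $max) {
        $id = next_id($id);
        my $res = $ua->get($base_url . $id);
        last unless $res->is_success;    # e.g. 404: no page for this id
        my $body = $res->decoded_content;
        last unless defined $body && length $body;   # a page, but no data
        $pages{$id} = $body;
    }
    return \%pages;
}

# Runs only when called as: perl fetch.pl http://example.com/page?id= 20090403
if (@ARGV == 2) {
    require LWP::UserAgent;
    my $ua = LWP::UserAgent->new(timeout => 30);
    my $pages = fetch_all($ua, @ARGV);
    print "$_: ", length($pages->{$_}), " bytes\n" for sort keys %$pages;
}
```

The starting id would normally come from the database (the last AS.Id already loaded), and each fetched page would then go through the extraction code before being written to the flat file.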
