
Data fetch from web page

Hi,
I would like to know whether data parsing can be done with a Perl script: check a specified web page, fetch the data from that page, and load it into a SQL Server table.

The script should read a value called AS.Id (for example, 20090403) from one SQL Server table, then look for the next AS.Id value (20090404) in the URL, fetch that page, and load its data. It should keep doing this for as long as a page for the AS.Id exists at the URL.
All of the data should be copied into a text file, which then needs to be loaded into the database so that all of the required data ends up in the tables.

Please let me know whether this is possible in Perl, as I am totally new to it. If it can be done, I will need to learn Perl.

Thanks in advance,
Sowmya

sowmyav

It sounds possible. Good luck.

KevinADC

Thanks for the reply. Could you please point me to any sample code (or a website) for something similar? It would be a great help to me.


Thanks in advance,
Sowmya

sowmyav

The page-scraping is something that Perl excels at. That part, you'll probably code from scratch, as it depends on the structure of the page you're scraping. Start out by dumping your results into plain text, or a flatfile database, until you're getting your data reliably; you'll save time by not tackling two major tasks at once. (I've not checked recently, but there might be CPAN modules that can help you with the page-scraping.)

When you start, save a copy of the target page's source to disk and write code to extract the data from it. Once that's working, you can use wget (found on most Linux distributions; for a Windows port, look around at http://www.gnu.org ) to fetch live pages to be scraped.
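
As a minimal sketch of that first step: assuming the saved page keeps its data in a plain HTML table with one record per row (an assumption; the real page will likely differ), something like this pulls out the cell text and prints it tab-delimited. The name extract_rows is made up for illustration, and regex matching is only good enough for simple, regular markup.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Pull the cell text out of every <tr> in a chunk of HTML.
# Assumes a simple, hypothetical layout: one record per row, plain <td> cells.
sub extract_rows {
    my ($html) = @_;
    my @rows;
    while ($html =~ m{<tr\b[^>]*>(.*?)</tr>}gis) {
        my $row   = $1;
        my @cells = $row =~ m{<td\b[^>]*>(.*?)</td>}gis;
        next unless @cells;          # skip rows without <td>, e.g. <th> headers
        for (@cells) {
            s/<[^>]+>//g;            # drop any nested tags
            s/^\s+|\s+$//g;          # trim surrounding whitespace
        }
        push @rows, [@cells];
    }
    return @rows;
}

# Usage: perl extract.pl saved_page.html > data.txt
if (my $file = shift @ARGV) {
    open my $fh, '<', $file or die "Cannot read $file: $!";
    my $html = do { local $/; <$fh> };   # slurp the whole saved page
    close $fh;
    print join("\t", @$_), "\n" for extract_rows($html);
}
```

Once the output looks right for the saved copy, the same function can be pointed at freshly fetched pages.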

For the database side, there are CPAN modules you can and should use. Go to http://www.cpan.org and put "SQL", and then "DBI", in the search field -- and read the O'Reilly book "Programming the Perl DBI". That problem's pretty much solved; it's a matter of hooking in the existing working library code and setting it to do what you want.
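
To give a rough idea of what the DBI side can look like, here is a sketch that loads a tab-delimited text file into a table. The table and column names (scraped_data, as_id, value), the ODBC data source name MyDsn, and the credentials are all placeholders I made up, and connecting to SQL Server this way assumes the DBD::ODBC driver is installed.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use DBI;

# Insert each tab-delimited line from a filehandle into a two-column table.
# Table and column names (scraped_data, as_id, value) are hypothetical.
sub load_rows {
    my ($dbh, $fh) = @_;
    my $sth = $dbh->prepare(
        'INSERT INTO scraped_data (as_id, value) VALUES (?, ?)'
    );
    my $count = 0;
    while (my $line = <$fh>) {
        chomp $line;
        next unless length $line;            # ignore blank lines
        my ($as_id, $value) = split /\t/, $line, 2;
        $sth->execute($as_id, $value);       # placeholders handle quoting
        $count++;
    }
    return $count;
}

# Usage: perl load.pl data.txt
if (my $file = shift @ARGV) {
    # Assumed DSN and credentials; requires the DBD::ODBC driver.
    my $dbh = DBI->connect('dbi:ODBC:MyDsn', 'user', 'password',
        { RaiseError => 1, AutoCommit => 0 });
    open my $fh, '<', $file or die "Cannot read $file: $!";
    my $loaded = eval { load_rows($dbh, $fh) };
    if ($@) {
        $dbh->rollback;                      # leave the table untouched on error
        die "Load failed, rolled back: $@";
    }
    $dbh->commit;
    $dbh->disconnect;
    print "Loaded $loaded rows\n";
}
```

Turning AutoCommit off means a bad line halfway through the file does not leave a half-loaded table behind.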

You can write up something to import your flatfile database into SQL; then, when that's working, clump the two programs into one to complete the automation of your data collection. Breaking the task down into major components like this will save you a lot of confusion and frustration; I speak from experience.

crb3

Try the LWP module; it's clean and highly efficient. Just make sure you put in checks to confirm that you received the proper data.
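
Here is a rough sketch of how that could look with LWP::UserAgent, walking forward from a starting id until a page is missing. The URL pattern is a placeholder, and I am assuming the ids are calendar dates in YYYYMMDD form (so 20090430 is followed by 20090501); if they are plain counters, next_id would just add 1. The function names are my own.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use LWP::UserAgent;
use Time::Local qw(timegm);
use POSIX qw(strftime);

# Return the id for the following day, assuming ids are YYYYMMDD dates.
sub next_id {
    my ($id) = @_;
    my ($y, $m, $d) = $id =~ /^(\d{4})(\d{2})(\d{2})$/
        or die "Unexpected id format: $id";
    my $t = timegm(0, 0, 12, $d, $m - 1, $y) + 24 * 60 * 60;
    return strftime('%Y%m%d', gmtime $t);
}

# Fetch a page; return its content, or undef if the request failed
# or the body came back empty.
sub fetch_page {
    my ($ua, $url) = @_;
    my $res = $ua->get($url);
    return unless $res->is_success;          # 404, timeout, server error...
    my $html = $res->decoded_content;
    return unless defined $html && length $html;
    return $html;
}

# Usage: perl fetch.pl 20090404
if (my $id = shift @ARGV) {
    my $ua = LWP::UserAgent->new(timeout => 30);
    # Hypothetical URL pattern; substitute the real site's layout.
    while (defined(my $html = fetch_page($ua, "http://www.example.com/data/$id.html"))) {
        open my $out, '>:encoding(UTF-8)', "$id.html"
            or die "Cannot write $id.html: $!";
        print {$out} $html;
        close $out or die "Cannot close $id.html: $!";
        $id = next_id($id);
    }
    print "Stopped at $id (page missing or fetch failed)\n";
}
```

The is_success check only tells you the server answered with a 2xx status; it is still worth checking that the page actually contains the data you expect before loading it.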

kenji
