
I am an undergraduate student in the Computer Science and Engineering department.

I can build a crawler in Perl for one particular website to fetch the information I need: in my case, the job ads on that company's webpage.

Now I want to build a crawler, still in Perl, that is generalized to work across roughly 100 companies.

How can I do it? I need some ideas/code/resources. Do I need to study the HTML of all 100 sites?

Regards,
Kunal


Look into HTML::Parser. It is event driven and tag driven: you register handler functions that fire when a tag opens or closes, and decide in those handlers how to deal with the content. Play around with it and see if you can parse HTML more effectively with it.
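To make the event-driven idea concrete, here is a minimal sketch using HTML::Parser's version-3 API. It pulls the link text out of `<a>` tags whose `href` contains "job", on a canned sample page rather than a live fetch; the sample markup and the `/job/` pattern are made-up examples, not any real site's layout.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use HTML::Parser;

my @job_links;        # collected link texts
my $inside_link = 0;  # true while we are inside an interesting <a> tag

my $parser = HTML::Parser->new(
    api_version => 3,
    # Fired when a tag opens: note when we enter a job-related <a>.
    start_h => [ sub {
        my ($tagname, $attr) = @_;
        if ($tagname eq 'a' && defined $attr->{href} && $attr->{href} =~ /job/i) {
            $inside_link = 1;
        }
    }, 'tagname, attr' ],
    # Fired for text between tags: keep it if we are inside such a link.
    text_h => [ sub {
        my ($text) = @_;
        push @job_links, $text if $inside_link;
    }, 'dtext' ],
    # Fired when a tag closes: leaving the <a> ends the interesting region.
    end_h => [ sub {
        my ($tagname) = @_;
        $inside_link = 0 if $tagname eq 'a';
    }, 'tagname' ],
);

# A small inline sample page instead of a live fetch.
my $html = <<'HTML';
<html><body>
<a href="/jobs/1">Perl Developer</a>
<a href="/about">About us</a>
<a href="/jobs/2">Data Engineer</a>
</body></html>
HTML

$parser->parse($html);
$parser->eof;

print "$_\n" for @job_links;
```

In a real crawler you would feed `$parser->parse` the page you fetched (e.g. with LWP) instead of the here-doc.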


Hi,
thanks for the post.

I have made a crawler for one website, and it really is tied to the job portal on that site and its HTML coding: which tags open and close, and accordingly the data retrieval between them (the parts I need).

But I really can't figure it out: there are 100 web pages in front of me, I need to create one common scraper, and all of their HTML structures/tags are different.
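One common way around this (an approach of mine, not something from this thread) is to keep a single generic engine and push the per-site differences into a small configuration table, so each new company costs you one config entry instead of a whole new crawler. The sketch below uses per-site regexes over canned HTML; the site names, URLs, and patterns are all made-up examples.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Per-site rules: where to fetch and how to recognize one job title.
# Everything here is a hypothetical example, not a real site's markup.
my %site_config = (
    'example-corp' => {
        url    => 'http://example-corp.example/jobs',
        job_re => qr{<li class="job">\s*(.*?)\s*</li>}s,
    },
    'acme' => {
        url    => 'http://acme.example/careers',
        job_re => qr{<h3>(.*?)</h3>}s,
    },
);

# The generic engine: apply the site's own rule to a fetched page.
sub extract_jobs {
    my ($site, $html) = @_;
    my $re = $site_config{$site}{job_re};
    return $html =~ /$re/g;   # one capture per job title
}

# In a real crawler you would fetch $site_config{$site}{url} with
# LWP::UserAgent; canned pages stand in for that here.
my %sample_page = (
    'example-corp' => '<ul><li class="job">Perl Hacker</li><li class="job">QA Tester</li></ul>',
    'acme'         => '<div><h3>Sysadmin</h3></div>',
);

my %found;
for my $site (sort keys %site_config) {
    my @jobs = extract_jobs($site, $sample_page{$site});
    $found{$site} = \@jobs;
    print "$site: @jobs\n";
}
```

The same split works with HTML::Parser handlers instead of regexes: the config would then hold tag names and attribute patterns rather than `qr{}` rules. You still have to look at each site's HTML once to write its entry, but only once, and only a few lines each.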
