I am trying to make a crawler that crawls a web page and retrieves
the stock information from Google, but I can't get it to work.
Please help me make this type of crawler.
It's urgent, please.

Hi Arun.N,

Sounds like you're looking for cURL. Have a look at the cURL documentation and see what you think.

cURL + regular expressions (preg_match_all) = exactly what you're looking for.

I've written a few of these "crawlers" myself, so I'll include some foundational code for a very simple one for you:

<?php

// Return a handle to a curl connection to the site you want to pull info from
$ch = curl_init('http://finance.google.com/finance');

// Set some options for the connection
curl_setopt($ch, CURLOPT_HEADER, 0);         // Don't include header information in the output, although this can be handy ;)
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1); // Return the page source as a string instead of printing it

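// Run the request with the options specified
$cr = curl_exec($ch);

// curl_exec() returns false on failure, so stop here if the request did not work
if ($cr === false) {
    die('cURL error: ' . curl_error($ch));
}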

// Run your regular expression against the source to pull out what you want.
// If you want, you can use external programs to tidy the HTML for easier parsing before you scan it.
// This pattern captures the value of every href="..." attribute.
preg_match_all('/href="([^"]*)"/i', $cr, $pm, PREG_SET_ORDER);

// So you can see what you found
print_r($pm);

// Display the results again :D
foreach($pm as $pv) echo $pv[1] . "\r\n";

?>
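
If you want to try the regular expression step without hitting the network, here is a minimal sketch that runs the same kind of pattern against a literal string. The sample markup and the names $html, $matches and $links are made up for illustration, and the pattern is just one reasonable way to capture an href value, not the only one.

```php
<?php

// Hypothetical sample markup standing in for a fetched page
$html = '<a href="/finance?q=GOOG">GOOG</a> <A HREF="/finance?q=MSFT">MSFT</A>';

// Capture everything between the quotes of each href attribute.
// The /i modifier makes the match case-insensitive, so HREF matches too.
preg_match_all('/href="([^"]*)"/i', $html, $matches, PREG_SET_ORDER);

// With PREG_SET_ORDER each element is one match:
// index 0 is the full matched text, index 1 is the first capture group
$links = array();
foreach ($matches as $match) {
    $links[] = $match[1];
}

print_r($links);
```

Once this prints the links you expect, you could swap $html for the string returned by curl_exec().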

Hope this helps!

<?php

$ch = curl_init("http://www.example.com/");
$fp = fopen("example_homepage.txt", "w");

curl_setopt($ch, CURLOPT_FILE, $fp);
curl_setopt($ch, CURLOPT_HEADER, 0);

curl_exec($ch);
curl_close($ch);
fclose($fp);
?>

Hi friends,
I need some help from you guys. I need a crawler that tracks changes in a website's content and shows them with the old content and the new content side by side.
