How to make a crawler to fetch a particular web page's content
Hi Arun.N,
Sounds like you're looking for cURL... Have a look at the cURL documentation and see what you think.
cURL + regular expressions (preg_match_all) = exactly what you're looking for.
I've written a few of these "crawlers" myself, so I'll include some foundational code for a very simple one for you:
<?php
// Return a handle to a curl connection to the site you want to pull info from
$ch = curl_init('http://finance.google.com/finance');

// Set some options for the connection
curl_setopt($ch, CURLOPT_HEADER, 0);         // Don't return header information, although this can be handy ;)
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1); // Give us the page source

// Open the connection with the options specified
$cr = curl_exec($ch);

// Run your regular expression against the source to pull what you want.
// You can use external programs to format the HTML for easier parsing before you scan it.
// (The capture group was empty in the posted pattern; [^"]* is an assumed
// reconstruction that grabs everything between the quotes.)
preg_match_all('/href="([^"]*)"/i', $cr, $pm, PREG_SET_ORDER);

// So you can see what you found
print_r($pm);

// Display the results again :D
foreach ($pm as $pv) {
    echo $pv[1] . "\r\n";
}
?>
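A hedged variation on the snippet above, in case it is useful: it splits the fetch and the regular-expression scan into two small functions and adds basic failure checks. The function names `fetch_page()` and `extract_hrefs()` are made up for illustration, and the timeout and redirect settings are assumptions, not something the original post specifies. Treat it as a minimal sketch rather than a finished crawler.

```php
<?php
// Hypothetical helper names; neither function appears in the original post.

// Fetch a URL with cURL and return its body, or null if the request fails.
function fetch_page(string $url): ?string
{
    $ch = curl_init($url);
    if ($ch === false) {
        return null;                                 // could not create a handle
    }
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);  // return the body instead of printing it
    curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true);  // follow redirects (assumed to be wanted)
    curl_setopt($ch, CURLOPT_TIMEOUT, 10);           // assumed limit: give up after 10 seconds
    $body = curl_exec($ch);
    return $body === false ? null : $body;
}

// Collect the value of every single- or double-quoted href attribute.
function extract_hrefs(string $html): array
{
    // Group 1 remembers which quote opened the value; group 2 is the value itself.
    preg_match_all('/href\s*=\s*(["\'])(.*?)\1/is', $html, $matches);
    return $matches[2];
}
```

Something like `print_r(extract_hrefs(fetch_page('http://www.example.com/') ?? ''));` would then list the links found on that page. A regular expression is fine for quick scans like this, but it will miss unquoted attributes and can be confused by unusual markup.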
Hope this helps!
Last edited by chrelad; Jan 3rd, 2008 at 3:36 pm. Reason: Forgot to link to cURL documentation for PHP
Join Date: Jan 2008
Posts: 2
Reputation:
Solved Threads: 0
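<?php
$ch = curl_init("http://www.example.com/");   // handle for the page to download
$fp = fopen("example_homepage.txt", "w");     // local file that will receive the page
curl_setopt($ch, CURLOPT_FILE, $fp);          // write the response body into that file
curl_setopt($ch, CURLOPT_HEADER, 0);          // leave the response headers out
curl_exec($ch);                               // perform the request
curl_close($ch);
fclose($fp);
?>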
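The snippet above assumes that both the file and the request succeed. As a rough sketch of the same idea with failure checks, the hypothetical `save_page()` helper below (the name and the use of `CURLOPT_FAILONERROR` are my own additions, not part of the original post) returns `false` when the output file cannot be opened or the transfer fails:

```php
<?php
// Hypothetical helper; the name save_page() is not from the original post.
// Downloads $url into $path and reports whether the transfer succeeded.
function save_page(string $url, string $path): bool
{
    $fp = @fopen($path, 'w');                     // suppress the warning; the result is checked instead
    if ($fp === false) {
        return false;                             // could not open the output file
    }
    $ch = curl_init($url);
    if ($ch === false) {
        fclose($fp);
        return false;                             // could not create a cURL handle
    }
    curl_setopt($ch, CURLOPT_FILE, $fp);          // stream the body straight into the file
    curl_setopt($ch, CURLOPT_HEADER, false);      // leave response headers out of the file
    curl_setopt($ch, CURLOPT_FAILONERROR, true);  // treat HTTP status >= 400 as a failure
    $ok = curl_exec($ch);
    unset($ch);                                   // release the handle before closing the file
    fclose($fp);
    return $ok !== false;
}
```

A call such as `save_page('http://www.example.com/', 'example_homepage.txt')` mirrors the original example, with the return value telling you whether the page was actually saved.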