how to make a crawler to fetch particular web page's content

Arun.N (Newbie Poster)
#1, Jan 3rd, 2008
I am trying to write a crawler that fetches a web page and retrieves
stock information from Google, but I can't get it to work.
Please help me build this type of crawler.
It's urgent, please.

chrelad (Light Poster)
#2, Jan 3rd, 2008
Hi Arun.N,

Sounds like you're looking for cURL. Have a look at the cURL documentation and see what you think.

cURL + regular expressions (preg_match_all) = exactly what you're looking for.

I've written a few of these "crawlers" myself, so I'll include some foundational code for a very simple one:

  <?php

  // Create a cURL handle for the site you want to pull info from
  $ch = curl_init('http://finance.google.com/finance');

  // Set some options for the connection
  curl_setopt($ch, CURLOPT_HEADER, 0);         // Don't return header information, although this can be handy ;)
  curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1); // Return the page source as a string instead of printing it

  // Run the request with the options specified; curl_exec() returns false on failure
  $cr = curl_exec($ch);
  if ($cr === false) {
      die('cURL error: ' . curl_error($ch));
  }
  curl_close($ch);

  // Run your regular expression against the source to pull what you want. You can use
  // external programs to format the HTML for easier parsing before you scan it.
  // This pattern captures the value of every href="..." attribute.
  preg_match_all('/href="([^"]*)"/i', $cr, $pm, PREG_SET_ORDER);

  // So you can see what you found
  print_r($pm);

  // Display just the captured values, one per line
  foreach ($pm as $pv) echo $pv[1] . "\r\n";

  ?>
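
To try the regular expression step without hitting the network, here is a rough sketch that runs preg_match_all against a hard-coded string. The function name extract_hrefs and the sample markup are made up for illustration and are not taken from any real page. Note that a capture group only returns text when it has a pattern inside it, which is why this sketch uses ([^"]*) between the quotes.

```php
<?php

// Hypothetical helper: return the value of every href="..." attribute in $html.
function extract_hrefs($html)
{
    // ([^"]*) captures everything up to the closing quote; /i also matches HREF.
    preg_match_all('/href="([^"]*)"/i', $html, $matches);

    // With the default flags, $matches[1] holds the first capture group of each match.
    return $matches[1];
}

// Made-up sample markup, standing in for the source returned by curl_exec().
$sample = '<a href="/finance?q=GOOG">GOOG</a> <a HREF="/finance?q=MSFT">MSFT</a>';

foreach (extract_hrefs($sample) as $href) {
    echo $href . "\n";
}
```

Once the pattern behaves on a sample like this, the same call can be pointed at the string that curl_exec returns.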

Hope this helps!
Last edited by chrelad; Jan 3rd, 2008 at 3:36 pm. Reason: Forgot to link to cURL documentation for PHP

mario.stoica (Newbie Poster)
#3, Jan 3rd, 2008
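<?php

// Open a cURL handle for the page and a local file to save it to
$ch = curl_init("http://www.example.com/");
$fp = fopen("example_homepage.txt", "w");

curl_setopt($ch, CURLOPT_FILE, $fp);  // Write the response body straight to the file
curl_setopt($ch, CURLOPT_HEADER, 0);  // Leave response headers out of the file

// Run the request, then release the handle and close the file
curl_exec($ch);
curl_close($ch);
fclose($fp);
?>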
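
Once the page is saved to example_homepage.txt, one possible next step is to read it back and pull the links out. The sketch below uses PHP's DOMDocument instead of a regular expression, which is a swapped-in technique and not something the posts above use. The function name links_from_html is hypothetical, and the sketch assumes the DOM extension is enabled.

```php
<?php

// Hypothetical helper: return the href of every <a> element found in $html.
function links_from_html($html)
{
    if ($html === '') {
        return array(); // loadHTML() rejects an empty string
    }

    // Real-world HTML is rarely valid; collect parser warnings instead of printing them.
    $previous = libxml_use_internal_errors(true);

    $doc = new DOMDocument();
    $doc->loadHTML($html);

    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $links = array();
    foreach ($doc->getElementsByTagName('a') as $anchor) {
        if ($anchor->hasAttribute('href')) {
            $links[] = $anchor->getAttribute('href');
        }
    }
    return $links;
}

// Only runs if the file written by the cURL script above is present.
if (is_file('example_homepage.txt')) {
    foreach (links_from_html(file_get_contents('example_homepage.txt')) as $link) {
        echo $link . "\n";
    }
}
```

A parser tends to cope better than a regular expression with attributes in a different order or with single quotes, at the cost of loading the whole document into memory.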

©2003 - 2009 DaniWeb® LLC