
Why doesn't the cURL function in PHP work for some sites, like dir.yahoo.com?

function getThePage($page_url)
{
    $options = array(
        CURLOPT_RETURNTRANSFER => true,     // return web page
        CURLOPT_HEADER         => false,    // don't return headers
        CURLOPT_FOLLOWLOCATION => true,     // follow redirects
        CURLOPT_ENCODING       => "",       // handle all encodings
        CURLOPT_USERAGENT      => "spider", // who am i
        CURLOPT_AUTOREFERER    => true,     // set referer on redirect
        CURLOPT_CONNECTTIMEOUT => 120,      // timeout on connect
        CURLOPT_TIMEOUT        => 120,      // timeout on response
        CURLOPT_MAXREDIRS      => 10,       // stop after 10 redirects
    );

    $ch = curl_init($page_url) or die("Cannot initialize");
    curl_setopt_array($ch, $options) or die("Cannot set options");

    // No die() here: curl_exec() returns false on failure (and "" for an
    // empty page), and the error details gathered below would be lost.
    $content = curl_exec($ch);

    $err    = curl_errno($ch);
    $errmsg = curl_error($ch);
    $header = curl_getinfo($ch);
    curl_close($ch);

    $header['errno']   = $err;
    $header['errmsg']  = $errmsg;
    $header['content'] = $content;
    return $header;
}

I used this code, but I can't open some sites, such as http://dir.yahoo.com and www.cemunnar.org.



Instead of

CURLOPT_USERAGENT => "spider", // who am i

try providing a value that identifies a popular browser. For example, try this instead:

CURLOPT_USERAGENT => 'Mozilla/5.0 (Windows; U; Windows NT 5.1; pl; rv:1.9) Gecko/2008052906 Firefox/3.0', // who am i
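As a rough sketch of that suggestion, the option array from the question could be built by a small helper that takes the User-Agent as a parameter. The helper name `buildCurlOptions` and the variable `$browserUserAgent` are hypothetical, introduced only for illustration; the option values are the ones from the original post, and the sketch assumes the PHP cURL extension is loaded.

```php
<?php
// Hypothetical helper (not part of the original post): returns the same
// option set used in getThePage(), with a caller-supplied User-Agent.
function buildCurlOptions($userAgent)
{
    return array(
        CURLOPT_RETURNTRANSFER => true,       // return web page
        CURLOPT_HEADER         => false,      // don't return headers
        CURLOPT_FOLLOWLOCATION => true,       // follow redirects
        CURLOPT_ENCODING       => "",         // handle all encodings
        CURLOPT_USERAGENT      => $userAgent, // who am i
        CURLOPT_AUTOREFERER    => true,       // set referer on redirect
        CURLOPT_CONNECTTIMEOUT => 120,        // timeout on connect
        CURLOPT_TIMEOUT        => 120,        // timeout on response
        CURLOPT_MAXREDIRS      => 10,         // stop after 10 redirects
    );
}

// The Firefox 3.0 string suggested in this reply; a more recent browser
// string could be substituted.
$browserUserAgent = 'Mozilla/5.0 (Windows; U; Windows NT 5.1; pl; rv:1.9) Gecko/2008052906 Firefox/3.0';
$options = buildCurlOptions($browserUserAgent);
```

The resulting `$options` array can be passed to `curl_setopt_array($ch, $options)` in the same way as in `getThePage()`. Whether a given site accepts the request still depends on that site's own rules.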



It's also important to note that hitting websites such as Yahoo, Google, or Bing too often (multiple requests within a couple of seconds) can get your IP banned from accessing their services. Furthermore, if you are on a shared IP, your chances of being blocked are higher, since other users of that IP may also be making cURL connections to the same site within a short amount of time.
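One way to avoid sending several requests within a couple of seconds is to pause between them. The following is a minimal sketch only: the helper name `fetchPolitely`, its parameters, and the 2-second default delay are assumptions made for illustration, not limits published by any of these sites.

```php
<?php
// Hypothetical helper: calls $fetch once per URL and sleeps between
// consecutive requests so they are not sent back to back.
function fetchPolitely(array $urls, callable $fetch, int $delaySeconds = 2): array
{
    $urls    = array_values($urls); // re-index so $i counts from 0
    $last    = count($urls) - 1;
    $results = array();

    foreach ($urls as $i => $url) {
        $results[$url] = $fetch($url);
        if ($i < $last) {
            sleep($delaySeconds);   // assumed delay; tune for the target site
        }
    }
    return $results;
}
```

With the function from the question, a call such as `fetchPolitely(array('http://dir.yahoo.com', 'http://www.cemunnar.org'), 'getThePage')` would fetch each page in turn, waiting two seconds between requests. On a shared IP this only limits your own traffic, not that of other users.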


hielo, your code doesn't work.

I respectfully disagree. It works fine for me. Are you perhaps sending too many requests to their network in a short period of time, as FlashCreations stated?
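To tell a failed transfer apart from a refusal by the server, it may help to look at the array that `getThePage()` returns: it carries `errno` and `errmsg` plus the `http_code` entry from `curl_getinfo()`. The helper below is a hypothetical sketch (the name `describeResult` and the message wording are made up here), and treating every status of 400 or above as a refusal or server failure is a simplifying assumption.

```php
<?php
// Hypothetical helper: summarizes the array returned by getThePage().
// Expects the keys 'errno', 'errmsg' and 'http_code'.
function describeResult(array $result): string
{
    if ($result['errno'] !== 0) {
        // The transfer itself failed (DNS, timeout, too many redirects, ...)
        return "cURL error {$result['errno']}: {$result['errmsg']}";
    }
    if ($result['http_code'] >= 400) {
        // The server answered, but with an error status
        return "HTTP {$result['http_code']}: the server refused or failed the request";
    }
    return "HTTP {$result['http_code']}: OK";
}
```

For example, `echo describeResult(getThePage('http://dir.yahoo.com'));` prints a one-line summary that can be compared between a run with the "spider" User-Agent and a run with a browser string.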
