Why doesn't the cURL function in PHP work for some sites like dir.yahoo.com?

function getThePage($page_url)
{
    $options = array(
        CURLOPT_RETURNTRANSFER => true,     // return web page
        CURLOPT_HEADER         => false,    // don't return headers
        CURLOPT_FOLLOWLOCATION => true,     // follow redirects
        CURLOPT_ENCODING       => "",       // handle all encodings
        CURLOPT_USERAGENT      => "spider", // who am i
        CURLOPT_AUTOREFERER    => true,     // set referer on redirect
        CURLOPT_CONNECTTIMEOUT => 120,      // timeout on connect
        CURLOPT_TIMEOUT        => 120,      // timeout on response
        CURLOPT_MAXREDIRS      => 10,       // stop after 10 redirects
    );

    $ch = curl_init($page_url) or die("Cannot initialize");
    curl_setopt_array($ch, $options) or die("Cannot set options");

    // Compare against false explicitly: an empty response body is not a failure.
    $content = curl_exec($ch);
    if ($content === false) {
        die("Cannot execute: " . curl_error($ch));
    }

    return $content;
}



I used this code, but I can't open some sites, such as http://dir.yahoo.com and www.cemunnar.org.

All 4 Replies

Instead of CURLOPT_USERAGENT => "spider", // who am i
try providing a value that identifies a popular browser. For example, try this instead:
CURLOPT_USERAGENT => 'Mozilla/5.0 (Windows; U; Windows NT 5.1; pl; rv:1.9) Gecko/2008052906 Firefox/3.0', // who am i
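To make the suggestion concrete, here is a minimal sketch of the question's options with only the user agent swapped for the browser-like string from this reply. The function names buildBrowserLikeOptions and fetchWithBrowserAgent are made up for illustration, and whether a given site accepts this user agent is an assumption, not something the sketch can guarantee.

```php
<?php
// Hypothetical helper: the same options as in the question, but with a
// browser-like User-Agent instead of "spider".
function buildBrowserLikeOptions($userAgent = 'Mozilla/5.0 (Windows; U; Windows NT 5.1; pl; rv:1.9) Gecko/2008052906 Firefox/3.0')
{
    return array(
        CURLOPT_RETURNTRANSFER => true,       // return the page as a string
        CURLOPT_HEADER         => false,      // don't include response headers
        CURLOPT_FOLLOWLOCATION => true,       // follow redirects
        CURLOPT_ENCODING       => "",         // accept all supported encodings
        CURLOPT_USERAGENT      => $userAgent, // identify as a common browser
        CURLOPT_AUTOREFERER    => true,       // set referer on redirect
        CURLOPT_CONNECTTIMEOUT => 120,        // timeout on connect (seconds)
        CURLOPT_TIMEOUT        => 120,        // timeout on the whole transfer (seconds)
        CURLOPT_MAXREDIRS      => 10,         // stop after 10 redirects
    );
}

// Hypothetical wrapper: fetches a URL and throws instead of calling die().
function fetchWithBrowserAgent($url)
{
    $ch = curl_init($url);
    if ($ch === false) {
        throw new RuntimeException('Cannot initialize cURL');
    }
    if (!curl_setopt_array($ch, buildBrowserLikeOptions())) {
        throw new RuntimeException('Cannot set cURL options');
    }
    $content = curl_exec($ch);
    if ($content === false) {
        throw new RuntimeException('cURL error: ' . curl_error($ch));
    }
    return $content;
}
```

Calling fetchWithBrowserAgent('http://dir.yahoo.com') would then return the page body as a string, or throw with the cURL error message if the transfer fails.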

It's also important to note that overusing websites such as Yahoo, Google, or Bing (making multiple requests within a couple of seconds) can get your IP banned from accessing their services. Furthermore, if you are on a shared IP, your chances of being blocked are higher, since other users on that IP may be making cURL connections to the same site in the same short period.
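One way to avoid hammering a site is to space requests out. The sketch below is only an illustration: the helper names secondsToWait and politeFetchAll are hypothetical, and the default 2-second interval is an assumed value, not a limit published by any of these services.

```php
<?php
// Hypothetical helper: how long to pause so that at least $minInterval
// seconds separate two consecutive requests.
function secondsToWait($lastRequestAt, $now, $minInterval)
{
    return max(0.0, $minInterval - ($now - $lastRequestAt));
}

// Hypothetical helper: fetches each URL in turn, sleeping between requests.
// $fetch is any callable that takes a URL and returns the page content.
function politeFetchAll(array $urls, $fetch, $minInterval = 2.0)
{
    $results = array();
    $lastRequestAt = null;
    foreach ($urls as $url) {
        if ($lastRequestAt !== null) {
            $wait = secondsToWait($lastRequestAt, microtime(true), $minInterval);
            if ($wait > 0) {
                usleep((int) round($wait * 1000000)); // usleep takes microseconds
            }
        }
        $lastRequestAt = microtime(true);
        $results[$url] = call_user_func($fetch, $url);
    }
    return $results;
}
```

For example, politeFetchAll($urls, 'getThePage') would call the question's function once per URL with a pause in between. This only throttles a single script; it does nothing about other users on a shared IP.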

hielo, your code doesn't work.

I respectfully disagree. It works fine for me. Are you perhaps sending too many requests in a short period of time to their network, as FlashCreations stated?
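When the same code works for one person and not another, it helps to see what cURL actually reports rather than a bare "Cannot execute". The following is a rough diagnostic sketch; describeCurlResult and diagnoseUrl are hypothetical names, and reading 403 or 429 as a sign of blocking or rate limiting is an assumption about how a server might respond, not documented behaviour of any particular site.

```php
<?php
// Hypothetical helper: turns cURL's error number, error text, and the HTTP
// status code into a short human-readable description.
function describeCurlResult($errno, $error, $httpCode)
{
    if ($errno !== 0) {
        return "transport error ({$errno}): {$error}";
    }
    if ($httpCode === 403 || $httpCode === 429) {
        return "HTTP {$httpCode}: request refused, possibly blocked or rate limited";
    }
    if ($httpCode >= 400) {
        return "HTTP {$httpCode}: the server returned an error";
    }
    return "HTTP {$httpCode}: request completed";
}

// Hypothetical helper: performs one request and describes the outcome.
function diagnoseUrl($url, array $options)
{
    $ch = curl_init($url);
    if ($ch === false) {
        return 'cannot initialize cURL';
    }
    curl_setopt_array($ch, $options);
    curl_exec($ch);
    return describeCurlResult(
        curl_errno($ch),
        curl_error($ch),
        (int) curl_getinfo($ch, CURLINFO_HTTP_CODE)
    );
}
```

Running diagnoseUrl on a failing URL with the same $options array as in the question should show whether the failure is a connection problem or a refusal by the server.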
