Looping domain availability check - cURL soooo slow!

SuPrAiCeR69 (Newbie Poster, 20 posts since Dec 2008)

I have an example below from the registrar, and when implemented in a loop it takes between 500ms and 1s between checks.
I need it to run as quickly as possible, whether through cURL or otherwise. cURL seems to take forever between checks.

I only need the script to check one domain.

1. Is there a way to loop a request of $url and return the output for each result much quicker than 500ms per check? An alternative to cURL?
2. Or, still using cURL, is there a way to check domain availability without the lag between checks?

I can post the variable $domainslist (an array with a limit of 20 domains :( , or in my case, 20 copies of the same domain) and it will exec cURL, check availability of each and output with only 0.00022s average lag between checks. But when I loop the cURL exec, it runs the above array of 20 domains at lightning speed, lags 5-8 seconds, then executes the next 20 at lightning speed again.

3. Is there some way to stop that 5-8 second lag between each batch? I assume not, since cURL has to reconnect each time..

Any help checking availability in minimal time, no matter which method, would be very much appreciated! Thanks!

Code without loop

<?php

    function GetCurlPage ($pageSpec)
    {
      $ch = curl_init($pageSpec);
      curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
      curl_setopt($ch, CURLOPT_SSL_VERIFYHOST, FALSE);
      curl_setopt($ch, CURLOPT_SSL_VERIFYPEER, FALSE);
      $tmp = curl_exec ($ch);
      curl_close ($ch);
      $tmp = preg_replace('/(?s)<meta http-equiv="Expires"[^>]*>/i', '', $tmp);
      $tmp = explode('<br>', $tmp);
      echo $tmp[0];
      echo "<br>";
      echo $tmp[1];
      echo "<br>";
      return $tmp;
    }

$returnUrl = "http://www.mysite.com/check.php";
$url = "https://www.apisite.com/availability/check.php?domain=testdomain&suffixes=.com";
$output = GetCurlPage("$url");

?>
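One note on the reconnect lag in question 3: keeping a single connection open and reusing it for every check avoids paying the connection-setup cost each time. A minimal sketch of the idea in Python (assumption: a throwaway local server stands in for the registrar's API host, purely so the example is self-contained and runnable):

```python
# Sketch: reuse one persistent HTTP/1.1 connection for repeated checks
# instead of opening a new connection per request. The local server below
# is a stand-in for the real API host, used only for illustration.
import http.client
import http.server
import threading

class Handler(http.server.SimpleHTTPRequestHandler):
    protocol_version = "HTTP/1.1"   # keep-alive, so the socket can be reused
    def log_message(self, *args):   # silence per-request logging
        pass

server = http.server.ThreadingHTTPServer(("127.0.0.1", 0), Handler)
threading.Thread(target=server.serve_forever, daemon=True).start()
host, port = server.server_address

conn = http.client.HTTPConnection(host, port, timeout=5)
statuses = []
for _ in range(5):              # five checks over one TCP connection
    conn.request("HEAD", "/")
    resp = conn.getresponse()
    resp.read()                 # drain the (empty) body so reuse is allowed
    statuses.append(resp.status)
conn.close()
server.shutdown()
print(statuses)
```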
SuPrAiCeR69

Bump?

dos_killer (Junior Poster in Training, 50 posts since Dec 2010)

if i were you... i would use some programming language like python or java to use threads and do the job quicker... threads are by far the best means of solving this issue that i can think of... your program running threads could return the result to your php file...
im no professional... but this is what i'd do...

SuPrAiCeR69

Example please?
Thanks

mschroeder (Team Colleague, 655 posts since Jul 2008)

There are really two common ways to do this.
First, using a socket:

<?php
$fp = fsockopen("www.daniweb.com", 80, $errno, $errstr, 5);
if (!$fp) {
    echo "$errstr ($errno)<br />\n";
} else {
    $out = "HEAD / HTTP/1.1\r\n";
    $out .= "Host: www.daniweb.com\r\n";
    $out .= "Connection: Close\r\n\r\n";
    fwrite($fp, $out);
    while (!feof($fp)) {
        echo fgets($fp);
    }
    fclose($fp);
}

Second, using cURL:

<?php
$ch = curl_init();
curl_setopt ($ch, CURLOPT_RETURNTRANSFER, 1);
curl_setopt ($ch, CURLOPT_URL, 'http://www.daniweb.com');
curl_setopt ($ch, CURLOPT_CONNECTTIMEOUT, 20);

// Only calling the head
curl_setopt($ch, CURLOPT_HEADER, true); // header will be at output
curl_setopt($ch, CURLOPT_NOBODY, true);

$content = curl_exec ($ch);
curl_close ($ch);

Both of those only make a HEAD request, so they don't actually load the page. This is the same way short-URL resolvers work, like the ones on Twitter. The only data returned should be headers like this:

HTTP/1.1 200 OK
Date: Tue, 21 Dec 2010 18:42:34 GMT
Server: Apache/2.2
X-Powered-By: PHP/5.1.6
Set-Cookie: bblastvisit=1292956954; expires=Wed, 21-Dec-2011 18:42:34 GMT; path=/; domain=.daniweb.com
Set-Cookie: bblastactivity=0; expires=Wed, 21-Dec-2011 18:42:34 GMT; path=/; domain=.daniweb.com
Cache-Control: private
Pragma: private
X-UA-Compatible: IE=7
Vary: Accept-Encoding
Content-Type: text/html; charset=utf-8

If you don't need the response code, a simple check like this will do:

$fp = fsockopen("www.daniweb.com", 80, $errno, $errstr, 5);
if (!$fp) {
    echo "$errstr ($errno)<br />\n";
} else {
    fclose($fp);
}

I checked both of the first ones and they average about 1s; the last check averages 0.002s to complete. But it will not give you the response code or any availability data.

There is also the ability to run multiple curl commands in parallel: http://www.php.net/manual/en/function.curl-multi-exec.php

dos_killer

its difficult to find an example... and a lil long to write the code myself... but if you want to use python... you will have to make use of the threading library and urllib for connecting to files on a server... (study up on both of them... wont take more than a day to learn them and get started)...
else you can wait for better answers i guess... but i dont think anything else would make it faster than threading...
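A rough sketch of that threading + urllib approach, with the caveat that it targets a throwaway local server purely so the example runs anywhere; the real script would hit the availability URL instead:

```python
# dos_killer's suggestion sketched out: run the checks from worker threads
# so they overlap instead of running back to back. The local server is a
# stand-in for the real availability URL, used only for illustration.
import http.server
import threading
import urllib.request

class Handler(http.server.SimpleHTTPRequestHandler):
    protocol_version = "HTTP/1.1"
    def log_message(self, *args):   # keep the demo quiet
        pass

server = http.server.ThreadingHTTPServer(("127.0.0.1", 0), Handler)
threading.Thread(target=server.serve_forever, daemon=True).start()
url = "http://%s:%d/" % server.server_address

results = []
lock = threading.Lock()

def check(n):
    # One availability check; the real version would parse the API response.
    with urllib.request.urlopen(url, timeout=5) as resp:
        status = resp.status
    with lock:                      # guard the shared results list
        results.append((n, status))

threads = [threading.Thread(target=check, args=(i,)) for i in range(10)]
for t in threads:
    t.start()
for t in threads:
    t.join()
server.shutdown()
print(len(results))
```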

SuPrAiCeR69

0.002s would be perfect! But I need the availability response :( Is that what is slowing it down to 1s?
How else can I get an available/unavailable response without taking much more than 0.002s?

Thank you!

mschroeder

The problem is, even though you only need the headers, a lot of the time the web server will step in, execute the request as a GET and then return just the HEAD of the response. I'm not sure if that is still the problem, but I did work on a script that checked pages and availability across many servers, and the best solution I found was batch-executing about 10 cURL requests in parallel using the multi-cURL functions. You wait until all requests complete to get a result, but essentially, if you send 10 and each one takes about 1s on average, the whole batch takes maybe 2s to complete all 10.
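That wall-clock claim is easy to sanity-check without any network at all; in this sketch a time.sleep stands in for the HTTP round trip (an assumption, purely for illustration):

```python
# Rough illustration of the batching arithmetic: ten checks that each "take"
# ~0.2s finish in roughly the time of the slowest one when run in parallel,
# not 10 x 0.2s. time.sleep is a stand-in for the network round trip.
import threading
import time

def fake_check(delay):
    time.sleep(delay)   # pretend this is one HTTP request

start = time.monotonic()
threads = [threading.Thread(target=fake_check, args=(0.2,)) for _ in range(10)]
for t in threads:
    t.start()
for t in threads:
    t.join()
elapsed = time.monotonic() - start
print(round(elapsed, 1))   # ~0.2, not 2.0
```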

dos_killer

wow i had no idea of curl_multi_exec()
thanks mschroeder

SuPrAiCeR69

Ok multi curl in parallel may be the only way then :( do you have sample code to assist? Much appreciated!

mschroeder

Something like this will work for you. It will only return the headers as results.

$nodes = array('http://www.google.com', 'http://www.daniweb.com', 'http://www.yahoo.com');
$curl_arr = array();
$master = curl_multi_init();

for ($i = 0, $count = count($nodes); $i < $count; $i++)
{
	$curl_arr[$i] = curl_init();
	curl_setopt($curl_arr[$i], CURLOPT_RETURNTRANSFER, true);
	curl_setopt($curl_arr[$i], CURLOPT_URL, $nodes[$i]);
	curl_setopt($curl_arr[$i], CURLOPT_CONNECTTIMEOUT, 20);
	curl_setopt($curl_arr[$i], CURLOPT_NOBODY, true);   // HEAD only
	curl_setopt($curl_arr[$i], CURLOPT_HEADER, true);   // include headers in the output

	curl_multi_add_handle($master, $curl_arr[$i]);
}

// Drive all the handles until every request has completed
do {
	curl_multi_exec($master, $running);
} while ($running > 0);

for ($i = 0; $i < $count; $i++)
{
	$results = curl_multi_getcontent($curl_arr[$i]);
	var_dump($results);
	curl_multi_remove_handle($master, $curl_arr[$i]);
}
curl_multi_close($master);
SuPrAiCeR69

Great! As you said, it returns only the header, ie:

string(130) "HTTP/1.1 200 OK Date: Wed, 22 Dec 2010 06:15:02 GMT Server: Apache X-Powered-By: PHP/4.4.4-8+etch6 Content-Type: text/html " string(130) "HTTP/1.1 200 OK Date: Wed, 22 Dec 2010 06:15:02 GMT Server: Apache X-Powered-By: PHP/4.4.4-8+etch6 Content-Type: text/html " string(130) "HTTP/1.1 200 OK Date: Wed, 22 Dec 2010 06:15:02 GMT Server: Apache X-Powered-By: PHP/4.4.4-8+etch6 Content-Type: text/html "

Is string(130) the results?

How can I get it to output the results of the response (available / not available)?
Looking great so far, thanks!

pritaeas (Moderator, 11,316 posts since Jul 2006)

Try an existing domain and a non-existent domain, and see if the headers show different results. Compare more than once to confirm your finding.

mschroeder

If you had the Xdebug PHP extension installed (which is what I use on my development machine, hence why I use var_dump a lot more than print), you would see that those headers are actually a series of lines separated by newline characters. If you view source on your output, you will probably see the newlines in there.

string 'HTTP/1.1 200 OK
Date: Wed, 22 Dec 2010 13:43:24 GMT
Server: Apache/2.2
X-Powered-By: PHP/5.1.6
Set-Cookie: bblastvisit=1293025404; expires=Thu, 22-Dec-2011 13:43:24 GMT; path=/; domain=.daniweb.com
Set-Cookie: bblastactivity=0; expires=Thu, 22-Dec-2011 13:43:24 GMT; path=/; domain=.daniweb.com
Cache-Control: private
Pragma: private
X-UA-Compatible: IE=7
Vary: Accept-Encoding
Content-Type: text/html; charset=utf-8

' (length=430)

So knowing that, you essentially want to get the first line of the response. This can be done with explode, a substring, etc., whichever you prefer. But you want to parse out the first line, "HTTP/1.1 200 OK", for the response code, 200 in this case. Here is a good breakdown of the common ones: http://www.w3.org/Protocols/rfc2616/rfc2616-sec10.html

How you choose to get the response code from that string is up to you; consider regular expressions, exploding it on the spaces, or using strpos to find the first space and then taking a substr of the next three characters, etc.

If you need more help post up what you're attempting and we'll go from there.
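For illustration, here are both of those parsing approaches applied to a sample status line (Python string operations standing in for the PHP explode / strpos / substr calls described above):

```python
# Pull the numeric response code out of the first header line, two ways,
# mirroring the explode-by-spaces and strpos/substr suggestions above.
headers = "HTTP/1.1 200 OK\r\nDate: Tue, 21 Dec 2010 18:42:34 GMT\r\nServer: Apache/2.2\r\n\r\n"

status_line = headers.split("\n", 1)[0].strip()     # "HTTP/1.1 200 OK"

code_via_split = status_line.split(" ")[1]          # split on spaces, take field 2

space = status_line.find(" ")                       # find the first space...
code_via_substr = status_line[space + 1:space + 4]  # ...take the next three chars

print(code_via_split, code_via_substr)              # 200 200
```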

SuPrAiCeR69

Ok great, thanks. I had a look at the source and noticed exactly what you said: lines separated by \n. I exploded $results and returned the first line.

I'm not actually looking for a response code, as it returns a 200 success code whether the domain is available or not. The API sends a return output of "available:<br>not available:<br>failed query:", and each domain checked sits within one of those. That is what I need :)

Updated code:

<?php

	function udate($format, $utimestamp = null) {
	  if ($utimestamp === null)
		$utimestamp = microtime(true);
	  $timestamp = floor($utimestamp);
	  // 'u' in a date() format is microseconds, not milliseconds
	  $microseconds = round(($utimestamp - $timestamp) * 1000000);
	  return date(preg_replace('`(?<!\\\\)u`', $microseconds, $format), $timestamp);
	}
	
$nodes = array('http://www.google.com', 'http://www.daniweb.com', 'http://www.yahoo.com');
$curl_arr = array();
$master = curl_multi_init();

for($i = 0, $count=count($nodes); $i < $count; $i++)
{
	$url =$nodes[$i];
	$curl_arr[$i] = curl_init();
	curl_setopt($curl_arr[$i], CURLOPT_RETURNTRANSFER, true);
	curl_setopt($curl_arr[$i], CURLOPT_URL, $nodes[$i] );
	curl_setopt($curl_arr[$i], CURLOPT_CONNECTTIMEOUT, 20);
	curl_setopt($curl_arr[$i], CURLOPT_NOBODY, true);
	curl_setopt($curl_arr[$i], CURLOPT_HEADER, true);
	
	curl_multi_add_handle($master, $curl_arr[$i]);
}

do {
    curl_multi_exec($master,$running);
} while($running > 0);

for($i = 0; $i < $count; $i++)
{
	$results = curl_multi_getcontent  ( $curl_arr[$i]  );
	$results = explode("\n", $results);
	  echo $results[0];
	  echo "<br>";
	  echo udate('H:i:s:u');
	  echo "<br><br>";
}
$end = microtime(true);

?>

Output:
HTTP/1.1 200 OK
01:28:30:633099

HTTP/1.1 200 OK
01:28:30:633423

HTTP/1.1 200 OK
01:28:30:633493


Thanks again mschroeder!

SuPrAiCeR69

Ok I've got this running :D

Last thing I need to do is this..

See how I have entered 3 different URLs in the array? I would like to check only one URL for availability, but say 100 times. Rather than typing the URL out 100 times in the array, ie: $nodes = array('url1.com', 'url2.com', 'url3.com', 'url4.com', ..... ); is there another way to do this, so it dynamically adds 100 copies of that one URL to the array on execution?


Thanks!


Current code:

<?php

	function udate($format, $utimestamp = null) {
	  if ($utimestamp === null)
		$utimestamp = microtime(true);
	  $timestamp = floor($utimestamp);
	  // 'u' in a date() format is microseconds, not milliseconds
	  $microseconds = round(($utimestamp - $timestamp) * 1000000);
	  return date(preg_replace('`(?<!\\\\)u`', $microseconds, $format), $timestamp);
	}
	
$nodes = array('http://www.google.com', 'http://www.daniweb.com', 'http://www.testavailabledomain.com');
$curl_arr = array();
$master = curl_multi_init();

for($i = 0, $count=count($nodes); $i < $count; $i++)
{
	$url =$nodes[$i];
	$curl_arr[$i] = curl_init();
	curl_setopt($curl_arr[$i], CURLOPT_URL, $nodes[$i] );
	curl_setopt($curl_arr[$i], CURLOPT_RETURNTRANSFER, 1);
	curl_setopt($curl_arr[$i], CURLOPT_SSL_VERIFYHOST, FALSE);
	curl_setopt($curl_arr[$i], CURLOPT_SSL_VERIFYPEER, FALSE);
	
	curl_multi_add_handle($master, $curl_arr[$i]);
}

do {
    curl_multi_exec($master,$running);
} while($running > 0);

for($i = 0; $i < $count; $i++)
{
	$results = curl_multi_getcontent  ( $curl_arr[$i]  );
	$results = explode("<br>", $results);
	  echo $results[0];
	  echo "<br>";
	  echo $results[1];
	  echo "<br>";
	  echo udate('H:i:s:u');
	  echo "<br><br>";
}

?>

Output:

available:
not available: www.google.com
04:14:09:153321

available:
not available: www.daniweb.com
04:14:09:153639

available: www.testavailabledomain.com
not available:
04:14:09:153720

mschroeder

$url = 'http://www.website.com';
$curl_arr = array();
$master = curl_multi_init();

for($i=0; $i<100; $i++){

  //Add curl to array like before but use $url instead of $nodes[$i]
}

....

for($i = 0; $i<100; $i++)
{
	$results = curl_multi_getcontent  ( $curl_arr[$i]  );
	$results = explode("<br>", $results);
	  echo $results[0];
	  echo "<br>";
	  echo $results[1];
	  echo "<br>";
	  echo udate('H:i:s:u');
	  echo "<br><br>";
}

Something like that should give you the general idea.
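For the record, PHP also has a built-in for building that list: array_fill(0, 100, $url) returns an array of 100 copies of one value. The same idea sketched in Python, just to show the shape (the domain is the placeholder from earlier in the thread):

```python
# Build a list of N copies of one URL dynamically instead of typing it out.
# (In PHP this would be: $nodes = array_fill(0, 100, $url);)
url = "http://www.testavailabledomain.com"   # placeholder domain from the thread
nodes = [url] * 100                          # 100 copies, built on execution

print(len(nodes), nodes[0] == nodes[-1])     # 100 True
```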

SuPrAiCeR69

Errors with;

Fatal error: Maximum execution time of 30 seconds exceeded in /home/../.../../file.php on line 27

Warning: (null)(): 2 is not a valid cURL handle resource in Unknown on line 0
Warning: (null)(): 3 is not a valid cURL handle resource in Unknown on line 0
Warning: (null)(): 4 is not a valid cURL handle resource in Unknown on line 0
Warning: (null)(): 5 is not a valid cURL handle resource in Unknown on line 0
Warning: (null)(): 6 is not a valid cURL handle resource in Unknown on line 0

etc etc


Is there also a way to output each check as it completes, rather than loading the URL 100 times and outputting everything at once?

<?php

	function udate($format, $utimestamp = null) {
	  if ($utimestamp === null)
		$utimestamp = microtime(true);
	  $timestamp = floor($utimestamp);
	  // 'u' in a date() format is microseconds, not milliseconds
	  $microseconds = round(($utimestamp - $timestamp) * 1000000);
	  return date(preg_replace('`(?<!\\\\)u`', $microseconds, $format), $timestamp);
	}
	
$url = 'http://www.domain.com';
$curl_arr = array();
$master = curl_multi_init();

for($i=0; $i<100; $i++)
{
	$curl_arr[$i] = curl_init();
	curl_setopt($curl_arr[$i], CURLOPT_URL, $url );
	curl_setopt($curl_arr[$i], CURLOPT_RETURNTRANSFER, 1);
	curl_setopt($curl_arr[$i], CURLOPT_SSL_VERIFYHOST, FALSE);
	curl_setopt($curl_arr[$i], CURLOPT_SSL_VERIFYPEER, FALSE);
	
	curl_multi_add_handle($master, $curl_arr[$i]);
}

do {
    curl_multi_exec($master,$running);
} while($running > 0);

for($i=0; $i<100; $i++)
{
	$results = curl_multi_getcontent ($curl_arr[$i]);
	$results = explode("<br>", $results);
	  echo $results[0];
	  echo "<br>";
	  echo $results[1];
	  echo "<br>";
	  echo udate('H:i:s:u');
	  echo "<br><br>";
}
?>

Thank you again so much!

mschroeder

$url isn't an array anymore

SuPrAiCeR69

$url isn't an array anymore

Oops! Ok.. fixed that.
See above post for errors and updated code.
