Looping domain availability check - cURL soooo slow!

SuPrAiCeR69 (Newbie Poster, 20 posts since Dec 2008)

I have an example below from the registrar, and when implemented in a loop it takes between 500ms and 1s between checks.
I need it to run as quickly as possible, whether through cURL or otherwise. cURL seems to take forever between checks.

I only need the script to check one domain.

1. Is there a way to loop a request of $url and return the output for each result much quicker than 500ms per check? An alternative to cURL?
2. Or, still using cURL, is there a way to check domain availability without the lag between checks?

I can post the variable $domainslist (an array with a limit of 20 domains :( , or in my case, 20 copies of the same domain) and it will exec cURL, check availability of each and output with only 0.00022s average lag between checks. But when I loop the cURL exec, it runs the above array of 20 domains at lightning speed, lags 5-8 seconds, then executes the next 20 at lightning speed again.

3. Is there some way to stop that 5-8 second lag between each batch? I assume not, since cURL has to reconnect each time..

Any help checking availability in minimal time, no matter which method, would be very much appreciated! Thanks!

Code without loop

<?php

    function GetCurlPage ($pageSpec)
    {
      $ch = curl_init($pageSpec);
      curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
      curl_setopt($ch, CURLOPT_SSL_VERIFYHOST, FALSE);
      curl_setopt($ch, CURLOPT_SSL_VERIFYPEER, FALSE);
      $tmp = curl_exec ($ch);
      curl_close ($ch);
      $tmp = preg_replace('/(?s)<meta http-equiv="Expires"[^>]*>/i', '', $tmp);
      $tmp = explode('<br>', $tmp);
      echo $tmp[0];
      echo "<br>";
      echo $tmp[1];
      echo "<br>";
      return $tmp;
    }

$returnUrl = "http://www.mysite.com/check.php";
$url = "https://www.apisite.com/availability/check.php?domain=testdomain&suffixes=.com";
$output = GetCurlPage("$url");

?>
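One note on the reconnect lag in question 3: keeping a single connection open and reusing it for every check avoids paying the connection-setup cost each time. A minimal sketch of the idea in Python (assumption: a throwaway local server stands in for the registrar's API host, purely so the example is self-contained and runnable):

```python
# Sketch: reuse one persistent HTTP/1.1 connection for repeated checks
# instead of opening a new connection per request. The local server below
# is a stand-in for the real API host, used only for illustration.
import http.client
import http.server
import threading

class Handler(http.server.SimpleHTTPRequestHandler):
    protocol_version = "HTTP/1.1"   # keep-alive, so the socket can be reused
    def log_message(self, *args):   # silence per-request logging
        pass

server = http.server.ThreadingHTTPServer(("127.0.0.1", 0), Handler)
threading.Thread(target=server.serve_forever, daemon=True).start()
host, port = server.server_address

conn = http.client.HTTPConnection(host, port, timeout=5)
statuses = []
for _ in range(5):              # five checks over one TCP connection
    conn.request("HEAD", "/")
    resp = conn.getresponse()
    resp.read()                 # drain the (empty) body so reuse is allowed
    statuses.append(resp.status)
conn.close()
server.shutdown()
print(statuses)
```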
SuPrAiCeR69

Bump?

dos_killer (Junior Poster in Training, 50 posts since Dec 2010)

if i were you... i would use some programming language like python or java to use threads and do the job quicker... threads are by far the best means of solving this issue that i can think of... your program running threads could return the result to your php file...
im no professional... but this is what i'd do...

SuPrAiCeR69

Example please?
Thanks

mschroeder (Team Colleague, 655 posts since Jul 2008)

There are really two common ways to do this.
First, using a socket:

<?php
$fp = fsockopen("www.daniweb.com", 80, $errno, $errstr, 5);
if (!$fp) {
    echo "$errstr ($errno)<br />\n";
} else {
    $out = "HEAD / HTTP/1.1\r\n";
    $out .= "Host: www.daniweb.com\r\n";
    $out .= "Connection: Close\r\n\r\n";
    fwrite($fp, $out);
    while (!feof($fp)) {
        echo fgets($fp);
    }
    fclose($fp);
}

Second, using cURL:

<?php
$ch = curl_init();
curl_setopt ($ch, CURLOPT_RETURNTRANSFER, 1);
curl_setopt ($ch, CURLOPT_URL, 'http://www.daniweb.com');
curl_setopt ($ch, CURLOPT_CONNECTTIMEOUT, 20);

// Only calling the head
curl_setopt($ch, CURLOPT_HEADER, true); // header will be at output
curl_setopt($ch, CURLOPT_NOBODY, true);

$content = curl_exec ($ch);
curl_close ($ch);

Both of those only make a HEAD request, so they don't actually load the page. This is the same way short-URL resolvers work, like the ones on Twitter. The only data returned should be headers like this:

HTTP/1.1 200 OK
Date: Tue, 21 Dec 2010 18:42:34 GMT
Server: Apache/2.2
X-Powered-By: PHP/5.1.6
Set-Cookie: bblastvisit=1292956954; expires=Wed, 21-Dec-2011 18:42:34 GMT; path=/; domain=.daniweb.com
Set-Cookie: bblastactivity=0; expires=Wed, 21-Dec-2011 18:42:34 GMT; path=/; domain=.daniweb.com
Cache-Control: private
Pragma: private
X-UA-Compatible: IE=7
Vary: Accept-Encoding
Content-Type: text/html; charset=utf-8

If you don't need the response code, a simple check like this will do:

$fp = fsockopen("www.daniweb.com", 80, $errno, $errstr, 5);
if (!$fp) {
    echo "$errstr ($errno)<br />\n";
} else {
    fclose($fp);
}

I checked both of the first ones and they average about 1s; the last check averages 0.002s to complete. But it will not give you the response code or any availability data.

There is also the ability to run multiple curl commands in parallel: http://www.php.net/manual/en/function.curl-multi-exec.php

dos_killer

its difficult to find an example... and a lil long to write the code myself... but if you want to use python... you will have to make use of the threading library and urllib for connecting to files on a server... (study up on both of them... wont take more than a day to learn them and get started)...
else you can wait for better answers i guess... but i dont think anything else would make it faster than threading...
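A rough sketch of that threading + urllib approach, with the caveat that it targets a throwaway local server purely so the example runs anywhere; the real script would hit the availability URL instead:

```python
# dos_killer's suggestion sketched out: run the checks from worker threads
# so they overlap instead of running back to back. The local server is a
# stand-in for the real availability URL, used only for illustration.
import http.server
import threading
import urllib.request

class Handler(http.server.SimpleHTTPRequestHandler):
    protocol_version = "HTTP/1.1"
    def log_message(self, *args):   # keep the demo quiet
        pass

server = http.server.ThreadingHTTPServer(("127.0.0.1", 0), Handler)
threading.Thread(target=server.serve_forever, daemon=True).start()
url = "http://%s:%d/" % server.server_address

results = []
lock = threading.Lock()

def check(n):
    # One availability check; the real version would parse the API response.
    with urllib.request.urlopen(url, timeout=5) as resp:
        status = resp.status
    with lock:                      # guard the shared results list
        results.append((n, status))

threads = [threading.Thread(target=check, args=(i,)) for i in range(10)]
for t in threads:
    t.start()
for t in threads:
    t.join()
server.shutdown()
print(len(results))
```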

SuPrAiCeR69

0.002s would be perfect! But I need the availability response :( Is that what is slowing it down to 1s?
How else can I get an available/unavailable response without taking much more than 0.002s?

Thank you!

mschroeder

The problem is, even though you only need the headers, a lot of the time the web server will step in, execute the request as a GET and then return just the HEAD of the response. I'm not sure if that is still the problem, but I did work on a script that checked pages and availability across many servers, and the best solution I found was batch-executing about 10 cURL requests in parallel using the multi-cURL functions. You wait until all requests complete to get a result, but essentially, if you send 10 and each one takes about 1s on average, the whole batch takes maybe 2s to complete all 10.
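That wall-clock claim is easy to sanity-check without any network at all; in this sketch a time.sleep stands in for the HTTP round trip (an assumption, purely for illustration):

```python
# Rough illustration of the batching arithmetic: ten checks that each "take"
# ~0.2s finish in roughly the time of the slowest one when run in parallel,
# not 10 x 0.2s. time.sleep is a stand-in for the network round trip.
import threading
import time

def fake_check(delay):
    time.sleep(delay)   # pretend this is one HTTP request

start = time.monotonic()
threads = [threading.Thread(target=fake_check, args=(0.2,)) for _ in range(10)]
for t in threads:
    t.start()
for t in threads:
    t.join()
elapsed = time.monotonic() - start
print(round(elapsed, 1))   # ~0.2, not 2.0
```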

dos_killer

wow i had no idea of curl_multi_exec()
thanks mschroeder

SuPrAiCeR69

Ok multi curl in parallel may be the only way then :( do you have sample code to assist? Much appreciated!

mschroeder

Something like this will work for you. It will only return the headers as results.

$nodes = array('http://www.google.com', 'http://www.daniweb.com', 'http://www.yahoo.com');
$curl_arr = array();
$master = curl_multi_init();

for ($i = 0, $count = count($nodes); $i < $count; $i++)
{
	$curl_arr[$i] = curl_init();
	curl_setopt($curl_arr[$i], CURLOPT_RETURNTRANSFER, true);
	curl_setopt($curl_arr[$i], CURLOPT_URL, $nodes[$i]);
	curl_setopt($curl_arr[$i], CURLOPT_CONNECTTIMEOUT, 20);
	curl_setopt($curl_arr[$i], CURLOPT_NOBODY, true);   // HEAD only
	curl_setopt($curl_arr[$i], CURLOPT_HEADER, true);   // include headers in the output

	curl_multi_add_handle($master, $curl_arr[$i]);
}

// Drive all the handles until every request has completed
do {
	curl_multi_exec($master, $running);
} while ($running > 0);

for ($i = 0; $i < $count; $i++)
{
	$results = curl_multi_getcontent($curl_arr[$i]);
	var_dump($results);
	curl_multi_remove_handle($master, $curl_arr[$i]);
}
curl_multi_close($master);
SuPrAiCeR69

Great! As you said, it returns only the header, ie:

string(130) "HTTP/1.1 200 OK Date: Wed, 22 Dec 2010 06:15:02 GMT Server: Apache X-Powered-By: PHP/4.4.4-8+etch6 Content-Type: text/html " string(130) "HTTP/1.1 200 OK Date: Wed, 22 Dec 2010 06:15:02 GMT Server: Apache X-Powered-By: PHP/4.4.4-8+etch6 Content-Type: text/html " string(130) "HTTP/1.1 200 OK Date: Wed, 22 Dec 2010 06:15:02 GMT Server: Apache X-Powered-By: PHP/4.4.4-8+etch6 Content-Type: text/html "

Is string(130) the results?

How can I get it to output the results of the response (available / not available)?
Looking great so far, thanks!

pritaeas (Moderator, 11,316 posts since Jul 2006)

Try an existing domain and a non-existent domain, and see if the headers show different results. Compare more than once to confirm your finding.

mschroeder

If you had the Xdebug PHP extension installed (which is what I use on my development machine, hence why I use var_dump a lot more than print), you would see that those headers are actually a series of lines separated by newline characters. If you view source on your output, you will probably see the newlines in there.

string 'HTTP/1.1 200 OK
Date: Wed, 22 Dec 2010 13:43:24 GMT
Server: Apache/2.2
X-Powered-By: PHP/5.1.6
Set-Cookie: bblastvisit=1293025404; expires=Thu, 22-Dec-2011 13:43:24 GMT; path=/; domain=.daniweb.com
Set-Cookie: bblastactivity=0; expires=Thu, 22-Dec-2011 13:43:24 GMT; path=/; domain=.daniweb.com
Cache-Control: private
Pragma: private
X-UA-Compatible: IE=7
Vary: Accept-Encoding
Content-Type: text/html; charset=utf-8

' (length=430)

So knowing that, you essentially want to get the first line of the response. This can be done with explode, a substring, etc., whichever you prefer. But you want to parse out the first line, "HTTP/1.1 200 OK", for the response code, 200 in this case. Here is a good breakdown of the common ones: http://www.w3.org/Protocols/rfc2616/rfc2616-sec10.html

How you choose to get the response code from that string is up to you; consider regular expressions, exploding it on the spaces, or using strpos to find the first space and then taking a substr of the next three characters, etc.

If you need more help post up what you're attempting and we'll go from there.
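For illustration, here are both of those parsing approaches applied to a sample status line (Python string operations standing in for the PHP explode / strpos / substr calls described above):

```python
# Pull the numeric response code out of the first header line, two ways,
# mirroring the explode-by-spaces and strpos/substr suggestions above.
headers = "HTTP/1.1 200 OK\r\nDate: Tue, 21 Dec 2010 18:42:34 GMT\r\nServer: Apache/2.2\r\n\r\n"

status_line = headers.split("\n", 1)[0].strip()     # "HTTP/1.1 200 OK"

code_via_split = status_line.split(" ")[1]          # split on spaces, take field 2

space = status_line.find(" ")                       # find the first space...
code_via_substr = status_line[space + 1:space + 4]  # ...take the next three chars

print(code_via_split, code_via_substr)              # 200 200
```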

SuPrAiCeR69

Ok great, thanks. I had a look at the source and noticed exactly what you said: lines separated by \n. I exploded $results and returned the first line.

I'm not actually looking for a response code, as it returns a 200 success code whether the domain is available or not. The API sends a return output of "available:<br>not available:<br>failed query:", and each domain checked sits within one of those. That is what I need :)

Updated code:

<?php

	function udate($format, $utimestamp = null) {
	  if ($utimestamp === null)
		$utimestamp = microtime(true);
	  $timestamp = floor($utimestamp);
	  // 'u' in a date() format is microseconds, not milliseconds
	  $microseconds = round(($utimestamp - $timestamp) * 1000000);
	  return date(preg_replace('`(?<!\\\\)u`', $microseconds, $format), $timestamp);
	}
	
$nodes = array('http://www.google.com', 'http://www.daniweb.com', 'http://www.yahoo.com');
$curl_arr = array();
$master = curl_multi_init();

for($i = 0, $count=count($nodes); $i < $count; $i++)
{
	$url =$nodes[$i];
	$curl_arr[$i] = curl_init();
	curl_setopt($curl_arr[$i], CURLOPT_RETURNTRANSFER, true);
	curl_setopt($curl_arr[$i], CURLOPT_URL, $nodes[$i] );
	curl_setopt($curl_arr[$i], CURLOPT_CONNECTTIMEOUT, 20);
	curl_setopt($curl_arr[$i], CURLOPT_NOBODY, true);
	curl_setopt($curl_arr[$i], CURLOPT_HEADER, true);
	
	curl_multi_add_handle($master, $curl_arr[$i]);
}

do {
    curl_multi_exec($master,$running);
} while($running > 0);

for($i = 0; $i < $count; $i++)
{
	$results = curl_multi_getcontent  ( $curl_arr[$i]  );
	$results = explode("\n", $results);
	  echo $results[0];
	  echo "<br>";
	  echo udate('H:i:s:u');
	  echo "<br><br>";
}
$end = microtime(true);

?>

Output:
HTTP/1.1 200 OK
01:28:30:633099

HTTP/1.1 200 OK
01:28:30:633423

HTTP/1.1 200 OK
01:28:30:633493


Thanks again mschroeder!

SuPrAiCeR69

Ok I've got this running :D

Last thing I need to do is this..

See how I have entered 3 different URLs in the array? I would like to check only one URL for availability, but say 100 times. Rather than typing the URL out 100 times in the array, ie: $nodes = array('url1.com', 'url2.com', 'url3.com', 'url4.com', ..... ); is there another way to do this, so it dynamically adds 100 copies of that one URL to the array on execution?


Thanks!


Current code:

<?php

	function udate($format, $utimestamp = null) {
	  if ($utimestamp === null)
		$utimestamp = microtime(true);
	  $timestamp = floor($utimestamp);
	  // 'u' in a date() format is microseconds, not milliseconds
	  $microseconds = round(($utimestamp - $timestamp) * 1000000);
	  return date(preg_replace('`(?<!\\\\)u`', $microseconds, $format), $timestamp);
	}
	
$nodes = array('http://www.google.com', 'http://www.daniweb.com', 'http://www.testavailabledomain.com');
$curl_arr = array();
$master = curl_multi_init();

for($i = 0, $count=count($nodes); $i < $count; $i++)
{
	$url =$nodes[$i];
	$curl_arr[$i] = curl_init();
	curl_setopt($curl_arr[$i], CURLOPT_URL, $nodes[$i] );
	curl_setopt($curl_arr[$i], CURLOPT_RETURNTRANSFER, 1);
	curl_setopt($curl_arr[$i], CURLOPT_SSL_VERIFYHOST, FALSE);
	curl_setopt($curl_arr[$i], CURLOPT_SSL_VERIFYPEER, FALSE);
	
	curl_multi_add_handle($master, $curl_arr[$i]);
}

do {
    curl_multi_exec($master,$running);
} while($running > 0);

for($i = 0; $i < $count; $i++)
{
	$results = curl_multi_getcontent  ( $curl_arr[$i]  );
	$results = explode("<br>", $results);
	  echo $results[0];
	  echo "<br>";
	  echo $results[1];
	  echo "<br>";
	  echo udate('H:i:s:u');
	  echo "<br><br>";
}

?>

Output:

available:
not available: www.google.com
04:14:09:153321

available:
not available: www.daniweb.com
04:14:09:153639

available: www.testavailabledomain.com
not available:
04:14:09:153720

mschroeder

$url = 'http://www.website.com';
$curl_arr = array();
$master = curl_multi_init();

for($i=0; $i<100; $i++){

  //Add curl to array like before but use $url instead of $nodes[$i]
}

....

for($i = 0; $i<100; $i++)
{
	$results = curl_multi_getcontent  ( $curl_arr[$i]  );
	$results = explode("<br>", $results);
	  echo $results[0];
	  echo "<br>";
	  echo $results[1];
	  echo "<br>";
	  echo udate('H:i:s:u');
	  echo "<br><br>";
}

Something like that should give you the general idea.
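For the record, PHP also has a built-in for building that list: array_fill(0, 100, $url) returns an array of 100 copies of one value. The same idea sketched in Python, just to show the shape (the domain is the placeholder from earlier in the thread):

```python
# Build a list of N copies of one URL dynamically instead of typing it out.
# (In PHP this would be: $nodes = array_fill(0, 100, $url);)
url = "http://www.testavailabledomain.com"   # placeholder domain from the thread
nodes = [url] * 100                          # 100 copies, built on execution

print(len(nodes), nodes[0] == nodes[-1])     # 100 True
```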

SuPrAiCeR69

Errors with;

Fatal error: Maximum execution time of 30 seconds exceeded in /home/../.../../file.php on line 27

Warning: (null)(): 2 is not a valid cURL handle resource in Unknown on line 0
Warning: (null)(): 3 is not a valid cURL handle resource in Unknown on line 0
Warning: (null)(): 4 is not a valid cURL handle resource in Unknown on line 0
Warning: (null)(): 5 is not a valid cURL handle resource in Unknown on line 0
Warning: (null)(): 6 is not a valid cURL handle resource in Unknown on line 0

etc etc


Is there also a way to output each check as it completes, rather than loading the URL 100 times and outputting everything at once?

<?php

	function udate($format, $utimestamp = null) {
	  if ($utimestamp === null)
		$utimestamp = microtime(true);
	  $timestamp = floor($utimestamp);
	  // 'u' in a date() format is microseconds, not milliseconds
	  $microseconds = round(($utimestamp - $timestamp) * 1000000);
	  return date(preg_replace('`(?<!\\\\)u`', $microseconds, $format), $timestamp);
	}
	
$url = 'http://www.domain.com';
$curl_arr = array();
$master = curl_multi_init();

for($i=0; $i<100; $i++)
{
	$curl_arr[$i] = curl_init();
	curl_setopt($curl_arr[$i], CURLOPT_URL, $url );
	curl_setopt($curl_arr[$i], CURLOPT_RETURNTRANSFER, 1);
	curl_setopt($curl_arr[$i], CURLOPT_SSL_VERIFYHOST, FALSE);
	curl_setopt($curl_arr[$i], CURLOPT_SSL_VERIFYPEER, FALSE);
	
	curl_multi_add_handle($master, $curl_arr[$i]);
}

do {
    curl_multi_exec($master,$running);
} while($running > 0);

for($i=0; $i<100; $i++)
{
	$results = curl_multi_getcontent ($curl_arr[$i]);
	$results = explode("<br>", $results);
	  echo $results[0];
	  echo "<br>";
	  echo $results[1];
	  echo "<br>";
	  echo udate('H:i:s:u');
	  echo "<br><br>";
}
?>

Thank you again so much!

mschroeder

$url isn't an array anymore

SuPrAiCeR69

$url isn't an array anymore

Oops! Ok.. fixed that.
See above post for errors and updated code.
