<?php

$url = 'http://www.gamespot.com/pages/search/search_ajax.php?q=grid&type=game&offset=0&tags_only=false&sort=rank';

# regular page:
#$url = 'http://www.gamespot.com/search.html?type=11&stype=all&qs=grid';

function disguise_curl($url)
{
  $curl = curl_init();

  $header = array();
  // The Accept value is one comma-separated list, built in two steps.
  $header[0] = "Accept: text/xml,application/xml,application/xhtml+xml,";
  $header[0] .= "text/html;q=0.9,text/plain;q=0.8,image/png,*/*;q=0.5";
  $header[] = "Cache-Control: max-age=0";
  $header[] = "Connection: keep-alive";
  $header[] = "Keep-Alive: 300";
  $header[] = "Accept-Charset: ISO-8859-1,utf-8;q=0.7,*;q=0.7";
  $header[] = "Accept-Language: en-us,en;q=0.5";
  $header[] = "Pragma: ";

  curl_setopt($curl, CURLOPT_URL, $url);
  curl_setopt($curl, CURLOPT_USERAGENT, 'Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.8.1.1) Gecko/20061204 Firefox/2.0.0.1');
  curl_setopt($curl, CURLOPT_HTTPHEADER, $header);
  curl_setopt($curl, CURLOPT_REFERER, 'http://www.gamespot.com/search.html?type=11&stype=all&qs=grid');
  curl_setopt($curl, CURLOPT_ENCODING, 'gzip,deflate');
  curl_setopt($curl, CURLOPT_AUTOREFERER, true);
  curl_setopt($curl, CURLOPT_RETURNTRANSFER, 1);
  curl_setopt($curl, CURLOPT_TIMEOUT, 10);
  curl_setopt($curl, CURLOPT_COOKIEJAR, 'cookie_gamespot.txt');
  curl_setopt($curl, CURLOPT_COOKIEFILE, 'cookie_gamespot.txt');

  $html = curl_exec($curl);
  curl_close($curl);

  return $html;
}

$text = disguise_curl($url);
echo $text;

?>

Hi, can anyone test this example and tell me how I can get the content of GameSpot's new Ajax search page?

I don't want to steal GameSpot's content. I just need to read the game search results and the game ratings, so that I can later link to each game's profile page on gamespot.com.

Thank you!

Member Avatar for langsor

Of course, if you weren't trying to disguise yourself as a user-agent, you could do this instead.

<?php
$url = 'http://www.google.com';
print file_get_contents( $url ); 
?>

Cheers

Same problem: I get the page without the search results block.

Member Avatar for langsor

I'm not getting the nested body content of the GameSpot page using cURL (probably a relatively linked iframe), but your PHP script appears to be working in general.

If you are trying to figure out how to return these results to a JavaScript function using Ajax, and/or how to pass the $url in via Ajax, then you only need to replace the static $url at the top of the page with something like this...

if ( !empty($_POST['url']) ) {
  $url = $_POST['url'];
  // perform cURL code here
}

...and that should take care of the server-side end of the equation.
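
One caution with taking the URL from $_POST: the script will then fetch whatever address a client sends it. Below is a minimal sketch of a host check to run before the cURL code. The helper name is_allowed_url and the allow-list containing only www.gamespot.com are assumptions for illustration, not something from the original script.

```php
<?php
// Hypothetical helper (not from the thread): accept only http/https URLs
// whose host is on an explicit allow-list.
function is_allowed_url($url, array $allowedHosts)
{
    $parts = parse_url($url);
    if ($parts === false || !isset($parts['scheme'], $parts['host'])) {
        return false;
    }
    $scheme = strtolower($parts['scheme']);
    if ($scheme !== 'http' && $scheme !== 'https') {
        return false;
    }
    return in_array(strtolower($parts['host']), $allowedHosts, true);
}

// Usage: read the POSTed URL without raising a notice when it is missing.
$url = isset($_POST['url']) ? $_POST['url'] : '';
if (is_allowed_url($url, array('www.gamespot.com'))) {
    // perform cURL code here
}
```

The allow-list is a deliberately strict choice: anything that is not an exact host match is rejected, including other subdomains.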

If you are new to JavaScript Ajax methods, it is too big a topic for a single thread, but it is well documented in many tutorials all over the web and likely on this site too, so I would search around.

Otherwise, I am misunderstanding the problem you are having.

Good luck

Sorry, but I think you don't understand what I need.

I just want to grab the search results block from GameSpot. My example script is only there to show you the blank-results problem I am having, which started because GameSpot recently changed its results to be Ajax-driven.

If you can correctly fetch the page below with PHP, please show me how.

http://www.gamespot.com/search.html?type=11&stype=all&qs=grid

Thank you!

You are going to have to go to GameSpot and pull their JavaScript apart to find the URL that their Ajax call goes to, because I'm sure the results are pulled in after the initial page load.

Member Avatar for langsor

Okay, I may be mistaken, but I don't believe this can be done -- or at least I have no idea how to do it.

Sorry

Member Avatar for langsor

It appears the problem you are having with this approach is what Rob is referring to in the above post. You are getting the contents of the page while the Ajax on that page is still pulling results from their database. Then they load the search results but you've already loaded the page locally without those results.

Ajax has protection (sandbox security) around talking to scripts not on its own server, so I suspect that even if you get access to their Ajax javascript page, you would still not be able to communicate with it.

I know that I have mine shut down tight: they only allow requests from within my docroot.

But then again, it depends on the security within that server-side script. For instance, my Ajax calls reference a PHP file with POST or GET variables, so I could still post to that same file with just a form. The key is that my file does not run for any POST or GET request that is not made from within my docroot. They may have chosen not to be that secure; the only way to tell is to try.
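
As a rough illustration of the kind of server-side check described above, here is a minimal sketch. It is an assumption, not the actual code behind the post above or behind GameSpot's script, and the function name looks_like_own_ajax is made up. It relies on request headers that the client supplies, so a script that sends the same headers would pass it too.

```php
<?php
// Hypothetical check (an assumption, not any real site's code): treat a
// request as "our own Ajax call" only if it carries the X-Requested-With
// header and a Referer that points at the expected host.
function looks_like_own_ajax(array $server, $expectedHost)
{
    $xrw = isset($server['HTTP_X_REQUESTED_WITH'])
        ? $server['HTTP_X_REQUESTED_WITH'] : '';
    if (strcasecmp($xrw, 'XMLHttpRequest') !== 0) {
        return false;
    }
    $referer = isset($server['HTTP_REFERER']) ? $server['HTTP_REFERER'] : '';
    $host = parse_url($referer, PHP_URL_HOST);
    return is_string($host) && strcasecmp($host, $expectedHost) === 0;
}

// Typical use at the top of the Ajax endpoint (www.example.com is a placeholder):
// if (!looks_like_own_ajax($_SERVER, 'www.example.com')) { /* refuse the request */ }
```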

Member Avatar for langsor

http://www.gamespot.com/search.html?qs=test is not ajax.

Honestly, I didn't spend a ton of time digging through the messy source code on that page ... but if you use PHP to call that page with a query string, the query shows up in the search box but with no results in the body field. So I'm guessing the search results are somehow restricted from off-site access... which makes me think Ajax.

I'm open to other options though.

:-)

Hi again guys,

Here is my final, working test code:

<?php

$curl = curl_init();

$url = 'http://www.gamespot.com/pages/search/search_ajax.php?q=grid&type=game&offset=0&tags_only=false&sort=rank';

curl_setopt($curl, CURLOPT_URL, $url);
// Send the headers a browser's XMLHttpRequest call would send.
curl_setopt($curl, CURLOPT_HTTPHEADER, array("Accept: application/json", "X-Requested-With: XMLHttpRequest"));
// Return the response as a string instead of printing it directly.
curl_setopt($curl, CURLOPT_RETURNTRANSFER, 1);

$result = curl_exec($curl);

echo $result;

?>

The solution? Just look at the headers that I set: they make the request look like an XMLHttpRequest call.
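
For anyone who wants to reuse this for other search terms, here is a minimal sketch that wraps the same idea in two functions. The query parameters are copied from the URL above; the function names gamespot_search_url and fetch_as_ajax are made up for illustration, and the 10-second timeout is borrowed from the first script in this thread.

```php
<?php
// Build the Ajax search URL from this thread for an arbitrary query.
// Parameter names come from the URL above; the function names are
// illustrative and not part of any GameSpot API.
function gamespot_search_url($query, $offset = 0)
{
    $params = array(
        'q'         => $query,
        'type'      => 'game',
        'offset'    => (int) $offset,
        'tags_only' => 'false',
        'sort'      => 'rank',
    );
    return 'http://www.gamespot.com/pages/search/search_ajax.php?'
        . http_build_query($params, '', '&');
}

// Fetch a URL with the two headers from the working example.
// Returns the response body, or false on a transport error.
function fetch_as_ajax($url)
{
    $curl = curl_init($url);
    curl_setopt($curl, CURLOPT_HTTPHEADER, array(
        'Accept: application/json',
        'X-Requested-With: XMLHttpRequest',
    ));
    curl_setopt($curl, CURLOPT_RETURNTRANSFER, true);
    curl_setopt($curl, CURLOPT_TIMEOUT, 10);
    return curl_exec($curl);
}

// Usage (performs a real HTTP request, so it is left commented out):
// echo fetch_as_ajax(gamespot_search_url('grid'));
```

Building the query string with http_build_query also takes care of URL-encoding search terms that contain spaces or other special characters.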

Cya
