Enter a keyword in the search box of any website using a PHP script

jyotiu, post #1, Apr 4th, 2009
Hi all,

I am a beginner, and I was wondering whether we can write code in PHP that enters a keyword into the search box of any website and submits it.

There will be two inputs to this PHP function:

function enter_keyword($website_url, $keyword)
{

}

It would be a great help.
Thanks in advance.

cwarn23, post #2, Apr 4th, 2009
Are you talking about a 'web page' or an entire 'web site'? With a single web page, you can just use file_get_contents() to retrieve the data and then use a regex to find the keywords. However, if you are talking about a 'website' or network of pages, that will require a bot that indexes the website into a database, and the search form would then check the database. I can help either way; I just need to know which one.
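
For the single-page case, a minimal sketch of the file_get_contents() plus regex idea might look like the following. The function name count_keyword() and the example URL are illustrative assumptions, not part of the original reply. The function works on an HTML string so it can be tried without a network connection.

```php
<?php
// Hypothetical helper (not from the thread): count whole-word,
// case-insensitive occurrences of $keyword in the visible text of $html.
function count_keyword($html, $keyword) {
    // Drop script/style bodies first, then the remaining tags.
    $html = preg_replace('#<(script|style)\b[^>]*>.*?</\1>#is', ' ', $html);
    $text = html_entity_decode(strip_tags($html));
    // preg_quote() keeps characters such as "." or "+" in the keyword literal.
    $pattern = '/\b' . preg_quote($keyword, '/') . '\b/i';
    return preg_match_all($pattern, $text, $matches);
}

// Usage with a live page (the URL is a placeholder):
// $html = file_get_contents('http://example.com/');
// echo count_keyword($html, 'example');
```

Stripping tags before matching is a design choice here: it avoids counting keywords that only appear inside attributes or scripts.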
Try not to bump 10 year old threads as it can be really annoying.
Like php then read my website at http://syntax.cwarn23.net/
Star-Trek-Atlantis - now that's what I call a movie ^_^
My favourite PC. - MacGyver Fan
Bad english note: dis-iz-2b4u

jyotiu, post #3, Apr 4th, 2009
Hi cwarn23,

Thanks for the reply. The following needs to be done:

1.) I want to send a keyword (e.g. 'knowing') to the search box of a website (e.g. "http://stagevu.com/") and submit it.

2.) Then I want to return the URL of the first result from the results page.

I know that (2.) can be done using cURL, but what about the first?

Thanks again for taking the time to reply.
jyotiu
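
Step 1 usually comes down to reproducing the request that the site's search form sends. Here is a minimal sketch, assuming the form uses GET and that its action URL and field name have been read from the page source. The path /search and the field name q in the usage comment are made-up placeholders, not the real form of stagevu.com, and both function names are hypothetical.

```php
<?php
// Build the URL a GET search form would request.
// $action and $field must be read from the target site's <form> markup.
function build_search_url($action, $field, $keyword) {
    $separator = (strpos($action, '?') === false) ? '?' : '&';
    return $action . $separator . http_build_query(array($field => $keyword));
}

// Fetch a URL with cURL and return the response body, or false on failure.
function fetch_page($url) {
    $handle = curl_init($url);
    if ($handle === false) {
        return false;
    }
    curl_setopt($handle, CURLOPT_RETURNTRANSFER, true);
    curl_setopt($handle, CURLOPT_FOLLOWLOCATION, true);
    curl_setopt($handle, CURLOPT_CONNECTTIMEOUT, 4);
    $body = curl_exec($handle);
    curl_close($handle);
    return $body;
}

// Usage (placeholder action and field name):
// $html = fetch_page(build_search_url('http://example.com/search', 'q', 'knowing'));
```

If the form uses method="post" instead, the same fields would go into CURLOPT_POSTFIELDS rather than the query string.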

cwarn23, post #4, Apr 4th, 2009
Although you didn't entirely answer my question (scan the whole website or a single web page), I will assume you want to scan the website, in which case you will need a bot. I recently wrote a bot to scan for site security holes, and the bot template is as follows:
<?php
set_time_limit(0);

// Reduce a URL to scheme and host, e.g. "http://example.com/a/b" becomes "http://example.com/".
function domain($domainb) {
    $bits = explode('/', $domainb);
    if ($bits[0] == 'http:' || $bits[0] == 'https:') {
        return $bits[0].'//'.$bits[2].'/';
    }
    return 'http://'.$bits[0].'/';
}

// Read the requested site once; escape it wherever it is echoed back.
$site = isset($_GET['site']) ? $_GET['site'] : '';
if ($site !== '') {
    echo '<head><title>Bot scanning website - '.htmlspecialchars(domain($site)).'</title></head><body>';
} else {
    echo '<head><title>Bot scanner</title></head><body>';
}
echo '<center><font size=5 face=\'arial black\'><b>PHP Bot Scanner</b></font><br><form method=\'get\' style=\'margin:0px; padding:0px;\'><input type=\'text\' name=\'site\' size=64 value="'.htmlspecialchars($site).'"><input type=\'submit\' value=\'Scan\'></form></center>';
if (substr($site, 0, 3) == 'ftp') {
    exit('You may not connect to the ftp protocol');
}
if ($site === '') { exit(''); }

$site = domain($site);

// Check with a header-only request that the URL answers with HTTP 200.
function url_exists($durl) {
    $handle = curl_init($durl);
    if (false === $handle) {
        return false;
    }
    curl_setopt($handle, CURLOPT_FAILONERROR, true);
    curl_setopt($handle, CURLOPT_USERAGENT, 'Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.8.1.15) Gecko/20080623 Firefox/2.0.0.15'); // request as if Firefox
    curl_setopt($handle, CURLOPT_NOBODY, true);
    curl_setopt($handle, CURLOPT_RETURNTRANSFER, true);
    curl_exec($handle);
    // Read the status code directly instead of searching the headers for "200 OK".
    $status = curl_getinfo($handle, CURLINFO_HTTP_CODE);
    curl_close($handle);
    return $status == 200;
}

// Decide whether a URL looks like a page rather than an image, stylesheet, etc.
function is_page($url) {
    $pagetypes = array('html','htm','xhtml','xml','mhtml','xht','mht','asp','aspx','adp','bml','cfm','cgi','ihtml','jsp','las','lasso','lassoapp','pl',
        'php','php1','php2','php3','php4','php5','php6','phtml','shtml','search','query','forum','blog',
        '1','2','3','4','5','6','7','8','9','10','11','12','13','14','15','16','17','18','19','20',
        '01','02','03','04','05','06','07','08','09','go','page','file');
    $name = basename($url);
    $ext = strtolower(preg_replace('/(.*)[.]([^.\?]+)(\?(.*))?/', '$2', $name));
    return in_array($ext, $pagetypes) || substr($url, -1) == '/' || !preg_match('/[\.]/', $name);
}

// The function below only returns links within the page's own domain, not links outside the site.
function getlinks($generateurlf) {
    $datac = @file_get_contents($generateurlf);
    if ($datac === false) {
        return array();
    }
    // Capture the value of every href="..." and src="..." attribute.
    preg_match_all('/(?:href|src)=["\']([^"\'>]+)/i', $datac, $media);
    unset($datac);
    $datab = array();
    foreach ($media[1] as $dfile) {
        $generateurle = $generateurlf;
        if (!in_array(substr($dfile, 0, 4), array('http', 'www.'))) {
            // Relative link: work out the directory of the current page.
            if (substr($generateurle, -1) !== '/') {
                $generateurle = preg_replace('/(.*)\/[^\/]+/is', '$1', $generateurle);
            } else {
                $generateurle = substr($generateurle, 0, -1);
            }
            if (substr($dfile, 0, 1) == '/') {
                // Root-relative link: resolve against the domain root, not the current directory.
                $candidate = rtrim(domain($generateurlf), '/').$dfile;
            } else {
                // Step up one directory for each leading "../".
                while (substr($dfile, 0, 3) == '../') {
                    $dfile = substr($dfile, 3);
                    $generateurle = preg_replace('/(.*)\/[^\/]+/i', '$1', $generateurle);
                }
                $candidate = $generateurle.'/'.$dfile;
            }
        } else {
            // Absolute link: use it as it is.
            $candidate = $dfile;
        }
        if (domain($generateurlf) == domain($candidate) && is_page($candidate)) {
            $datab[] = $candidate;
        }
    }
    return $datab;
}

// Seed the queue with the start page and the links found on it.
$loopurl['sites'] = array($site);
foreach (getlinks($site) as $link) {
    if (!in_array($link, $loopurl['sites'])) {
        $loopurl['sites'][] = $link;
    }
}
unset($link);

// ---- Only the code below this line should need editing. ----

function generate($genurl) {
    $data = file_get_contents($genurl);
    // Add here what you want to do with the page contents in the variable $data.
}

// Visit every queued page, queueing any new same-domain links as they are found.
for ($loopid = 0; isset($loopurl['sites'][$loopid]); $loopid++) {
    if (url_exists($loopurl['sites'][$loopid])) {
        foreach (getlinks($loopurl['sites'][$loopid]) as $link) {
            if (!in_array($link, $loopurl['sites'])) {
                $loopurl['sites'][] = $link;
            }
        }
        unset($link);

        echo generate($loopurl['sites'][$loopid]);
        flush();
    }
    usleep(5000);
}
echo '<br><b>Bot scan complete.</b></body>';
?>
To adapt this, put the code you want run on each page inside the generate() function. Only the part from the generate() function onward should need modifying.
Just a note on the theory behind this: the bot above crawls the website so that all of its pages can be indexed into a database (that part goes in generate()). Then, whenever somebody searches a website that is in the index, the search can look up the most relevant pages within the selected website. Alternatively, you can piggyback off Google.
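
As a rough sketch of that indexing idea, the snippet below keeps the index in a plain PHP array instead of a database, which is an assumption made only to keep the example self-contained. The function names index_page() and search_index() are illustrative. In the bot above, index_page() would be called from generate().

```php
<?php
// Add one page to an in-memory inverted index: word => (url => count).
// A real bot would store these rows in a database table instead.
function index_page(array &$index, $url, $html) {
    $text = strtolower(html_entity_decode(strip_tags($html)));
    $words = preg_split('/[^a-z0-9]+/', $text, -1, PREG_SPLIT_NO_EMPTY);
    foreach ($words as $word) {
        if (!isset($index[$word][$url])) {
            $index[$word][$url] = 0;
        }
        $index[$word][$url]++;
    }
}

// Return the URLs containing $keyword, most occurrences first.
function search_index(array $index, $keyword) {
    $keyword = strtolower($keyword);
    if (!isset($index[$keyword])) {
        return array();
    }
    $hits = $index[$keyword];
    arsort($hits); // sort by count, descending, keeping the URL keys
    return array_keys($hits);
}
```

Ranking by raw occurrence count is the simplest possible notion of "most relevant"; a real search would likely weigh titles and headings more heavily.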

jyotiu, post #5, Apr 4th, 2009
OK, let me explain again what I want to do.
I will create a simple PHP page with an input box and a submit button.

The input box will be used to enter any keyword. When a person hits submit, I need a script on the back end that will submit/post this keyword to a given website's search box/search page.

Then I will scrape the first result's URL and return it to my page.

Does that make sense?
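
The scraping step might be sketched like this, assuming the results page marks each hit with a link that carries a known CSS class. The class name result and the sample markup in the usage comment are invented for illustration; the real page's markup has to be inspected first. The function name first_result_url() is hypothetical too.

```php
<?php
// Return the href of the first <a> whose class attribute contains $class,
// or null if there is none. $class is site-specific (assumed here) and
// should be a plain class name without quotes.
function first_result_url($html, $class) {
    if (!is_string($html) || $html === '') {
        return null;
    }
    $doc = new DOMDocument();
    // Real-world HTML is rarely valid; @ silences the parser warnings.
    @$doc->loadHTML($html);
    $xpath = new DOMXPath($doc);
    $query = '//a[contains(concat(" ", normalize-space(@class), " "), " ' . $class . ' ")]/@href';
    $nodes = $xpath->query($query);
    return ($nodes !== false && $nodes->length > 0) ? $nodes->item(0)->nodeValue : null;
}

// Usage (invented markup):
// echo first_result_url('<a class="result" href="http://example.com/watch/1">One</a>', 'result');
```

Using the DOM extension rather than a regex is a design choice: it copes better with attribute order and quoting differences in the page.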

cwarn23, post #6, Apr 4th, 2009
Is this what you're looking for? It uses Google to search a selected site for the selected keywords.
<?php

// Search $site for $search_term via Google and return arrays of
// result URLs ('url'), descriptions ('des') and titles ('title').
function search($search_term, $site)
{
    global $dsite;
    // Reduce the site to scheme and host, defaulting to http://.
    $bits = explode('/', $site);
    if ($bits[0] == 'http:' || $bits[0] == 'https:') {
        $site = $bits[0].'//'.$bits[2].'/';
    } else {
        $site = 'http://'.$bits[0].'/';
    }
    $dsite = $site;
    $site = urlencode($site);
    $search_term = urlencode($search_term);
    $curl_handle = curl_init('http://www.google.com.au/search?hl=en&q=site%3A'.$site.'+'.$search_term.'&meta=');
    curl_setopt($curl_handle, CURLOPT_HEADER, false);
    curl_setopt($curl_handle, CURLOPT_FAILONERROR, true);
    curl_setopt($curl_handle, CURLOPT_HTTPHEADER, array("User-Agent: Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.8.1.15) Gecko/20080623 Firefox/2.0.0.15")); // request as if Firefox
    curl_setopt($curl_handle, CURLOPT_POST, false);
    curl_setopt($curl_handle, CURLOPT_NOBODY, false);
    curl_setopt($curl_handle, CURLOPT_CONNECTTIMEOUT, 4);
    curl_setopt($curl_handle, CURLOPT_RETURNTRANSFER, 1);
    $buffer = curl_exec($curl_handle);
    curl_close($curl_handle);
    if ($buffer === false) {
        // The request failed, so return empty result lists.
        return array('url' => array(), 'des' => array(), 'title' => array());
    }

    // The patterns below depend on the markup of Google's results page.
    // Result URLs sit in <cite> tags.
    $bufferb = strip_tags($buffer, '<cite>');
    preg_match_all("/<cite>[^ ]+ - [0-9]+k - <\/cite>/", $bufferb, $match['url']);
    $match['url'][0] = preg_replace('/<cite>([^ ]+) - [0-9]+k - <\/cite>/', '$1', $match['url'][0]);

    // Descriptions are the text between a <div ...> and the next <br>.
    $bufferb = strip_tags($buffer, '<br><div>');
    preg_match_all("/<div[^>]+>[^<]+<br>/", $bufferb, $match['des']);

    // Titles are the text of the links with class "l".
    $bufferb = strip_tags($buffer, '<a>');
    preg_match_all('/<a href="[^"]+" class=l[^>]+>[^<]+<\/a>/', $bufferb, $match['title']);
    $match['title'][0] = array_map('strip_tags', $match['title'][0]);

    $result['url'] = $match['url'][0];
    $result['des'] = $match['des'][0];
    $result['title'] = $match['title'][0];
    return $result;
}

echo "<form method='post' style='margin:0; padding:0;'><table border=0 cellpadding=0 cellspacing=0>
<tr><td align='right'>Website:</td><td><input type='text' size=40 name='site'></td></tr>
<tr><td align='right'>Search Term:</td><td><input type='text' size=40 name='searchval'><input type='submit' value='search'></td></tr></table></form><br>";
if (isset($_POST['searchval'], $_POST['site']) && strlen($_POST['searchval']) >= 1)
{
    $result = search($_POST['searchval'], $_POST['site']);
    $id = 0;
    // Escape user input before echoing it back into the page.
    echo "<table border=0 cellspacing=0 cellpadding=0 width=640><tr><td bgcolor='#66CCFF'><table border=0 cellpadding=3 cellspacing=0><tr><td>".
    '<b>Website searched: '.htmlspecialchars($dsite).'<br> There were '.count($result['title'])." results found with the term '<i>".htmlspecialchars($_POST['searchval'])."</i>'</b></td></tr></table></td></tr>";
    while (isset($result['url'][$id]) && isset($result['des'][$id]))
    {
        echo '<tr><td><a href="http://'.$result['url'][$id].'"><font color=#0000FF>'.$result['title'][$id].'</font></a></td></tr><tr><td>'.$result['des'][$id].'</td></tr><tr><td height="16px"></td></tr>';
        $id++;
    }
    echo "</table>";
}
?>
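
Since the original question asked only for the first result's URL, a small helper over the array returned by search() could pick it out. This is a sketch: first_url() is a hypothetical name, and the sample array in the usage comment only mimics the shape that search() returns.

```php
<?php
// Pick the first URL from a result array shaped like the one search() returns:
// array('url' => array(...), 'des' => array(...), 'title' => array(...)).
function first_url(array $result) {
    if (!isset($result['url'][0])) {
        return null;
    }
    // search() captures URLs from <cite> tags without a scheme, so add one.
    return 'http://' . $result['url'][0];
}

// Usage (sample data, not real search output):
// echo first_url(array('url' => array('example.com/watch/1')));
```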

jyotiu, post #7, Apr 4th, 2009
Hi cwarn23,
you are my angel.

Thanks a ton!

©2003 - 2009 DaniWeb® LLC