954,604 Members — Technology Publication meets Social Media
Username:
Password:
Lost login information?
Have something to say? Contribute New Article Reply to this Article

OR search Problems

Hi all,
I am developing a search engine as project in PHP and MySQL as backend . I have implemented crawler , now i am facing problems in searcher . When a user enters any query by default i want to accept it as OR Query i.e. I want to display results containing all keywords in the beginning ( [assuming 4 keywords search] desc order of weight) then after that next results to be displayed would be links containing any of three keywords searched , same in desc order of weights and so on and so forth . till i don't display single keywords records ..
I am tired by thinking logic of this problem , for first displaying all keywords i have solved it but next problem i m unable to think .. can you give some ideas? please help!

mandy011
Light Poster
31 posts since Jul 2008
Reputation Points: 10
Solved Threads: 1
 

Can you perhaps give an example of what you have, and of what you want ?

If I understand correctly, you just want a way to order your results on relevancy.

pritaeas
Posting Expert
Moderator
5,484 posts since Jul 2006
Reputation Points: 653
Solved Threads: 875
 

pritaeas You got it correctly I will give an example.
Suppose user enters query addition of two numbers in servlet , now as a search engine , I am not considering of in so I am left with 4 keywords.
Addition Two Numbers Servlet
Next step is I will scan my database and find out the results. I have 15 tables, i do MD5 and save keywords , table names are from kwd0 to kwdf along with Keyword id and weight in a link. Keyword is another table which consists of actual Keywords.
Now What i want is !
First run a query to get all sites containing all the 4 searched keywords. So when i find all those sites, I will add all the weights of respective keywords , one with highest weight will be on TOP .
But this is Like Simple AND search . Now As i want to implement OR , next thing i have to consider 3 keywords , any3 keywords in site, find weights, arrange them . Same with 2 keywords , 1 keyword and so on and arrange them.
And then finally my complete result will be there which I want to display to user!! So now How should i solve this problem? can you give me some ideas !! Well number of records will be atleast in 1000s so I want to implement this thing in Optimum way so that it will be fast and giving good results!
Thanks in advance.

mandy011
Light Poster
31 posts since Jul 2008
Reputation Points: 10
Solved Threads: 1
 

Is it not possible to do an OR search with your query, let that calculate the weight in the result for each keyword, and then group it by article and summing the weight. Then you can order by the summed weight descending, without having to run X queries.

What I know from weighted keywords, is that it is not always true that having all keywords will outrank any other (depending on your weighting system).

Could you show me the query for a search on a single table ?

pritaeas
Posting Expert
Moderator
5,484 posts since Jul 2006
Reputation Points: 653
Solved Threads: 875
 
while (($words < count($wordarray)) && $possible_to_find == 1) { // Checking whether //we are still having words left and getting results
$wordmd5 = substr(md5($searchword), 0, 1);

			$query1 = "SELECT distinct link_id, weight, domain from ".$mysql_table_prefix."link_keyword$wordmd5, ".$mysql_table_prefix."keywords where ".$mysql_table_prefix."link_keyword$wordmd5.keyword_id= ".$mysql_table_prefix."keywords.keyword_id and keyword='$searchword' $domain_qry order by weight desc";
	$result = mysql_query($query1);

			$num_rows = mysql_num_rows($result);

			if ($num_rows == 0) {

				if ($type != "or") {

					$possible_to_find = 0;

					break;

				}

			}
$indx = $words; // FOR AND SEARCH
	while ($row = mysql_fetch_row($result)) {	

				$linklist[$indx]['id'][] = $row[0];

				$domains[$row[0]] = $row[2];

				$linklist[$indx]['weight'][$row[0]] = $row[1];

				
			}

			$words++;
}

$j= 1;

				$min = 0;

				while ($j < $words) {

				if (count($linklist[$min]['id']) > count($linklist[$j]['id'])) {

						$min = $j;

					}

					$j++;

				}

			
				$j = 0;
//ABOVE Code for finding Least number results from a query to build AND search
$temp_array = $linklist[$min]['id'];
$count = 0;

				while ($j < count($temp_array)) {

					$k = 0; //and word counter

					$n = 0; //not word counter

					$o = 0; //phrase word counter

					$weight = 1;

					$break = 0;

					while ($k < $words && $break== 0) {

						if ($linklist[$k]['weight'][$temp_array[$j]] > 0) {

							

							$weight = $weight + $linklist[$k]['weight'][$temp_array[$j]];

							//echo $linklist[$k]['weight'][$temp_array[$j]] ." ". $weight . "";

						} else {

							$break = 1;

						}

						$k++;

					}

					while ($n < $not_words && $break== 0) {

						if ($notlist[$n]['id'][$temp_array[$j]] > 0) {

							$break = 1;

						}

						$n++;

					}				



					while ($o < $phrase_words && $break== 0) {

						if ($phraselist[$n]['id'][$temp_array[$j]] != 1) {

							$break = 1;

						}

						$o++;

					}

					if ($break== 0 && $category > 0 && $category_list[$temp_array[$j]] != 1) {

						$break = 1;

					}



					if ($break == 0) {

						$result_array_full[$temp_array[$j]] = $weight;

						$count ++;

					}

					$j++;

				}

			}

		}


$query1 = "SELECT distinct link_id, url, title, description,  $fulltxt, size FROM ".$mysql_table_prefix."links WHERE link_id in ($inlist)";



		$result = mysql_query($query1);

		echo mysql_error();



		$i = 0;

		while ($row = mysql_fetch_row($result)) {

			$res[$i]['title'] = $row[2];

			$res[$i]['url'] = $row[1];

			if ($row[3] != null && $show_meta_description == 1)

				$res[$i]['fulltxt'] = $row[3];

			else 

				$res[$i]['fulltxt'] = $row[4];

			$res[$i]['size'] = $row[5];

			$res[$i]['weight'] = $result_array[$row[0]];

			$dom_result = mysql_query("select domain from ".$mysql_table_prefix."domains where domain_id='".$domains[$row[0]]."'");

			$dom_row = mysql_fetch_row($dom_result);

			$res[$i]['domain'] = $dom_row[0];

			$i++;

		}







		if ($merge_site_results  && $domain_qry == "") {

			sort_with_domains($res);

		} else {

			usort($res, "cmp"); 	

		}

		echo mysql_error();

		$res['maxweight'] = $maxweight;

		$res['results'] = $results;

		return $res;

Something like this .. i am using for Returning AND results! I know I couldn't paste everything from code.. i tried to post as much was possible and essential ..

mandy011
Light Poster
31 posts since Jul 2008
Reputation Points: 10
Solved Threads: 1
 

You should be able to do an OR search, and then GROUP BY link_id and SUM(weight).

SELECT link_id, SUM(weight) AS weight, domain 
FROM yourtables
WHERE your keywords filter with OR instead of AND
GROUP BY link_id


Perhaps you can try it in PhpMyAdmin (or another tool you use), to see if it gets results. Anything you can move from code to query, will speed things up.

pritaeas
Posting Expert
Moderator
5,484 posts since Jul 2006
Reputation Points: 653
Solved Threads: 875
 

Thank you bro.. I will try it now.. and let you know how it works!! :) thanks a lot!! I will update this post as soon as i feel i got success ( cause you !!! )

mandy011
Light Poster
31 posts since Jul 2008
Reputation Points: 10
Solved Threads: 1
 

This kind of queries I am using currently

SELECT distinct link_id, weight, domain from link_keywordb, keywords where link_keywordb.keyword_id= keywords.keyword_id and keyword='two' order by weight desc


If I try to use

SELECT link_id, SUM(weight) AS weight
FROM  link_keyword3,link_keywordb,keywords
WHERE ((link_keyword3.keyword_id= keywords.keyword_id and keyword='add')and(link_keywordb.keyword_id= keywords.keyword_id and keyword='two')) 
GROUP BY link_id

something like above , i get error of unambiguity . I am totally fed up and unable to think :( ! since past 10 days trying to solve this problem .. If you could help me further I will be really thankful

I am posting schema of my db , so that it can be useful

link_keyword0 to link_keywordf        link_id | keyword_id | weight | domain
links                                 link_id|site_id|url_title
keywords                              keyword_id|keyword


Thanks in advance!

mandy011
Light Poster
31 posts since Jul 2008
Reputation Points: 10
Solved Threads: 1
 

You get that because link_id appears in more tables. So in the SELECT and GROUP BY you need to specify which one you mean, just like you did with keyword_id in the WHERE.

pritaeas
Posting Expert
Moderator
5,484 posts since Jul 2006
Reputation Points: 653
Solved Threads: 875
 

I have one doubt.. now suppose I find a all keywords in same link, then how can I add it here if I use Above queries? will it be possible

mandy011
Light Poster
31 posts since Jul 2008
Reputation Points: 10
Solved Threads: 1
 

Not sure what you mean. The GROUP BY will merge the results to one link_id, just as your DISTINCT did earlier, but it will SUM all the weights for that link_id.

pritaeas
Posting Expert
Moderator
5,484 posts since Jul 2006
Reputation Points: 653
Solved Threads: 875
 

I tried to run many queries but none is working as i want .. I am posting one of example how I tried

SELECT link_id, SUM(weight) AS weight, domain 
FROM link_keyword3,keywords,link_keywordb
WHERE ((SELECT distinct link_id, weight, domain from link_keyword3, keywords where link_keyword3.keyword_id= keywords.keyword_id and keyword='add' order by weight desc) and (SELECT distinct link_id, weight, domain from link_keywordb, keywords where link_keywordb.keyword_id= keywords.keyword_id and keyword='two' order by weight desc ))
GROUP BY link_id

Error
#1052 - Column 'link_id' in field list is ambiguous
I am getting this kind of error ! Please help me to fix it!! I am really in huge trouble and out of mind!! tried a lot :'(

mandy011
Light Poster
31 posts since Jul 2008
Reputation Points: 10
Solved Threads: 1
 

Put the tablename and a dot in front of it.

link_keyword3.link_id for example.

pritaeas
Posting Expert
Moderator
5,484 posts since Jul 2006
Reputation Points: 653
Solved Threads: 875
 

Am i doing it right? .. but putting that tablename also not working :'(

mandy011
Light Poster
31 posts since Jul 2008
Reputation Points: 10
Solved Threads: 1
 

Perhaps you forgot one.

pritaeas
Posting Expert
Moderator
5,484 posts since Jul 2006
Reputation Points: 653
Solved Threads: 875
 

What should i write for 1st one.. because what i want is result of

WHERE ((SELECT distinct link_id, weight, domain from link_keyword3, keywords where link_keyword3.keyword_id= keywords.keyword_id and keyword='add' order by weight desc) and (SELECT distinct link_id, weight, domain from link_keywordb, keywords where link_keywordb.keyword_id= keywords.keyword_id and keyword='two' order by weight desc ))


these should be used by select query! now how do that :| i am totally confused!

I want to use result of the queries in WHERE clause in Select query!

mandy011
Light Poster
31 posts since Jul 2008
Reputation Points: 10
Solved Threads: 1
 

Pooh, this is getting difficult to do from the top of my head now. I don't think the WHERE should contain a SELECT like that.

pritaeas
Posting Expert
Moderator
5,484 posts since Jul 2006
Reputation Points: 653
Solved Threads: 875
 

did u understand my schema? so can you show me an example query on that schema .. for this kind of situation? so that i might get a much better idea then :S
I mean try to use AND for that kind of situation .. I shown you queries which i m using currently .. I want to convert them totally into query way .. into single query so that i don't need to do coding as such !

SELECT distinct link_id, weight, domain from link_keyword3, keywords where link_keyword3.keyword_id= keywords.keyword_id and keyword='add' order by weight desc

SELECT distinct link_id, weight, domain from link_keywordb, keywords where link_keywordb.keyword_id= keywords.keyword_id and keyword='two' order by weight desc

these 2 are example queries.. How do i merge them to get results from them as AND directly through a query!
I hope it becomes simpler for you now..

mandy011
Light Poster
31 posts since Jul 2008
Reputation Points: 10
Solved Threads: 1
 

I could use some example data with that, just a small example of what you have (5 records or so). Then I will try tonight.

pritaeas
Posting Expert
Moderator
5,484 posts since Jul 2006
Reputation Points: 653
Solved Threads: 875
 

can you give me your contact info email or something? I will forward you my DATABASE on that.. how about that?

mandy011
Light Poster
31 posts since Jul 2008
Reputation Points: 10
Solved Threads: 1
 

This question has already been solved

Post: Markdown Syntax: Formatting Help
You