Smart Searches

Please support our MS SQL advertiser: Intel Parallel Studio Home
Reply

Join Date: Feb 2008
Posts: 22
Reputation: mmxbass is an unknown quantity at this point 
Solved Threads: 0
mmxbass mmxbass is offline Offline
Newbie Poster

Smart Searches

 
0
  #1
Oct 28th, 2009
Simply put, I need a better query. I have a table that contains 85000 rows, each row has a TITLE and an ABSTRACT (they are academic papers).

What I need is to do a realistic SMART search in my query.
Right now I'm doing
  1. SELECT * FROM PsPapers WHERE Title LIKE '%MySearchString%'
which is, as you can see, a fairly retarded search system. This has several problems:
  1. If the search is "othello essays" then my query only gives you ones where the title contains "othello essays" even though something like "essays on othello" I would also want to be returned because the title is OBVIOUSLY on point.
  2. My query doesn't use the abstracts right now, the reason is that the abstracts contain alot of crap and should be WEIGHTED MUCH LESS than titles. I would want title matches to list HIGHER than abstract matches.
  3. There is no real cleverness or weighting here. The older cgi system I'm trying to replace uses code taken from an actual search engine so I need to at least ATTEMPT to do a smart search.

Any thoughts on a better query?
Reply With Quote Quick reply to this message  
Join Date: Feb 2009
Posts: 3,364
Reputation: sknake has much to be proud of sknake has much to be proud of sknake has much to be proud of sknake has much to be proud of sknake has much to be proud of sknake has much to be proud of sknake has much to be proud of sknake has much to be proud of sknake has much to be proud of sknake has much to be proud of 
Solved Threads: 606
Sponsor
sknake's Avatar
sknake sknake is offline Offline
.NET Enthusiast
 
0
  #2
Oct 29th, 2009
Given the search term "othello essays" on the application front end you need to split up the search terms. I think full text indexing offers more options for weighting search results. You may want to take a look at it. Here is one way to help make it better, but only slightly:

  1. IF OBJECT_ID('tempdb..#PsPapers', 'U') IS NOT NULL DROP TABLE #PsPapers
  2. IF OBJECT_ID('tempdb..#Results', 'U') IS NOT NULL DROP TABLE #Results
  3.  
  4. CREATE TABLE #PsPapers
  5. (
  6. PaperId int identity(1000, 1) PRIMARY KEY,
  7. Title varchar(100),
  8. Body varchar(8000)
  9. )
  10.  
  11.  
  12.  
  13. Declare @Search varchar(100), @s1 varchar(100), @s2 varchar(100)
  14. SET @Search = 'othello essays'
  15. SET @s1 = '%othello%'
  16. SET @s2 = '%essays%'
  17.  
  18. INSERT INTO #PsPapers (Title, Body) Values ('othello essays', '')
  19. INSERT INTO #PsPapers (Title, Body) Values ('essays on othello', '')
  20. INSERT INTO #PsPapers (Title, Body) Values ('essays on XXothello', '')
  21. INSERT INTO #PsPapers (Title, Body) Values ('notes on othello', '')
  22. INSERT INTO #PsPapers (Title, Body) Values ('essays on chocolate', '')
  23.  
  24. SELECT *, Cast(1 AS int) AS Relevance
  25. INTO #Results
  26. FROM #PsPapers
  27. WHERE Title = @Search
  28.  
  29. Union ALL
  30.  
  31. SELECT *, Cast(2 AS int)
  32. FROM #PsPapers
  33. WHERE Title LIKE @s1 AND Title LIKE @s2
  34.  
  35. Union ALL
  36.  
  37. SELECT *, Cast(3 AS int)
  38. FROM #PsPapers
  39. WHERE Title LIKE @s1
  40.  
  41. Union ALL
  42.  
  43. SELECT *, Cast(3 AS int)
  44. FROM #PsPapers
  45. WHERE Title LIKE @s2
  46.  
  47. SELECT PaperId, Title, Body, Min(Relevance) AS Relevance
  48. FROM #Results
  49. GROUP BY PaperId, Title, Body
  50. ORDER BY Relevance ASC

Results in:
  1. PaperId Title Body Relevance
  2. ----------- ------------------------- ---------- -----------
  3. 1000 othello essays 1
  4. 1001 essays on othello 2
  5. 1002 essays on XXothello 2
  6. 1003 notes on othello 3
  7. 1004 essays on chocolate 3

You will need to break the search input string on word boundaries and search for each word and weight the results.
Last edited by sknake; Oct 29th, 2009 at 3:37 am.
Scott Knake
Custom Software Development
Apex Software, Inc.
Reply With Quote Quick reply to this message  
Join Date: Nov 2009
Posts: 1
Reputation: daspeac is an unknown quantity at this point 
Solved Threads: 0
daspeac daspeac is offline Offline
Banned
 
-2
  #3
Nov 9th, 2009
Hi mmxbass, do not forget to make a backup of database before your experiments if it is not possible, try the foxpro table repair application
Last edited by happygeek; Nov 10th, 2009 at 6:41 am. Reason: link snipped
Reply With Quote Quick reply to this message  
Reply

Message:


Thread Tools Search this Thread



Tag cloud for MS SQL
About Us | Contact Us | Advertise | DaniWeb | Acceptable Use Policy | RSS Feed

©2003 - 2009 DaniWeb® LLC