0

Hello ALL

The project we're working on is a document retrieval system

I have 3 tables as follows. Table 1 contains words from a example doc. Table 2 contains words from documents in a database. Table3 contain only one column ‘DocID’.

I would like to search the database to find those docs similar to the example doc by calculating similarity score between the example doc and each doc in the database. Specifically, the simi score is calculated by adding up the sum of word frequencies of all matched word. Additionally, I am only interested in the those documents whose DocID appear in table3.

Table 1
Word    Freq
Book      2
Desk      3
Pen        3
Board      3


Table2
DocID   Word    Freq
1         Book     3
1         English    2
1          Math     1
2         Desk       2
2         Machine    5
2         Power      2
3         Desk       3
3         Teacher    3
3         Class       2
4         Building    1
4         Tower     2


Table3
DocID
1
2

I am not sure the following statement is correct or not

SELECT
      DocID, table1.Freq+ table2.Freq
FROM
      table1, table2, table3
WHERE
     table1.Word = table2.Word AND  table2.DocID = table3.DocID;
2
Contributors
3
Replies
4
Views
8 Years
Discussion Span
Last Post by dickersonka
0

you'll need to use

Select table3.DocID, sum(table1.Freq + table2.Freq) as Freq
-- your from and where
group by table3.DocID
order by SUM(table1.Freq + table2.Freq) desc

i added the order by for clarity

0

dickersonka thanks for your solution.

I wonder, in the 'order by' clause, if I can use

order by Freq desc

to replace

order by SUM(table1.Freq + table2.Freq) desc
This topic has been dead for over six months. Start a new discussion instead.
Have something to contribute to this discussion? Please be thoughtful, detailed and courteous, and be sure to adhere to our posting rules.