I am new to SQL.
I have a table with 200 rows and 100 columns.
I want to extract unique rows from this table such that no two rows are more than 60% similar. The 60% threshold is a variable and would be user-defined.
I have attached an example file with the raw data. I have done it in Excel, and my logic goes as follows:
First, I compare Row 2 with Row 1 and compute how similar they are. If the similarity is more than the desired threshold, I discard Row 2; otherwise I add it to an array of kept rows (which starts with Row 1). I then move on to the next row and compare it with every row already in the array. If the current row is too similar to any row in the array, it is ruled out; otherwise it is added to the array.
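To make the logic above concrete, here is a rough Python sketch of what I am doing in Excel. It assumes that "similarity" means the fraction of columns in which two rows hold equal values, which is only one possible definition; the names `similarity` and `filter_rows` are made up for illustration.

```python
def similarity(a, b):
    """Fraction of column positions where the two rows hold equal values.

    Assumption: this is only one possible way to define row similarity.
    """
    if len(a) != len(b):
        raise ValueError("rows must have the same number of columns")
    if not a:
        return 0.0
    matches = sum(1 for x, y in zip(a, b) if x == y)
    return matches / len(a)


def filter_rows(rows, threshold=0.6):
    """Keep a row only if it is not more than `threshold` similar
    to any row that has already been kept (greedy, order-dependent)."""
    kept = []
    for row in rows:
        if all(similarity(row, k) <= threshold for k in kept):
            kept.append(row)
    return kept


rows = [
    (1, 2, 3, 4, 5),
    (1, 2, 3, 4, 9),  # shares 4 of 5 columns with the first row
    (1, 2, 0, 8, 9),
    (0, 2, 0, 8, 9),  # shares 4 of 5 columns with the third row
    (7, 7, 7, 7, 7),
]
unique_rows = filter_rows(rows, threshold=0.6)
```

Since each row is compared against every row kept so far, the result depends on the order of the rows, and in the worst case the number of comparisons grows with the square of the row count.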
I actually have to perform this test on data containing more than 65,000 rows and 200 columns, so I need an SQL query that will make my work faster.
Thanks for your help.