I have a task, which I'll summarize briefly.
I have a file of 20,000 lines, and I need to identify which lines are similar to one another. For example:
ex1 : "The quick brown fox jumps on a lazy dog"
ex2 : "The quick brown dog jumps on a very lazy fox"
I want these two sentences to be treated as one group. So, among the 20,000 lines, I have to determine how many such groups are present.
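To make the goal concrete, here is a minimal sketch of the naive grouping I have in mind. It rests on assumptions of mine: that "similar" means a high word-set overlap (Jaccard similarity), that 0.6 is a reasonable cutoff, and that a line joins the first group whose first line it resembles. The function names and the threshold are illustrative only, not a settled definition.

```python
def word_set(line):
    # Assumption: words are whitespace-separated and case is ignored.
    return frozenset(line.lower().split())


def jaccard(a, b):
    # Shared words divided by all distinct words across both lines.
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def count_groups(lines, threshold=0.6):
    # Greedy grouping: keep one representative word set per group and
    # start a new group when a line matches no existing representative.
    representatives = []
    for line in lines:
        words = word_set(line)
        if not any(jaccard(words, rep) >= threshold for rep in representatives):
            representatives.append(words)
    return len(representatives)


ex1 = "The quick brown fox jumps on a lazy dog"
ex2 = "The quick brown dog jumps on a very lazy fox"
print(count_groups([ex1, ex2]))  # the two examples share 9 of 10 distinct words
```

When few lines are similar, every line is compared against almost every earlier one, which for 20,000 lines is on the order of 2 × 10^8 set comparisons.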
Time complexity is the major concern. Is there an approach I can use for this?