Suppose we have ten computers on a subnet, each of which is also connected to the internet.
Each PC holds a lot of documents, and users can also browse the internet and save web pages as documents.
How can we scan the subnet to find all duplicate documents and their locations?
A brute-force approach would be to compare document names and sizes, but the same content might be stored under a different name, and even the size can vary slightly.
Has anyone come across this problem before? Is C++ the best language for solving it?
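
For illustration, one way to go beyond name-and-size matching is to compare file contents by hash, so that copies saved under different names still match. The sketch below is in Python rather than C++ purely for brevity. It assumes each PC's document folder is reachable as a local or mounted path; the `roots` argument and the function names are hypothetical. It only finds byte-identical files, so copies whose size differs slightly would still need a fuzzier comparison.

```python
import hashlib
import os
from collections import defaultdict


def file_digest(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def find_duplicates(roots):
    """Map each content digest to the sorted paths that share it.

    `roots` is a list of directories to scan (assumed here to be the
    mounted document folders of each PC). File names are ignored.
    """
    # First pass: group by size, since files of different sizes
    # cannot be byte-identical and need not be hashed.
    by_size = defaultdict(list)
    for root in roots:
        for dirpath, _dirs, files in os.walk(root):
            for name in files:
                path = os.path.join(dirpath, name)
                try:
                    by_size[os.path.getsize(path)].append(path)
                except OSError:
                    continue  # skip files that cannot be read

    # Second pass: hash only the files that share a size with another file.
    by_digest = defaultdict(list)
    for paths in by_size.values():
        if len(paths) < 2:
            continue
        for path in paths:
            try:
                by_digest[file_digest(path)].append(path)
            except OSError:
                continue

    # Keep only digests that occur more than once.
    return {d: sorted(p) for d, p in by_digest.items() if len(p) > 1}
```

Calling `find_duplicates` with a list of folders returns a dictionary whose values are the groups of duplicate file locations. Most of the running time in a scan like this is likely to go into reading files over the network rather than into the hashing itself, which may matter when weighing C++ against other languages.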