I'm thinking about writing a small command-line directory management program in Python. The most basic functionality I want is to compare the files in a directory and its sub-directories and, if there are any duplicates, delete the lowermost duplicate.
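A rough sketch of that duplicate search might look like the following. It assumes that "lowermost" means the copy nested deepest in the directory tree, and it only reports candidates rather than deleting anything. The function name find_deeper_duplicates is made up for illustration. It uses the standard library's filecmp for a byte-by-byte comparison, which is a simpler stand-in for hashing.

```python
import filecmp
import os
from collections import defaultdict


def find_deeper_duplicates(root):
    """Return paths of duplicate files, keeping the copy closest to root.

    Assumption: the "lowermost" duplicate is the one nested deepest.
    """
    # Group by size first: files of different sizes cannot be identical.
    by_size = defaultdict(list)
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            if os.path.isfile(path) and not os.path.islink(path):
                by_size[os.path.getsize(path)].append(path)

    to_delete = []
    for paths in by_size.values():
        # Shallowest paths first, so the first copy seen is the one kept.
        paths.sort(key=lambda p: (p.count(os.sep), p))
        kept = []
        for path in paths:
            # shallow=False forces a comparison of the file contents.
            if any(filecmp.cmp(path, k, shallow=False) for k in kept):
                to_delete.append(path)
            else:
                kept.append(path)
    return to_delete
```

Printing the returned list and reviewing it before calling os.remove on anything is probably the safer workflow while the tool is still being developed.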

I already have a basic idea of how to accomplish that: hash both files and compare the hash values. If the hash values are equal, the files are treated as identical. However, I'm not sure about the specifics. I know the data has to be read from each file and then hashed to do the comparison, but what is the best method for reading the data? I'll be dealing with large files, possibly upwards of 4GB, so the program needs to be able to handle this.
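One common way to handle files of that size is to read them in fixed-size chunks and feed each chunk to the hash object, so the whole file is never held in memory at once. The sketch below assumes SHA-256 from the standard hashlib module and a 1 MiB chunk size. Both are illustrative choices, and file_digest is a made-up helper name.

```python
import hashlib


def file_digest(path, chunk_size=1024 * 1024):
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        # iter() with a sentinel calls read() until it returns b"" at EOF.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()
```

Two files would then be considered duplicates when file_digest(a) == file_digest(b). Comparing file sizes first avoids hashing files that cannot match, and a byte-by-byte check before deleting anything guards against the very unlikely case of a hash collision.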

What are some other good tools to include? I'm thinking about writing some file renaming tools as well.

All 3 Replies

SCons uses the same method of comparing file hash values to determine whether files have been modified. Perhaps you could have a look at the SCons source code.

Thank you for the response. I've located the SCons website and bookmarked it. I'll have a look later. Can you think of anything in particular you'd want in a directory and file management program?

You should also look at the Midnight Commander family. I'm using GNOME Commander on Linux, but there are similar programs for all OSes. I think the ancestor was a DOS program called Norton Commander.
