Hey guys and girls, I hope one of you can help me with this.

My plan of action is to create a Python script that will:

sort the files in a directory into a list with the most recently modified file at the top,

then get rid of all the duplicate files with the same date,

then keep the top 6 files and delete the rest.

I have worked out how to build the list and sort it,

but when I put it into a set it still shows all the files. How do I tell the set that it needs to work by the date, not the name?

# sorts files by modified date,
# pulls 6 most recent files,
# and delete all others.
#created by dan holding v0.1 (03 Aug 2010)
import os, glob, time, sets
root = '.'
date_file_list = []
date_file_set = []
for folder in glob.glob(root):
    print "folder =", folder
    for file in glob.glob(folder + '\*.txt*'):
        # select the type of file, for instance *.bat or all files *.*
        stats = os.stat(file)
        lastmod_date = time.localtime(stats[8])
        date_file_tuple =lastmod_date, file
        date_file_list.append(date_file_tuple)
        date_file_list.sort(reverse = True)     
        # latest modified date now first
        date_file_set.append(date_file_list)
print (len(date_file_list))
for file in date_file_list:
    folder, file_name = os.path.split(file[1])
    file_date = time.strftime("%m/%d/%y %H:%M:%S", file[0])
    # convert date tuple to MM/DD/YY HH:MM:SS format
    print "%-40s %s" % (file_name, file_date)

All 2 Replies

File dates are best sorted by their seconds-since-epoch value. Eliminating exact duplicates (same time and same path/name) does not make much sense, since they shouldn't even exist ...

# sorts files by modified date,
# pulls 6 most recent files,
# and delete all others.
#created by dan holding v0.1 (03 Aug 2010)

import os, glob, time

root = '.'

date_file_list = []
for folder in glob.glob(root):
    print("folder = %s" % folder)
    # select the type of file, for instance *.bat or all files *.*
    # os.path.join builds the pattern with the right separator on any OS
    for file in glob.glob(os.path.join(folder, '*.txt*')):
        stats = os.stat(file)
        # seconds since epoch starting midnight of 1/1/1970
        lastmod_seconds = stats.st_mtime
        # use seconds first to sort the time
        date_file_tuple = lastmod_seconds, file
        date_file_list.append(date_file_tuple)
# done once, after the loop, so the set exists even if no file matched
# create a set to eliminate duplicate entries
# with matching seconds and path/name
date_file_set = set(date_file_list)
# convert set back to a list
date_file_list = list(date_file_set)
# latest modified date (in seconds) now first
date_file_list.sort(reverse = True)

# test
print (len(date_file_list))
print (len(date_file_set))
print('-'*70)

# show six most recent files
for lastmod_seconds, path in date_file_list[:6]:
    folder, file_name = os.path.split(path)
    # convert seconds to time tuple
    lastmod_tuple = time.localtime(lastmod_seconds)
    # convert time tuple to MM/DD/YY HH:MM:SS format
    file_date = time.strftime("%m/%d/%y %H:%M:%S", lastmod_tuple)
    print("%-40s %s" % (file_name, file_date))
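
On the original question of making the set work by the date rather than the name: a set compares the whole (time, path) tuple, so two entries only collapse when both fields match. One way around that is to keep a set of dates only and check each file against it. What follows is just a minimal sketch, assuming "same date" means the same calendar day in local time and that the newest file of each day is the one to keep. The function names newest_per_date and prune are made up for illustration, the code is written for Python 3, and dry_run defaults to True so nothing gets deleted until you switch it off.

```python
import glob
import os
import time


def newest_per_date(paths):
    """Return (mtime, path) pairs, newest first, one file per calendar day."""
    # Sort newest first so the first file seen for a day is the one to keep.
    dated = sorted(((os.path.getmtime(p), p) for p in paths), reverse=True)
    seen_days = set()
    unique = []
    for mtime, path in dated:
        # The set holds only the day key, not the (time, name) pair,
        # which is what makes it "work by the date, not the name".
        day = time.strftime("%Y-%m-%d", time.localtime(mtime))
        if day not in seen_days:
            seen_days.add(day)
            unique.append((mtime, path))
    return unique


def prune(folder, pattern="*.txt*", keep=6, dry_run=True):
    """Keep the `keep` newest files (one per day); delete or just list the rest."""
    paths = glob.glob(os.path.join(folder, pattern))
    kept = [path for _, path in newest_per_date(paths)[:keep]]
    # Assumption: everything not kept is deleted, including same-day duplicates.
    doomed = sorted(set(paths) - set(kept))
    if not dry_run:
        for path in doomed:
            os.remove(path)
    return kept, doomed
```

Calling prune('.') returns the list of files that would be kept and the list that would be removed, without touching anything. Once the two lists look right, prune('.', dry_run=False) does the actual deleting.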

Thank you so much. I have been banging my head for the last two weeks looking for snippets of code for this project and had not really made any progress, but within two hours of posting this it's all sorted :)
