Oct 29th, 2009

Find largest file in directory

I'm working on a cleanup script for TV shows that I download. Right now I'm just looking for a file larger than 50 MB, but there should be a better way.

    import os
    import shutil

    # Raw string, so the backslashes are not read as escape sequences
    tv_dir = r"C:\Users\Bobe\Downloads\TV"

    for folder in os.listdir(tv_dir):
        folder_path = os.path.join(tv_dir, folder)
        if os.path.isdir(folder_path):
            for name in os.listdir(folder_path):
                filelocation = os.path.join(folder_path, name)
                if os.path.getsize(filelocation) > 50000000:
                    # Keep the big file, renamed after its folder
                    shutil.move(filelocation, folder_path + ".avi")
                else:
                    os.remove(filelocation)

            shutil.rmtree(folder_path)
Last edited by keyoh; Oct 29th, 2009 at 3:38 pm.
Posted by: keyoh

Oct 31st, 2009
Re: Find largest file in directory
What is your problem with the code you have?
Posted by: vegaseat (Moderator)

Oct 31st, 2009
Re: Find largest file in directory
I feel that looking for a file larger than 50 MB is kind of a hack fix. Is there a simple way to return the filename of the largest file in a directory?
Posted by: keyoh

Oct 31st, 2009
Re: Find largest file in directory
OK, check this out. The code you're using to find files is pretty good. If you're looking to improve that part of the code you could use recursion so that your function will descend into subdirectories and pull out those files too. Here is some code I whipped up that will print the number of files in all the subdirectories under a folder.
    import os

    # Windows and Linux path separators differ; os.sep is the correct
    # one for whatever system the script runs on.
    systemslash = os.sep

    def get_list_of_files(inDirectory, container=None):
        # Create the list here: a default of [] would be shared between calls
        if container is None:
            container = []
        for entry in os.listdir(inDirectory):
            if os.path.isdir(inDirectory+systemslash+entry):
                get_list_of_files(inDirectory+systemslash+entry, container)
            else:
                container.append(inDirectory+systemslash+entry)
        return container

    Final_List_of_Files = get_list_of_files('/Users/kevin')
    print(len(Final_List_of_Files))

Now you want to get the list of files and the file size, so I would suggest putting them into a list of tuples which you can sort to get the biggest file of them all. Change the line where we add the file name to the list so that it adds a tuple containing the file name and size.
    filesize = os.path.getsize(inDirectory+systemslash+entry)
    fileandsize = (filesize, inDirectory+systemslash+entry)
    container.append(fileandsize)
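
Putting the size first in the tuple matters: tuples compare element by element, so a plain sort orders the entries by size and only looks at the path when two sizes are equal. A small sketch with made-up paths and sizes (none of these files come from the thread):

```python
# Hypothetical (size, path) entries, in the same shape the function collects
entries = [
    (1200, "/tmp/demo/notes.txt"),
    (734003200, "/tmp/demo/episode.avi"),
    (52000, "/tmp/demo/sample.avi"),
]

# Tuples compare by their first element first, so this sorts by size
entries.sort(reverse=True)

largest_size, largest_path = entries[0]
print(largest_path)
```

If the path came first instead, the same sort would order the list alphabetically and the size would be ignored.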

Then your last task is to sort the list of tuples with Final_List_of_Files.sort(). You'll have to reverse the sort order so that you get the largest file in the top position. Here is the final code:
    import os

    # Windows and Linux path separators differ; os.sep is the correct
    # one for whatever system the script runs on.
    systemslash = os.sep

    def get_list_of_files(inDirectory, container=None):
        # Create the list here: a default of [] would be shared between calls
        if container is None:
            container = []
        for entry in os.listdir(inDirectory):
            entry = inDirectory+systemslash+entry
            if os.path.isdir(entry):
                get_list_of_files(entry, container)
            else:
                # Only files go into the list, as (size, path) tuples
                filesize = os.path.getsize(entry)
                fileandsize = (filesize, entry)
                container.append(fileandsize)
        return container

    Final_List_of_Files = get_list_of_files('/Users/kevin/Documents')
    Final_List_of_Files.sort(reverse=True)

    print(Final_List_of_Files[0])
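
For what it's worth, os.walk in the standard library already does the descending into subdirectories, so the hand-written recursion can probably be dropped. A minimal sketch, assuming every file under the folder can be read; the helper name files_with_sizes and the demo files are made up for this example:

```python
import os
import tempfile

def files_with_sizes(top):
    """Collect (size, path) tuples for every file under top (hypothetical helper)."""
    found = []
    for dirpath, dirnames, filenames in os.walk(top):
        for name in filenames:
            path = os.path.join(dirpath, name)
            found.append((os.path.getsize(path), path))
    return found

# Demo on a throwaway directory so the sketch runs anywhere
demo = tempfile.mkdtemp()
os.mkdir(os.path.join(demo, "sub"))
for rel, nbytes in (("small.txt", 10), (os.path.join("sub", "big.avi"), 5000)):
    with open(os.path.join(demo, rel), "wb") as fh:
        fh.write(b"x" * nbytes)

entries = files_with_sizes(demo)
# max() on (size, path) tuples picks the biggest size without sorting the list
largest_size, largest_path = max(entries)
print(largest_path)
```

max() raises ValueError on an empty list, so a real script would want to check that entries is not empty first.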
Posted by: mn_kthompson

Oct 31st, 2009
Re: Find largest file in directory
Well, you could do something like this:
    # File_lister2.py
    # create a list of all the files and sizes in a given directory
    # and optionally any of its subdirectories (Python2 & Python3)
    # snee

    import os

    def file_lister(directory, subs=False):
        """
        returns a list of (size, full_name) tuples of all files
        in a given directory
        if subs=True also any of its subdirectories
        """
        mylist = []
        for fname in os.listdir(directory):
            # add directory to filename for a full pathname
            full_name = os.path.join(directory, fname)
            if not os.path.isdir(full_name):
                # size in kb
                size = os.path.getsize(full_name)//1024 + 1
                # append a (size, full_name) tuple
                mylist.append((size, full_name))
            elif subs:
                # optionally recurse into subdirs and keep what they return
                mylist.extend(file_lister(full_name, subs))
        return mylist

    #dir_name = r"C:\Python31\Tools" # Windows
    dir_name = "/home/dell/Downloads" # Linux
    file_list = file_lister(dir_name)

    # show the list sorted by size
    for file_info in sorted(file_list, reverse=True):
        print(file_info)

    print('-'*66)

    largest = max(file_list)
    print("The largest file is: \n%s (%skb)" % (largest[1], largest[0]))

    """a typical partial output -->
    (24144, '/home/dell/Downloads/ActivePython-2.6.2.2-linux-x86.tar.gz')
    (23320, '/home/dell/Downloads/ActivePython-3.1.0.1-linux-x86.tar.gz')
    (9288, '/home/dell/Downloads/Python-3.1.tar.bz2')
    ...
    ...
    ------------------------------------------------------------------
    The largest file is:
    /home/dell/Downloads/ActivePython-2.6.2.2-linux-x86.tar.gz (24144kb)
    """
Posted by: sneekula

Nov 1st, 2009
Re: Find largest file in directory
Arrr, too much code noise! ;-)

If you know you have at least one file:
    import os, glob
    largest = sorted( (os.path.getsize(s), s) for s in glob.glob('yourdir/*.avi') )[-1][1]

If not, you split the code a bit:
    import os, glob
    files = glob.glob('yourdir/*.avi')
    largest = sorted((os.path.getsize(s), s) for s in files)[-1][1] if files else ''
    if largest:
        ... # do something with it
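
Since only the single largest file is wanted, sorting the whole list is arguably more work than needed: max() with key=os.path.getsize returns the same file in one pass. A rough sketch of that variant; the function name largest_match is invented here, and 'yourdir/*.avi' is the same placeholder pattern as above:

```python
import glob
import os

def largest_match(pattern):
    """Return the largest file matching pattern, or '' when nothing matches."""
    # Skip directories that happen to match the pattern
    files = [p for p in glob.glob(pattern) if os.path.isfile(p)]
    # max() raises ValueError on an empty list, hence the guard
    return max(files, key=os.path.getsize) if files else ''

largest = largest_match('yourdir/*.avi')
if largest:
    print(largest)
```

The empty-string fallback keeps the same "if largest:" check as the code above.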
Last edited by pythopian; Nov 1st, 2009 at 7:37 pm.
Posted by: pythopian

This thread is solved
