I'm still a newbie to Python, so bear with me if I have gone down the wrong route or my explanation is confusing!

Any ideas on how I can sort my list by the size of the file?

At the moment it gets the size of each file and joins it with the name into a string.

My problem: I can sort either the names or the sizes before I add them to the string, but once they are sorted I can't work out how to match each file to its size.

import os
folder = '.'
f_size = list()
for (path, dirs, files) in os.walk(folder):
  for file in files:
    filename = os.path.join(path, file)
    filesize = str(os.path.getsize(filename))
    f_size.append('[' + filename + ' size = ' + filesize + ']')
# this sorts the joined strings, so the order follows the name, not the size
f_size.sort()
print(f_size)

My list looks like this so far.
Please don't be fooled into thinking it's already in order; I just happened to pull four list items that were in order.
'[(.\\) of test1.txt) size = 63]',
'[(.\\5) of test1.txt) size = 63]',
'[(.\\6) of test1.txt) size = 63]',
'[(.\\Copy of test2.txt) size = 65]'
Any help on where I am going wrong or what I can do to solve this problem would be greatly appreciated.

Thanks for your time, guys and girls.

All 4 Replies

Here is a good way to do this. The key points are

  • use generators
  • use list comprehensions
  • use sorted() and its 'key' argument
# python 2 and 3
import os

# write a generator to decouple the code which creates the data
# from the code which uses it

def gen_triples(folder):
    for (path, dirs, files) in os.walk(folder):
        for file in files:
            name = os.path.join(path, file)
            size = os.path.getsize(name)
            repr = "[{n} size = {s}]".format(n = name, s = size)
            yield name, size, repr # we yield a tuple with 3 values
            
def by_name(triple):
    return triple[0]

def by_size(triple):
    return triple[1]

folder = "."

# use the sorted() builtin function to sort the data

name_sorted = [ repr for (name, size, repr) in sorted(gen_triples(folder), key = by_name)]
size_sorted = [ repr for (name, size, repr) in sorted(gen_triples(folder), key = by_size)]

from pprint import pprint
print("BY NAME")
pprint(name_sorted)
print("BY SIZE")
pprint(size_sorted)
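
The same idea can be sketched so that the folder is walked only once and the collected pairs are then sorted as many times as needed. This is only a sketch: the helper names collect_files and sort_pairs are hypothetical, and operator.itemgetter stands in for the by_name and by_size functions above.

```python
import os
from operator import itemgetter

def collect_files(folder):
    """Return a list of (name, size) pairs for every file under folder."""
    pairs = []
    for path, dirs, files in os.walk(folder):
        for file in files:
            name = os.path.join(path, file)
            pairs.append((name, os.path.getsize(name)))
    return pairs

def sort_pairs(pairs, field):
    """Sort (name, size) pairs by field 0 (the name) or field 1 (the size)."""
    # itemgetter(field) builds the same kind of key function as by_name/by_size
    return sorted(pairs, key=itemgetter(field))
```

Calling pairs = collect_files(".") and then sort_pairs(pairs, 1) gives the files ordered by size. Because each name stays in the same tuple as its size, the two never need to be matched up again after sorting.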

I have an idea!
You can put the sizes in a dict!
Let me show you!

# This function sorts file paths by the size of each file, smallest first.
#***********************NOTE******************************************#
# It only returns the sorted list of file paths, not the file objects.
# The files must exist, otherwise os.path.getsize raises OSError.
import os

def SortPathBySize(filepathlist):
    codes = {}
    for fp in filepathlist:
        # key the dict on the path, not the size, so that two files
        # of the same size do not overwrite each other
        codes[fp] = os.path.getsize(fp)
    # sort the paths by the size recorded for each one
    return sorted(codes, key=codes.get)

Once you have pasted in the function, you can sort all your files by size!

import os

directory = "C:/Documents and Settings/user/Desktop"
# os.listdir returns bare names, so join each one with the directory
files = [os.path.join(directory, name) for name in os.listdir(directory)]
# keep regular files only
files = [fp for fp in files if os.path.isfile(fp)]
files = SortPathBySize(files)
for fp in files:
    print("fp: " + fp + " size: " + str(os.path.getsize(fp)))

That sorts by size, smallest first.
If you want to sort by name instead, change

return sorted(codes, key=codes.get)

to

return sorted(codes)

If you want to sort by size with the biggest first, add reverse=True
to the sorted() call.
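
A minimal sketch of biggest-first ordering, assuming the sizes have already been collected as (path, size) tuples. The helper name largest_first is hypothetical. Negating the size inside the key gives descending order for files of any size, and the path breaks ties so that equal-sized files come out alphabetically.

```python
def largest_first(sized_paths):
    """Take a list of (path, size) tuples and return the paths, biggest first."""
    # -size sorts descending by size; path is the tie-breaker (ascending)
    ordered = sorted(sized_paths, key=lambda pair: (-pair[1], pair[0]))
    return [path for path, size in ordered]
```

If the tie order does not matter, sorted(sized_paths, key=lambda pair: pair[1], reverse=True) should do the same job.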

I hope this works for you!

I have completed it and it works perfectly. Thanks for all the help, guys.
I have learned a lot about 'def's, which I never knew about!
Here is my final code, in case anyone else has the same problem and needs some help.

#created by dan holding v1.1 (29 nov 2010)
import os
from pprint import pprint
while True:
  try:
    print('how many files to show?')
    s1=int(input(''))
    break
  except ValueError:
    print ("please type a number")
def gen_triples(folder):
  for (path, dirs, files) in os.walk(folder):
    # prune this folder before os.walk descends into it; doing it here
    # rather than inside the file loop also covers folders with no files
    if 'Temporary Internet Files' in dirs:
      dirs.remove('Temporary Internet Files')
    for file in files:
      name = os.path.join(path, file)
      size = os.path.getsize(name)
      repr = "[{n} SIZE = {s}]".format(n = name, s = size)
      yield name, size, repr
def by_name(triple):
  return triple[0]
def by_size(triple):
  return triple[1]
folder = "."
# reverse = True puts the biggest files first
size_sorted = [ repr for (name, size, repr) in sorted(gen_triples(folder), key = by_size, reverse = True)]
pprint(size_sorted[:s1])

Once again, many thanks and praise to Gribouillis for the help with 'defs'.
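
When only the few biggest files are wanted, a possible alternative is heapq.nlargest from the standard library, which keeps just the top entries instead of sorting the whole list. This is a sketch under that assumption; the function name largest_files is hypothetical.

```python
import heapq
import os

def largest_files(folder, count):
    """Return up to count (size, name) pairs from folder, biggest first."""
    def sizes():
        for path, dirs, files in os.walk(folder):
            for file in files:
                name = os.path.join(path, file)
                # size comes first so the tuples compare by size
                yield os.path.getsize(name), name
    return heapq.nlargest(count, sizes())
```

For example, largest_files(".", s1) would return the s1 biggest files under the current folder.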

I would suggest renaming the variable repr to something else, as repr is a built-in Python function and using it as a variable name is confusing.
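
A small sketch of what goes wrong. The function names describe and describe_fixed and the replacement name label are made up for illustration. Once repr is assigned inside a function, the built-in is hidden for the rest of that function.

```python
def describe(value):
    repr = "[{v}]".format(v=value)    # local name now hides the built-in
    try:
        return repr(value)            # this tries to call a string
    except TypeError as err:
        return "TypeError: {e}".format(e=err)

def describe_fixed(value):
    label = "[{v}]".format(v=value)   # hypothetical replacement name
    return label + " " + repr(value)  # the built-in repr() is still usable
```

Here describe(63) returns a TypeError message, because the string stored in repr is not callable, while describe_fixed(63) works as intended.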
