
newbie needs help! any ideas on how to sort this list???

I'm still a newbie to Python, so bear with me if I have gone down the wrong route or my explanation is confusing!

Any ideas on how I can sort my list by the size of the file?

At the moment it gets the size of each file and joins it with the name into a string.

My problem: I can sort either the names or the sizes before I add them to the string, but once they are sorted I can't work out how to match each file to its size.

import os
folder = ('.')
filename=list()
f_size=list()
fsize=list()
for (path, dirs, files) in os.walk(folder):
  for dirs in path:
    for file in files:
      filename.append(os.path.join(path, file))
    for I in filename:
      filesize=str(os.path.getsize(I))
      f_size.append(os.path.join('['+I+' size = '+filesize+']'))
      f_size.sort()
    break
print (f_size)

My list looks like this so far.
Please don't be fooled into thinking it's already in order; I just happened to pull 4 list items that were in order:
'[(.\\) of test1.txt) size = 63]',
'[(.\\5) of test1.txt) size = 63]',
'[(.\\6) of test1.txt) size = 63]',
'[(.\\Copy of test2.txt) size = 65]'
Any help on where I am going wrong or what I can do to solve this problem would be greatly appreciated.

Thanks for your time, guys and girls

danholding
Junior Poster in Training
56 posts since Aug 2010
Reputation Points: 15
Solved Threads: 1
 

Here is a good way to do this. The key points are:
- use generators
- use list comprehensions
- use sorted() and its 'key' argument
# python 2 and 3
import os

# write a generator to decouple the code which creates the data
# from the code which uses it

def gen_triples(folder):
    for (path, dirs, files) in os.walk(folder):
        for file in files:
            name = os.path.join(path, file)
            size = os.path.getsize(name)
            repr = "[{n} size = {s}]".format(n = name, s = size)
            yield name, size, repr # we yield a tuple with 3 values

def by_name(triple):
    return triple[0]

def by_size(triple):
    return triple[1]

folder = "."

# use the sorted() builtin function to sort the data

name_sorted = [ repr for (name, size, repr) in sorted(gen_triples(folder), key = by_name)]
size_sorted = [ repr for (name, size, repr) in sorted(gen_triples(folder), key = by_size)]

from pprint import pprint
print("BY NAME")
pprint(name_sorted)
print("BY SIZE")
pprint(size_sorted)
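
To see what the 'key' argument does without touching the disk, here is a minimal sketch on made-up data (the file names and sizes are hypothetical, not taken from this thread). operator.itemgetter is a standard-library shortcut that does the same job as by_name and by_size:

```python
from operator import itemgetter

# Hypothetical (name, size) pairs standing in for real files.
pairs = [("b.txt", 65), ("a.txt", 700), ("c.txt", 9)]

# itemgetter(1) builds a function that returns item 1 of its argument,
# which is what by_size does for the triples above.
size_order = sorted(pairs, key=itemgetter(1))
name_order = sorted(pairs, key=itemgetter(0))

print(size_order)
print(name_order)
```

Because each name and its size travel together in one tuple, sorting can never separate them, which was the original problem.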

Gribouillis
Posting Maven
Moderator
2,786 posts since Jul 2008
Reputation Points: 1,044
Solved Threads: 691
 

I have an idea!
You can add the data to a dict!
Let me show you!

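# This function sorts file paths by file size, smallest first.
#***********************NOTE******************************************#
# It returns only the sorted list of file paths, not the file objects.
# The files must exist, otherwise IOError (OSError in Python 3) is raised.
def SortPathBySize(filepathlist):
    codes = {}
    for fp in filepathlist:
        # binary mode, so the length is the size in bytes
        file_r = open(fp, "rb")
        data = file_r.read()
        file_r.close()
        size = len(data)  # keep the size as a number so it sorts numerically
        codes[size] = fp  # caveat: files of equal size overwrite each other
    new_list = []
    # a dict has no useful order of its own, so sort the keys explicitly
    for d in sorted(codes):
        new_list.append(codes[d])
    return new_list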


Once you have pasted the function code, you can sort all files by size:

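directory = "C:/Documents and Settings/user/Desktop"
import os
# os.listdir (not os.path.listdir) returns bare names, so join them with
# the directory and keep only regular files
files = [os.path.join(directory, name) for name in os.listdir(directory)]
files = [fp for fp in files if os.path.isfile(fp)]
files = SortPathBySize(files)
for fp in files:
    size = os.path.getsize(fp)
    print("fp: " + fp + " size: " + str(size))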


That will sort by size, with the smallest first.
If you want to sort by name, you can change

codes[size]=fp

to

codes[fp]=size

If you want to sort by size with the biggest first, you must
store 2000000000 - size as the key instead of size.
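
As an alternative to the subtraction trick, sorted() and list.sort() both accept reverse=True. A small sketch, assuming a hypothetical list of (size, path) tuples rather than the codes dict:

```python
# Hypothetical (size, path) tuples; real code would fill these in
# with os.path.getsize().
sized = [(63, "test1.txt"), (65, "test2.txt"), (12, "notes.txt")]

# reverse=True puts the biggest first, with no arithmetic on the sizes.
biggest_first = [path for size, path in sorted(sized, reverse=True)]
print(biggest_first)
```

Tuples also keep files that happen to share a size, which a dict keyed by size cannot do.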

I hope you succeed!

emorjon2
Newbie Poster
23 posts since Sep 2010
Reputation Points: 10
Solved Threads: 1
 

I have completed it and it works perfectly. Thanks for all the help, guys!
I have learned a lot about 'def's, which I never knew about!
Here is my final code in case anyone else has the same problem and needs some help:

#created by dan holding v1.1 (29 nov 2010)
import os,pprint
while True:
  try:
    print('how many files to show?')
    s1=int(input(''))
    break
  except ValueError:
    print ("please type a number")
def gen_triples(folder):
  for (path, dirs, files) in os.walk(folder):
    # prune before looping over files, so the folder is skipped even when
    # the current directory contains no files of its own
    if 'Temporary Internet Files' in dirs:
      dirs.remove('Temporary Internet Files')
    for file in files:
      name = os.path.join(path, file)
      size = os.path.getsize(name)
      repr = "[{n} SIZE = {s}]".format(n = name, s = size)
      yield name, size, repr
def by_name(triple):
  return triple[0]
def by_size(triple):
  return triple[1]
folder = "."
size_sorted = [ repr for (name, size, repr) in sorted(gen_triples(folder), key = by_size)]
from pprint import pprint
size_sorted.reverse()
pprint(size_sorted[:s1])
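
If only the biggest few files are wanted, heapq.nlargest from the standard library is another option. This is a swapped-in technique, not what the code above does: it picks out the top n without sorting and reversing the whole list. A sketch on hypothetical triples shaped like the ones gen_triples yields (top_n is a made-up helper name):

```python
import heapq

# Hypothetical (name, size, label) triples, shaped like gen_triples() output.
triples = [
    ("a.txt", 63, "[a.txt SIZE = 63]"),
    ("b.txt", 900, "[b.txt SIZE = 900]"),
    ("c.txt", 65, "[c.txt SIZE = 65]"),
]

def top_n(triples, n):
    # nlargest returns the n items with the biggest key, biggest first
    biggest = heapq.nlargest(n, triples, key=lambda t: t[1])
    return [label for (name, size, label) in biggest]

print(top_n(triples, 2))
```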

once again many thanks and praise to Gribouillis for the help with 'defs'

danholding
Junior Poster in Training
56 posts since Aug 2010
Reputation Points: 15
Solved Threads: 1
 

I would suggest renaming the variable repr to something else, as repr is a built-in Python function and using it as a variable name is confusing.
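
For example, a minimal sketch of the rename, using the hypothetical name label (any name that isn't a builtin would do; describe is also just an illustrative helper):

```python
def describe(name, size):
    # "label" instead of "repr", so the builtin stays usable
    label = "[{n} SIZE = {s}]".format(n=name, s=size)
    # repr() still works here because nothing has shadowed it
    return label, repr(size)

print(describe("test1.txt", 63))
```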

pyTony
pyMod
Moderator
5,359 posts since Apr 2010
Reputation Points: 782
Solved Threads: 852
 

This question has already been solved
