I am trying to write a script that traverses a directory and its subdirectories and counts the files in each size range, for example 0 KB-1 KB: 3, 1 KB-4 KB: 4, 4 KB-16 KB: 4, 16 KB-64 KB: 11, and so on in multiples of 4. I can already list the files, show sizes in human-readable format, and count the files in a given size range, but I feel my code is very messy and nowhere near standard. I need help cleaning it up.

`import os
suffixes = ['B', 'KB', 'MB', 'GB', 'TB', 'PB']
route = raw_input('Enter a location')

def human_Readable(nbytes):
        if nbytes == 0: return '0 B'
        i = 0
        while nbytes >= 1024 and i < len(suffixes)-1:
                nbytes /= 1024.
                i += 1
        f = ('%.2f' % nbytes).rstrip('0').rstrip('.')
        return '%s %s' % (f, suffixes[i])

def file_Dist(path, start,end):
        counter = 0
        counter2 = 0
        for path, subdir, files in os.walk(path):
                for r in files:
                        if os.path.getsize(os.path.join(path,r)) > start and os.path.getsize(os.path.join(path,r)) < end:
                                counter += 1
        print "Number of files greater than %s less than %s:" %(human_Readable(start), human_Readable(end)),  counter
file_Dist(route, 0, 1024)
file_Dist(route, 4096, 16383)
file_Dist(route, 16384, 65535)
file_Dist(route, 65536, 262143)
file_Dist(route, 262144, 1048576)
file_Dist(route, 1048577, 4194304)
file_Dist(route, 4194305, 16777216)`

Since each range is 4 times the previous one, you should be able to divide the size by 1024 and work out the range from the result. But to keep the form you posted, you first want to traverse the directories only once instead of every time the function is called, and store the counts in a list. This is more straightforward and flexible IMHO, but you will have to decide whether you like it better.

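import os

def update_list(file_size, sizes_list):
    """ add 1 to the first range whose upper limit is above file_size
        and return from the function as soon as it is found
    """
    for ctr in range(len(sizes_list)):
        if file_size < sizes_list[ctr][0]:
            sizes_list[ctr][1] += 1
            return sizes_list
    ## larger than largest test, so the file is not counted
    return sizes_list

def file_Dist(path, sizes_list):
    ## walk the tree once only and count every file
    for dirpath, subdirs, files in os.walk(path):
        for r in files:
            this_size = os.path.getsize(os.path.join(dirpath, r))
            sizes_list = update_list(this_size, sizes_list)
    ## all processing complete
    previous = 0
    for size, ctr in sizes_list:
        print("%d to %d = %d" % (previous, size-1, ctr))
        previous = size

## upper limits in bytes: 1 KB, 4 KB, 16 KB ... each 4 times the previous
sizes_list = []
num = 1
for ctr in range(8):
    sizes_list.append([num*1024, 0])
    num *= 4
print(sizes_list)

## route is the directory name read with raw_input() in your script
file_Dist(route, sizes_list)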
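
As a rough sketch of the divide-by-1024 idea mentioned above: because every range is 4 times the previous one, the range index can be computed from the size directly instead of testing against a list of limits. The names `bucket_index`, `bucket_bounds`, `size_histogram` and `report` are made up for this example, and it assumes the ranges are 0 to 1 KB, 1 KB to 4 KB, 4 KB to 16 KB and so on, with the lower limit inclusive. It should run on Python 2.7 and 3, but treat it as a starting point rather than a finished tool.

```python
import os
from collections import Counter

def bucket_index(nbytes):
    """Return 0 for sizes below 1 KB, 1 for 1 KB up to 4 KB,
    2 for 4 KB up to 16 KB, and so on."""
    kb = nbytes // 1024
    # bit_length() is 0 for kb == 0, otherwise floor(log2(kb)) + 1.
    # Each factor of 4 is two bits, hence the division by 2.
    return (kb.bit_length() + 1) // 2

def bucket_bounds(index):
    """Return (low, high) in bytes for a bucket; high is exclusive."""
    if index == 0:
        return 0, 1024
    return 1024 * 4 ** (index - 1), 1024 * 4 ** index

def size_histogram(root):
    """Walk root once and count the files in each bucket."""
    counts = Counter()
    for dirpath, dirnames, filenames in os.walk(root):
        for name in filenames:
            full = os.path.join(dirpath, name)
            try:
                counts[bucket_index(os.path.getsize(full))] += 1
            except OSError:
                # unreadable or vanished file (e.g. broken symlink): skip it
                continue
    return counts

def report(root):
    """Print one line per bucket that contains at least one file."""
    for index, count in sorted(size_histogram(root).items()):
        low, high = bucket_bounds(index)
        print("%d to %d bytes: %d" % (low, high - 1, count))
```

Calling `report(route)` with the directory from your script prints only the ranges that actually contain files, and there is no upper limit to maintain: a larger file simply lands in a higher bucket.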