Hello everybody,

I am Saurav Saha from India. I do not know Python, but I am working with a tool written in Python and I have got stuck. I have many directories, each containing two files: 1. *.txt and 2. *.pdbqt

I wanted a program that goes into every directory, reads the pdbqt file, takes its second line (which holds a number), and gives me the names of the pdbqt files with the lowest values on that line. I got this program from that software's forum:

#! /usr/bin/env python
import sys
import glob

def doit(n):
    file_names = glob.glob('*/*.pdbqt')
    everything = []
    for file_name in file_names:
        file = open(file_name)
        lines = file.readlines()
        file.close()
        line = lines[1]
        result = float(line.split(':')[1].split()[0])
        everything.append([result, file_name])
    everything.sort(lambda x,y: cmp(x[0], y[0]))
    part = everything[:n]
    for p in part:
        print p[1],
    print

if __name__ == '__main__':
    doit(int(sys.argv[1]))

When I used it on a dataset of some 100-200 directories it worked fine, but when I use it on 6000 directories it gives this error:

saurav@ubuntu:~/Desktop/ppk_results/final_done_against_pubchem/done$ ./vina_screen_get_top.py 10
Traceback (most recent call last):
  File "./vina_screen_get_top.py", line 22, in <module>
    doit(int(sys.argv[1]))
  File "./vina_screen_get_top.py", line 12, in doit
    line = lines[1]
IndexError: list index out of range


Please help

Hello,
The error means that at least one of your files has fewer than 2 lines, so lines[1] does not exist. Here is a modified version which prints the names of those files and skips them:

#! /usr/bin/env python
import sys
import glob

def doit(n):
    file_names = glob.glob('*/*.pdbqt')
    everything = []
    for file_name in file_names:
        file = open(file_name)
        lines = file.readlines()
        file.close()
        try:
            line = lines[1]
        except IndexError:
            print "file %s has only %d lines" % (file_name, len(lines))
        else:
            result = float(line.split(':')[1].split()[0])
            everything.append([result, file_name])
    everything.sort(lambda x,y: cmp(x[0], y[0]))
    part = everything[:n]
    for p in part:
        print p[1],
    print

if __name__ == '__main__':
    doit(int(sys.argv[1]))

Please use code tags when posting Python code; see http://www.daniweb.com/forums/announcement.php?f=8&announcementid=3 .
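
If you ever move to Python 3, the script above will not run unchanged, because the print statement and the cmp-style sort are gone there. Below is a rough sketch of the same idea for Python 3, not a drop-in replacement for the tool's own script. The function names read_score and top_files are made up for this example, and it assumes the second line looks like "SOME LABEL: number ...", which is what the split(':') in the original code implies. Files that are too short, or whose second line cannot be parsed, are collected separately instead of stopping the run.

```python
def read_score(path):
    """Return the number after ':' on the second line, or None if unreadable.

    Assumption: the second line has the form 'LABEL: <number> ...'.
    """
    with open(path) as handle:
        lines = handle.readlines()
    if len(lines) < 2:
        # Empty or one-line file: this is the case that raised IndexError.
        return None
    try:
        return float(lines[1].split(':')[1].split()[0])
    except (IndexError, ValueError):
        # No ':' on the line, nothing after it, or not a number.
        return None


def top_files(paths, n):
    """Return (best, bad): the n lowest-scoring paths and the unreadable ones."""
    scored, bad = [], []
    for path in paths:
        score = read_score(path)
        if score is None:
            bad.append(path)
        else:
            scored.append((score, path))
    # Sort by score only, lowest first (replaces the cmp-based sort).
    scored.sort(key=lambda pair: pair[0])
    return [path for _, path in scored[:n]], bad
```

To use it the same way as before, you could call top_files(glob.glob('*/*.pdbqt'), 10) after import glob, print the first list, and look at the second list to see which of the 6000 directories hold incomplete files.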
