I am comparing 2000 files with one other file. I want the program to go through each line in both files and compare them. If a line is present in both, it has to be written to another file. What I tried was to open both files and use readlines() to read them into lists. Then I used a for loop like this:
import os
import string
from itertools import izip

chain_sep = []
complex_file = open("1complex.txt", "r")
complex_lines = complex_file.readlines()
complex_lines = map(string.strip, complex_lines)
splitter = [s.split('\t') for s in complex_lines]
complex_file.close()

for file in os.listdir("."):
    basename = os.path.basename(file)
    if basename.endswith(".pd"):
        chain_sep.append(basename)

for (i, s) in izip(chain_sep, splitter):
    fhandle_6 = open(i, "r")
    from_pd = fhandle_6.readlines()
    from_pd = map(string.strip, from_pd)
    fhandle_6.close()
    fhandle_13 = open(s + ".cr", 'r')
    fhandle_13_l = fhandle_13.readlines()
    fhandle_13_l = map(string.strip, fhandle_13_l)
    fhandle_13.close()
    fopen_7 = open(i + "r.pdb", "w")
    fopen_8 = open(i + "l.pdb", "w")
    for (a, y) in izip(from_pd, fhandle_13_l):  # from_pd and fhandle_13_l is not of the same length :(
        if a[0:4] == "ATOM":
            if a == "R":
                print >>fopen_7, a
            else:
                if a[7:13] == y[7:13]:
                    print >>fopen_8, a
    fopen_7.close()
    fopen_8.close()
The above code is only a chunk, by the way. My problem is that the two files are not the same size, so I feel that using zip or izip is not ideal in this situation. Part of the files I have to deal with is below:
file-1
ATOM 2197 [b]CB CYS I 51[/b] 38.091 -13.002 6.320 1.00 20.12
ATOM 2198 [b]SG CYS I 51[/b] 39.781 -12.827 5.691 1.00 26.67
ATOM 2199 [b]N MET I 52[/b] 37.845 -15.766 5.722 1.00 33.08
ATOM 2200 [b]CA MET I 52[/b] 38.312 -17.144 5.674 1.00 33.08

file-2
ATOM 2197 [b]O ASP L 50[/b] 18.653 89.329 84.802 1.00 0.00
ATOM 2198 [b]CB ASP L 50[/b] 16.004 87.278 84.523 1.00 0.00
ATOM 2199 [b]CG ASP L 50[/b] 15.349 86.109 85.277 1.00 0.00
ATOM 2200 [b]OD1 ASP L 50[/b] 15.347 85.935 86.514 1.00 0.00
The only part that is common to both files is the one in bold (the above is just a chunk of the data). So ideally I need to take the bold data from file 1 and, if it exists in file 2, retain that entry and remove the remaining data.
[b]CB CYS I 51[/b] [b]CB CYS I 51[/b]
If the above entry is there in both files, then I have to retain it in file-2 and remove all other entries. I tried to add the required list position to the sample code you gave me, but I failed to get the results. Please let me know whether I can pick out the above data and, if so, how I can do it. I tried the same in Perl and was able to do it very easily, but in Python it is proving tougher for me as I am very new to Python (learning for the past week or so).
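For reference, here is a minimal sketch of the comparison described above that does not pair the lines up with zip or izip at all, so the two files can have different lengths. It collects the bold part of every ATOM line in file 1 into a set and then keeps only the file-2 lines whose bold part is in that set. The function names (atom_key, filter_common, write_common) are made up for illustration, and the sketch assumes the fields are separated by whitespace as in the sample above, so the bold part is fields 3 to 6 of each line. It is written for Python 3.

```python
def atom_key(line):
    """Return (atom name, residue name, chain, residue number) for an ATOM line.

    Assumes whitespace-separated fields as in the sample data; returns None
    for anything that is not an ATOM line with at least six fields.
    """
    fields = line.split()
    if len(fields) < 6 or fields[0] != "ATOM":
        return None
    return tuple(fields[2:6])


def filter_common(file1_lines, file2_lines):
    """Keep the file-2 ATOM lines whose key also occurs in file 1."""
    keys_in_file1 = set()
    for line in file1_lines:
        key = atom_key(line)
        if key is not None:
            keys_in_file1.add(key)
    # None is never added to the set, so non-ATOM lines are dropped here.
    return [line for line in file2_lines if atom_key(line) in keys_in_file1]


def write_common(path1, path2, out_path):
    """Read both files, filter file 2 against file 1, write the result."""
    with open(path1) as f1, open(path2) as f2:
        kept = filter_common(f1, f2)
    with open(out_path, "w") as out:
        out.writelines(kept)
```

Because the lookup goes through a set, the order and number of lines in each file do not matter. One caveat: in real fixed-column PDB files neighbouring fields can run together, in which case slicing the columns (something like line[12:26], which should span the atom name through the residue number) is probably safer than split().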