Dear all,

I have two text files: file1 with 42000 rows and 6 columns, and file2 with 18 rows and 1 column. I want to match the entries in file2 against file1, in either column 0 or column 3. If an entry exists in either column, I want to write that line of file1 to a file.

I have the following code, but it's printing just one match and not all the occurrences of that entry.

f1 = open('file1.txt')
f2 = open('file2.txt')
f3 = open('file_out.txt', 'w')

d3 = []
d4 = []

testdict = {}

d5 = []

for line in f1:    
    r1 = line.split()
    testdict[r1[0]] = r1[1:] 
        
for line in f2:   
    r2 = line.split()
    d3.append(r2[0])
    

for k,v in testdict.iteritems():
    if k in d3:
        print k, '\t', d3, '\n'
        f3.write("%s\t%s\n"%(k,v))

I think you are working too hard. How about this code?

with open('file1.txt') as f1: # 42 K lines
  with open('file2.txt') as f2: # 18 lines
    with open('file3.txt','w') as fout:
      # Collect the file2 entries in a set for fast membership tests.
      matches = set()
      for line in f2:
        matches.add(line.strip())
      # Stream file1 and keep every line whose first or third field matches.
      for line in f1:
        c0,c1,c2,rest = line.split(None,3)
        if c0 in matches or c2 in matches:
          fout.write(line)

I get 48 lines matching.
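
As for why your version finds only one match per entry: as far as I can tell, it is because testdict is keyed on the first column, so every later row with the same first field overwrites the earlier one, and the other column is never checked at all. Here is a rough sketch that shows the difference on a few made-up rows (the sample data and the helper name matching_lines are mine, purely for illustration). I am assuming whitespace-separated fields, and that columns=(0, 2) is what you meant, i.e. the first and third fields as in the code above; pass (0, 3) if you really want the fourth field.

```python
def matching_lines(rows, wanted, columns=(0, 2)):
    """Yield every row whose field at any index in `columns` is in `wanted`.

    Rows too short to have a given column are not matched on that column,
    so blank or truncated lines do not raise an error.
    """
    for row in rows:
        fields = row.split()
        if any(i < len(fields) and fields[i] in wanted for i in columns):
            yield row


# Made-up sample rows, for illustration only (not from the real file1).
rows = [
    "a 1 x 9 9 9\n",
    "a 2 y 9 9 9\n",
    "b 3 a 9 9 9\n",
    "c 4 z 9 9 9\n",
]
wanted = {"a"}

# The dict approach: rows sharing a first field collapse into one entry,
# and only the first field is ever compared against `wanted`.
by_key = {}
for row in rows:
    fields = row.split()
    by_key[fields[0]] = fields[1:]
dict_hits = [k for k in by_key if k in wanted]

# The streaming approach: every row is tested on its own.
stream_hits = list(matching_lines(rows, wanted))

print("%d %d" % (len(dict_hits), len(stream_hits)))  # prints "1 3"
```

With real files you would pass the open file1 object as rows and write each yielded line to the output file, the same way the loop above does.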