I have over 10 text files; each file has exactly 2671 floats, e.g.

1.232124234 #line 1
2.324234323 #line 2
.
.
1.324234234 # line 2671

I would like to add together the floats on each line with the float on the corresponding line of each of the 10 files, e.g. "file 1 line 1 float" + "file 2 line 1 float" +.....+ "file 10 line 1 float". I need to do this for each line in the text files and print the sums to a csv on 2671 lines.

I can achieve this using the following very long code, but I am looking for a more efficient way of achieving the same outcome, as I will not always have the same number of text files each time I do this. It could be over 50 at times.

All of the files are labelled in number order as shown in the code below, which I think will help with this problem.

Here is my long version:

SumFile = open("SumFile.csv", "w")
for s1, s2, s3, s4, s5, s6, s7, s8, s9, s10\
    in zip(open('file1.txt', 'r'), open('file2.txt', 'r'), \
    open('file3.txt', 'r'), open('file4.txt', 'r'), open('file5.txt', 'r'), open('file6.txt', 'r'), \
    open('file7.txt', 'r'), open('file8.txt', 'r'), open('file9.txt', 'r'), open('file10.txt', 'r')):
    Sum = float(s1) + float(s2) + float(s3) + float(s4) + float(s5) + float(s6) +\
    float(s7) + float(s8) + float(s9) + float(s10)
    SumFile.write(('%6.20f,\n') % (Sum))

I would really appreciate your help and suggestions.

Thank you.

Well, you could try this:

def sumFile():
    sums = 0.0
    files = ['a1.txt', 'a2.txt', 'a3.txt', 'a4.txt', 'a5.txt']  # here are your files
    for name in files:
        with open(name) as f:
            for line in f:
                try:
                    sums += float(line)
                except ValueError:
                    pass  # skip lines that are not numbers
    print(sums)
sumFile()

You have only one sum, Lucaci, so I do not think that will work. This version keeps one sum per line:

num_files = 3
base = 'text%i.txt'
with open("SumFile.csv", "w") as sf:
    # zip(*...) groups line k of every file together; each group is summed
    sf.write(',\n'.join(
        '%6.20f' % sum(float(num.strip()) for num in lines)
        for lines in zip(*(open(base % n, 'r').readlines()
                           for n in range(1, num_files + 1)))))

print(open('SumFile.csv').read())
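
In case the one-liner above is hard to follow, here is the same zip(*...) idea spelled out as a rough sketch. The function names sum_lines and write_sums are made up for illustration, and it assumes every file holds one float per line with no blank lines in between.

```python
def sum_lines(paths):
    """Return a list whose item k is the sum of line k+1 across all files."""
    columns = []
    for path in paths:
        with open(path) as f:
            # One list of floats per file; float() tolerates the trailing newline.
            columns.append([float(line) for line in f if line.strip()])
    # zip(*columns) regroups the data so each tuple holds one line number's values.
    return [sum(row) for row in zip(*columns)]


def write_sums(paths, out_path):
    """Write one sum per line, using the same format as the original post."""
    with open(out_path, "w") as out:
        for total in sum_lines(paths):
            out.write("%6.20f,\n" % total)

# Hypothetical usage for files named file1.txt ... file10.txt:
# write_sums(["file%d.txt" % n for n in range(1, 11)], "SumFile.csv")
```

Note that zip stops at the shortest file, so a file with fewer lines than the others would silently shorten the output.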

Thanks very much, Lucaci Andrew. The problem is that I would still need to write out the names of all the text files, and with 50 text files this would take a while.

Any more suggestions would be much appreciated, thanks!

You can do the trick you did before with elems = (line.strip('\n') for line in open(i).readlines()) but applied to your files:

files = (line.strip('\n') for line in open("files.txt").readlines())
for i in files:
    elems = (line.strip('\n') for line in open(i).readlines())
    # rest of the code

I haven't tested it yet, but you can create a file called "files.txt" listing all your custom files, for example:

#files.txt
a1.txt
a2.txt
a3.txt
a4.txt
#etc.

each file on a separate line for convenience, and with each iteration you would open the file named on that line and do your work. It's just a suggestion; try it out.
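
To make the files.txt idea concrete, here is a minimal sketch that keeps one running total per line number instead of a single grand total. The names read_manifest and sum_from_manifest are hypothetical, and it assumes that lines starting with # in files.txt are comments and that the data files contain no blank lines.

```python
def read_manifest(manifest_path):
    """Return the file names listed in the manifest, skipping blanks and # comments."""
    names = []
    with open(manifest_path) as f:
        for line in f:
            name = line.strip()
            if name and not name.startswith("#"):
                names.append(name)
    return names


def sum_from_manifest(manifest_path):
    """Add up the floats line by line across every file named in the manifest."""
    totals = []
    for name in read_manifest(manifest_path):
        with open(name) as f:
            for i, line in enumerate(f):
                if i == len(totals):
                    totals.append(0.0)  # first time this line number is seen
                totals[i] += float(line)
    return totals

# Hypothetical usage: totals = sum_from_manifest("files.txt")
```

Because only one line is held in memory at a time, this should also cope with 50 or more files without much trouble.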

You can try this; glob will find all your file*.txt files.
I use the fileinput module to open all the files.
Then I calculate two sums based on whether the enumerate index is even or odd.

import fileinput
from glob import glob

sum_file = open("SumFile.csv", "w")
total = 0.0
total_1 = 0.0
fnames = glob('file*.txt')
for numb,line in enumerate(fileinput.input(fnames)):
    if numb % 2 == 0:
        total += float(line.strip())
    else:
        total_1 += float(line.strip())
sum_file.write('%6.20f,\n%6.20f,\n' % (total, total_1))
sum_file.close()
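
The even/odd trick above only fits files with two lines each. As a hedged sketch of how the same glob and fileinput approach might be stretched to any number of lines, fileinput.filelineno() gives the line number within the current file, which can index a list of totals. The function name sum_by_line is made up for illustration, and the data files are assumed to have no blank lines.

```python
import fileinput
from glob import glob


def sum_by_line(pattern):
    """Sum the floats on each line number across all files matching pattern."""
    totals = []
    fnames = sorted(glob(pattern))
    if not fnames:
        # With an empty list fileinput would fall back to reading stdin.
        return totals
    try:
        for line in fileinput.input(fnames):
            i = fileinput.filelineno() - 1  # 0-based line number in the current file
            if i == len(totals):
                totals.append(0.0)
            totals[i] += float(line)
    finally:
        fileinput.close()  # reset the module-level state even after an error
    return totals

# Hypothetical usage: totals = sum_by_line('file*.txt')
```

The order in which glob returns the files does not matter here, since addition does not care which file comes first; the sort only makes runs repeatable.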

Would it be easier if "file1.txt", "file2.txt", etc. were .csv files? Would this help with the coding? Thanks

I do not understand how those would differ; you have only one value per line, don't you?

Would it be easier if "file1.txt", "file2.txt", etc. were .csv files? Would this help with the coding? Thanks

No, it would not help; the challenge is the same if you need to sum values from different files.
The only difference is the separator: \n or ,
Look at this: if you use split() on both, the resulting lists hold the same numbers (float() ignores the leading space in ' 222.22').

>>> s = '''111.11
... 222.22'''
>>> s
'111.11\n222.22'
>>> s.split()
['111.11', '222.22']

>>> csv = '111.11, 222.22'
>>> csv
'111.11, 222.22'
>>> csv.split(',')
['111.11', ' 222.22']

>>> sum(float(i) for i in s.split())
333.33    
>>> sum(float(i) for i in csv.split(','))
333.33
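
If the input might arrive in either layout, one possible way to treat them alike is sketched below. The helper name parse_floats is hypothetical, and it assumes commas and whitespace are the only separators in the text.

```python
def parse_floats(text):
    """Parse floats separated by commas, newlines, spaces, or any mix of them."""
    # Turning commas into spaces lets a bare split() handle both layouts.
    return [float(tok) for tok in text.replace(",", " ").split()]

# Both layouts from the session above give the same list of numbers:
# parse_floats('111.11\n222.22') and parse_floats('111.11, 222.22')
```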