Hello All,

It's been a week since I started learning Python, and it's been wonderful. This is the problem I am trying to solve.

There are 20 text files (In1.txt, In2.txt, ..., In20.txt) in a folder. Each text file has 150000 rows. Each of the 150000 rows holds a vector of float values, and the number of elements varies from row to row. Across all the input files, I would like to add the elements row by row (row 1 of every file summed element-wise, then row 2, and so on). I am not sure how to do this. Any help/pointers would be of great value. Thanks

Example:
Input-
In1.txt
L1 - 2.0, 3.0, 4.0, 3.0, 5.0
L2 - 6.0, 3.0, 10.0
L3 -
.
.
L150000 - 15.0

In2.txt
L1 - 2.0, 3.0
L2 - 6.0, 3.0, 10.0, 4.0, 3.0, 5.0
L3 - 7.0
.
.
L150000 - 0.2, 10.5

Output-
Output.txt
L1 - 4.0, 6.0, 4.0, 3.0, 5.0
L2 - 12.0, 6.0, 20.0, 4.0, 3.0, 5.0
L3 - 7.0
.
.
L150000 - 15.2, 10.5

  1. Read the files in a loop.
  2. For each file, split every line on ' - ':
  • use the first part as the dictionary key
  • split the second part and convert it to a list of floats (you might consider a numpy array, which can make the sum easier)
    [ If the key is already in the dictionary, add the new vector to the stored one by iterating over the zipped old values and the current vector; otherwise store the current vector as the sum vector. ]
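A rough sketch of the steps above, not a definitive solution. It assumes every line really looks like "L1 - 2.0, 3.0" (label, dash, comma-separated floats, possibly nothing after the dash). The function names parse_line, add_into, sum_files and write_sums are made up for illustration. The element-wise add here simply grows the stored vector when the new one is longer.

```python
def parse_line(line):
    """Split 'L1 - 2.0, 3.0' into ('L1', [2.0, 3.0]); the vector may be empty."""
    # partition() splits on the first dash only, so negative values survive.
    key, _, rest = line.partition("-")
    values = [float(tok) for tok in rest.split(",") if tok.strip()]
    return key.strip(), values


def add_into(total, values):
    """Add values element-wise into total, growing total if values is longer."""
    for i, v in enumerate(values):
        if i < len(total):
            total[i] += v
        else:
            total.append(v)


def sum_files(paths):
    """Return a dict mapping each row label to its summed vector."""
    sums = {}
    for path in paths:
        with open(path) as handle:
            for line in handle:
                if "-" not in line:
                    continue  # skip blank or malformed lines
                key, values = parse_line(line)
                add_into(sums.setdefault(key, []), values)
    return sums


def write_sums(sums, out_path):
    """Write the sums back out in the same 'label - v1, v2' layout."""
    with open(out_path, "w") as out:
        for key, total in sums.items():
            out.write("%s - %s\n" % (key, ", ".join(str(v) for v in total)))
```

With the 20 files in the current folder, something like write_sums(sum_files(sorted(glob.glob("In*.txt"))), "Output.txt") should do it (after import glob). Only one running sum per row label is kept in memory, so 150000 rows per file should not be a problem.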


pyTony,

Thank you very much for the info. I am able to do the first two steps that you mentioned. What I am not sure about is how to add two vectors of different lengths, as in the example in my previous post. Thank you!

The neatest way in this case would probably be to use izip_longest from itertools (it is called zip_longest in Python 3) with fillvalue=0, so the shorter vector is padded with zeros.
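A small sketch of that idea, written for Python 3 (on Python 2 the import would be izip_longest instead). The helper name add_vectors is just for illustration.

```python
from itertools import zip_longest  # izip_longest on Python 2


def add_vectors(a, b):
    """Element-wise sum of two lists; the shorter one is padded with 0.0."""
    return [x + y for x, y in zip_longest(a, b, fillvalue=0.0)]


# Row L1 from the example in the first post:
print(add_vectors([2.0, 3.0, 4.0, 3.0, 5.0], [2.0, 3.0]))
# prints [4.0, 6.0, 4.0, 3.0, 5.0]
```

An empty row works too, since everything on the other side is then paired with the fill value.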
