Hey guys,

Let's say I have two different files, each with one column of data:

(File 1)
1
2
3

And

(File 2)
Friend
Foe
Fighter

In the end, I want to unify these into one large file:

1     Friend
2     Foe
3     Fighter

Each file has an identical number of rows, so that is not an issue. In the end, I will be unifying 20 files or so. Preserving the column order is the tough part for me. Once something like this tiny example works, the rest should be straightforward.

I removed my post: I had answered in C, but this is a Python question!

A simple, straightforward method would be to iterate over both files simultaneously and write their values into the output file with a tab separating them like so:

# Placeholder names for illustration; replace with your own paths
file_name1 = 'file1.txt'
file_name2 = 'file2.txt'
out_file_name = 'merged.txt'

# The with statement closes all three files, even if an error occurs
with open(file_name1) as file1, open(file_name2) as file2, \
        open(out_file_name, 'w') as out_file:
    # zip() pairs the lines up and stops when either file runs out
    for line1, line2 in zip(file1, file2):
        out_file.write('%s\t%s\n' % (line1.strip(), line2.strip()))

That is a smart solution. What if I had 20 files? Would it be possible to tell Python to find them (all in the same directory) and iterate over them? Or do you think the only option is to manually extend the code with line3, line4, etc.?

Give us an example of the filenames. Are they distinguishable so they can be grouped?

Right now, they are named with some scientific jargon:

H2AZ.bed, H231K.bed, H3K9ac.bed

All are .bed files, and almost all start with H. I have some flexibility in these names. For example, I could prefix them with a counter (1-H2AZ.bed, 2-H231K.bed, and so on), and I could alter the names a bit if needed.

Any suggestions?
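
One possible approach, as a minimal sketch rather than a definitive answer: let glob find every .bed file in the directory, sort the paths by the counter prefix suggested above so the column order is predictable, and then zip over all the open files at once. The function names numeric_prefix and merge_columns, the '*.bed' pattern default, and the output name are assumptions made up for illustration. The sketch also assumes the output file does not itself match the pattern, since it would otherwise be picked up as an input on a later run.

```python
import glob
import os
import re
from contextlib import ExitStack


def numeric_prefix(path):
    """Sort key (hypothetical helper): the leading counter in the file name,
    e.g. 2 for '2-H231K.bed'."""
    name = os.path.basename(path)
    match = re.match(r'(\d+)-', name)
    # Files without a counter sort after the numbered ones, alphabetically
    return (0, int(match.group(1)), name) if match else (1, 0, name)


def merge_columns(directory, out_file_name, pattern='*.bed'):
    """Write line i of every matching file as one tab-separated row."""
    paths = sorted(glob.glob(os.path.join(directory, pattern)),
                   key=numeric_prefix)
    # ExitStack closes every input file when the block ends
    with ExitStack() as stack:
        files = [stack.enter_context(open(p)) for p in paths]
        with open(out_file_name, 'w') as out_file:
            # zip() stops at the shortest file, like the two-file loop
            for lines in zip(*files):
                row = '\t'.join(line.strip() for line in lines)
                out_file.write(row + '\n')
    # Return the column order that was used, for checking
    return paths
```

Calling merge_columns('.', 'merged.txt') would then merge every .bed file in the current directory. Sorting on the integer value of the prefix, rather than on the raw name, keeps 10-... after 2-... once there are more than nine files.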
