Hi,
I am new to Python, but I want to use it over Perl because I think it is a better overall approach to programming. This is a bioinformatics problem.
I have four identically formatted tab-delimited files that contain genomic variation data. Each line looks like this:
chr1 11828655 152 uc001ati.1 * R ND ND ND -30 ND ND ND NPPA
The four files have lines that match on different subsets of attributes. I want to compare all four files and create a master file that uses the matching attribute as the new key and, for each key, reports which of the four files contain it and which do not. So let's say files 2, 3 and 4 have the same [1] item: I want to be able to report this.
Does anyone have an idea about how best to approach this problem? I tried line and text manipulation, but these seem too limited for what I need to do. I would like to do this without using a database, but it may come to that. So please let me know if you think a database is the way to go and I should leave Python for this.
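For reference, here is a minimal sketch of a dictionary-based approach that needs no database. It is only an illustration under assumptions: it treats the "[1] item" as the second tab-separated column (zero-based index 1), and the function names, the yes/no output layout and the file paths are all hypothetical, not taken from any existing tool.

```python
import csv
from collections import defaultdict


def build_presence(paths, key_col=1):
    """Map each key value to the set of files that contain it.

    key_col=1 is an assumption: the "[1] item" is read as the
    second tab-separated column of each line.
    """
    presence = defaultdict(set)
    for path in paths:
        with open(path, newline="") as handle:
            for row in csv.reader(handle, delimiter="\t"):
                # Skip blank or short lines that lack the key column.
                if len(row) > key_col:
                    presence[row[key_col]].add(path)
    return presence


def write_master(presence, paths, out_path):
    """Write one line per key with a yes/no column for every input file."""
    with open(out_path, "w", newline="") as out:
        writer = csv.writer(out, delimiter="\t")
        writer.writerow(["key"] + list(paths))
        for key in sorted(presence):
            flags = ["yes" if p in presence[key] else "no" for p in paths]
            writer.writerow([key] + flags)
```

It could be called as `write_master(build_presence(files), files, "master.txt")`, where `files` is a list of the four file names. To match on a combination of attributes instead of a single column, the key could be a tuple of several columns. Everything is held in memory, so this assumes the files are small enough for that.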
Thanks