
Reading column in a text file

Member Avatar
pythonbegginer
Newbie Poster
8 posts since Jul 2009
Reputation Points: 0 [?]
Q&As Helped to Solve: 0 [?]
Skill Endorsements: 0 [?]
 
0
 

I have a text file named "multipoles.txt"; I took a screenshot of it (http://img193.imageshack.us/i/textf.jpg/) so I can explain myself better. I want to read the file and get the data from the third column below the text that says "Electronic Charge Electrons". The numbers in the columns will not always be the same, since this is an output file that depends on the parameters of a calculation. How can I get Python to read that specific column, pass back its data, and create another text file containing just that column? The column goes much further down too.

Member Avatar
hughesadam_87
Posting Whiz in Training
274 posts since May 2009
Reputation Points: 54 [?]
Q&As Helped to Solve: 13 [?]
Skill Endorsements: 1 [?]
 
0
 

There are several ways. The easiest is to use a Python list. A more involved way is to use a dictionary. The advantage of the dictionary is that if, later on, you need to retrieve some of the original information (for example, the entire line that corresponded to the entry of interest), it is more accessible.

For the simple case of just checking a column:

import re

infile = open ('your_file_name', 'r')
outfile = open('output_file_name', 'w')
column = ??? (PUT YOUR COLUMN HERE)

for line in file:
    if not re.match('#', line):
        line = line.strip()
        sline = line.split()
        outfile.write(sline[column] + '\n')

infile.close()
outfile.close()

Notice that I used sline[column] to select the column I want. In principle this will do it, but as I mentioned before, if you later need to retrieve corollary information from your original data, that is harder unless you use dictionaries.
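The dictionary variant mentioned above might look like the following minimal sketch. It assumes whitespace-separated fields and that '#' marks comment lines, as in the list version. The function name index_column and the choice to key the dictionary on the line number are illustrative assumptions, not something from the original post.

```python
def index_column(lines, column, comment="#"):
    """Map line number -> (value in `column`, full original line)."""
    table = {}
    for lineno, line in enumerate(lines, start=1):
        stripped = line.strip()
        # Skip blank lines and comment lines.
        if not stripped or stripped.startswith(comment):
            continue
        fields = stripped.split()
        # Skip rows too short to contain the requested column.
        if column < len(fields):
            table[lineno] = (fields[column], stripped)
    return table


# Made-up sample lines, for illustration only.
sample = ["# header", "H1 0.5 1.25", "O2 0.7 8.50", "short"]
table = index_column(sample, column=2)
for lineno, (value, original) in sorted(table.items()):
    print(lineno, value, "<-", original)
```

Each extracted value stays paired with the line it came from, so the rest of that line can be looked up later without re-reading the file.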

If you find that you have to do this type of manipulation often, I have a ton of tools and ideas that can help, as this line of work is exactly what I've been doing all summer (except with bio data).

pythonbegginer

Thanks for the help.

I won't need to retrieve the original data. Python will go through the same file and just grab that column every time.

I tried your code and it reports an error on the "for line in file:" line: TypeError: iteration over non-sequence.

hughesadam_87

Copy and paste what you have. When you put in your file name, did you surround it with quotes, i.e. "myfile" vs. myfile?
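The TypeError is consistent with the loop in the posted code iterating over file rather than the opened handle infile. In Python 2, file is the name of a built-in type, which cannot be iterated; in Python 3 the name no longer exists, so the same line raises NameError instead. A minimal corrected sketch follows, assuming Python 3 and whitespace-separated fields; extract_column is a hypothetical helper name, and the paths are whatever the caller supplies.

```python
def extract_column(in_path, out_path, column):
    """Copy one whitespace-separated column from in_path to out_path."""
    # `with` closes both files even if an error occurs partway through.
    with open(in_path, "r") as infile, open(out_path, "w") as outfile:
        for line in infile:  # iterate over the open handle, not `file`
            if line.startswith("#"):
                continue  # skip comment lines
            fields = line.split()
            # Skip blank or short lines that lack the requested column.
            if len(fields) > column:
                outfile.write(fields[column] + "\n")
```

Calling extract_column with the input path, the output path, and a zero-based column index writes one value per line to the output file.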

pythonbegginer

This is what I have:

import re

infile = open ('file', 'r')
outfile = open('output', 'w')
column = 31

for line in infile:
    if not re.match('#', line):
        line = line.strip()
        sline = line.split()
        outfile.write(sline[column] + '\n')

infile.close()
outfile.close()

It now tells me that there is an IndexError: list index out of range for the line outfile.write(sline[column] + '\n')

hughesadam_87

Does your file have at least 32 columns? From the image you posted, it seems to have only 5 or so. If you pick a column outside the range of your list, you will get an error like that.
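Because list indices start at 0, a line that splits into n fields accepts indices 0 through n-1, so the third column is index 2. One quick way to check how many fields the lines actually have before choosing an index is sketched below; field_counts is a hypothetical helper, and the sample lines are made up for illustration.

```python
from collections import Counter


def field_counts(lines):
    """Count how many non-blank lines split into each number of fields."""
    return Counter(len(line.split()) for line in lines if line.strip())


# Made-up sample: two 4-field rows, a blank line, and a 2-field row.
sample = ["1 H 0.8123 0.19", "2 O 8.3754 -0.38", "", "Total 9.1877"]
counts = field_counts(sample)
for n_fields, n_lines in sorted(counts.items()):
    print(n_lines, "line(s) with", n_fields, "fields")
```

If some lines have fewer fields than the rest, as headers and totals often do, indexing them with the larger column number is what raises the IndexError.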

pythonbegginer

I just set it to column 2 and it worked. Thank you so much for your help!!

Question Answered as of 4 Years Ago by hughesadam_87
pythonbegginer

Actually, I have two more questions. What if I want to grab another column from the same file and put it next to the one I just acquired? Or what if I want to grab a column from another file (with the same format) and put it next to the one from the first file?
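Both follow-ups can be handled with the same splitting approach. The sketch below assumes whitespace-separated fields and, for the two-file case, that both files have matching rows in the same order; pick_columns and paste_columns are hypothetical helper names, and the sample rows are made up.

```python
def pick_columns(lines, columns, comment="#"):
    """Return one list of selected fields per data line."""
    rows = []
    for line in lines:
        fields = line.split()
        # Skip blanks, comments, and lines too short for every index.
        if not fields or line.startswith(comment) or max(columns) >= len(fields):
            continue
        rows.append([fields[c] for c in columns])
    return rows


def paste_columns(lines_a, lines_b, column_a, column_b):
    """Pair one column from each source, row by row."""
    a = pick_columns(lines_a, [column_a])
    b = pick_columns(lines_b, [column_b])
    # zip stops at the shorter input, so unmatched trailing rows are dropped.
    return [x + y for x, y in zip(a, b)]


first = ["1 H 0.81", "2 O 8.37"]
second = ["1 H 0.79", "2 O 8.41"]
for row in pick_columns(first, [2, 1]):       # two columns, same file
    print(" ".join(row))
for row in paste_columns(first, second, 2, 2):  # one column from each file
    print(" ".join(row))
```

An open file object can be passed wherever a list of lines is used here, since both are iterated line by line.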

Member Avatar
abhilam
Newbie Poster
6 posts since Jun 2012
Reputation Points: 0 [?]
Q&As Helped to Solve: 0 [?]
Skill Endorsements: 0 [?]
 
-1
 
*SIMULATION OVERVIEW FILE

*DSSAT Cropping System Model Ver. 4.5.1.023 -Stub         MAY 29, 2012; 13:32:51

*RUN   1        : N.American                SGCER045 KSAS1201    1              
 MODEL          : SGCER045 - Grain sorghum                                      
 EXPERIMENT     : KSAS1201 SG CLIMAT CHANGE STUDY ON KANSAS                     
 DATA PATH      : C:\DSSAT45\Sorghum\                                           
 TREATMENT  1   : N.American                SGCER045                            



@     VARIABLE                                         SIMULATED     MEASURED
      --------                                         ---------     --------
      Panicle Initiation day (dap)                            62          -99
      Anthesis day (dap)                                     115          -99
      Physiological maturity day (dap)                       160          -99
      Yield at harvest maturity (kg [dm]/ha)                8478          -99
      Number at maturity (no/m2)                           32377          -99
      Unit wt at maturity (g [dm]/unit)                   0.0262          -99
      Number at maturity (no/unit)                        1904.5          -99
      Tops weight at maturity (kg [dm]/ha)                 24579          -99
      By-product produced (stalk) at maturity (kg[dm]/ha   16101          -99
      Leaf area index, maximum                              6.81          -99
      Harvest index at maturity                            0.345          -99
      Grain N at maturity (kg/ha)                              0          -99
      Tops N at maturity (kg/ha)                               0          -99
      Stem N at maturity (kg/ha)                               0          -99
      Grain N at maturity (%)                                0.0          -99
      Tops weight at anthesis (kg [dm]/ha)                 17448          -99
      Tops N at anthesis (kg/ha)                               0          -99
      Leaf number per stem at maturity                     28.14          -99
      Emergence day (dap)                                      8          -99

I have data like the above and I am trying to read only the column below SIMULATED. This is only one section of the file; my file contains a lot of data just like this, and I have to read the column in each such section. I tried this code, but it didn't work:

import re

infile = open ('c:/py/over.txt','r')
listme=infile.readlines()
outfile = open('c:/py/abhi.txt','w')
column = 56
for line in listme:
    if not re.match('@', line):
        line = line[13].strip()
        sline = line.split()
        outfile.write(sline[column] + '\n')
infile.close()
outfile.close()

I would be grateful for help solving this problem.

Member Avatar
snippsat
Veteran Poster
1,041 posts since Aug 2008
Reputation Points: 483 [?]
Q&As Helped to Solve: 382 [?]
Skill Endorsements: 10 [?]
 
0
 

abhilam, we have answered you in this post:
http://www.daniweb.com/software-development/python/threads/425088/reading-a-column-from-text#post1817281
Why do you post the same question in this two-year-old thread (marked solved)? This is not good at all.
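For a layout like the overview file above, splitting on whitespace and indexing a fixed position does not work, because the variable names themselves contain spaces. One possible approach, independent of whatever was suggested in the linked thread, is to split each row from the right. The sketch assumes that each table starts at a line beginning with "@", ends at the first blank line, and that every data row ends with the SIMULATED and MEASURED values; simulated_values is a hypothetical function name.

```python
def simulated_values(lines):
    """Collect (variable, simulated) pairs from each '@ VARIABLE' table."""
    results = []
    in_table = False
    for line in lines:
        stripped = line.strip()
        if stripped.startswith("@"):
            in_table = True   # header row opens a table
            continue
        if not in_table:
            continue
        if not stripped:
            in_table = False  # blank line closes the table
            continue
        if set(stripped) <= set("- "):
            continue          # dashed underline row
        # Split from the right: name, simulated, measured.
        parts = stripped.rsplit(None, 2)
        if len(parts) == 3:
            results.append((parts[0], parts[1]))
    return results


sample = [
    " MODEL          : SGCER045 - Grain sorghum",
    "@     VARIABLE                                         SIMULATED     MEASURED",
    "      --------                                         ---------     --------",
    "      Anthesis day (dap)                                     115          -99",
    "      Leaf area index, maximum                              6.81          -99",
    "",
    " TREATMENT  1   : N.American                SGCER045",
]
for name, value in simulated_values(sample):
    print(value, name)
```

Because the table flag is reset at each blank line and set again at each "@" header, the same loop picks up every section in a file that repeats this layout.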
