Is there an easy way to find and copy a table from a text file?

Question

Rebecca_2

11 Years Ago

Hi,
I have an output file (.txt) from a computational chemistry program. At some point in this file, following an unknown number of iterative steps, the following table will be found:

Comparison of initial and final structures : 

--------------------------------------------------------------------------------
  Parameter   Initial value   Final value   Difference    Units      Percent
--------------------------------------------------------------------------------
    Volume       
    a            
    b             
    c             
    alpha         
    beta         
    gamma         
      1 x         
      1 y          
      1 z          
      2 x          
      2 y          
      2 z          
      3 x          
      3 y          
      3 z          
      4 x          
      4 y          
      4 z          
      5 x          
      5 y          
      5 z          
      6 x          
      6 y          
      6 z          
      7 x          
      7 y          
      7 z          
      8 x          
      8 y          
      8 z          
--------------------------------------------------------------------------------

Is there an easy method for:
1. numbering the lines in the file?
2. finding this table
3. copying the table exactly
4. extracting each row (or rather rows 1-6) into separate files

I know how to open and read a file, [i.e. with open('output.txt', 'r') as f] but am a little bemused by the rest.

I ask because, although this is a very easy 'point and click using a mouse' task for one file, it would be very tedious and time-consuming for the several hundred files I actually have.

Any help would be appreciated, although please be patient as I am not fully up to speed with programming yet - particularly with regard to formatting.

Cheers

python


slate 241 Posting Whiz in Training

11 Years Ago

Please be more specific.

Some ideas:

numbering the lines in the file?

with open('output.txt', 'r') as f:
    for line_number, line in enumerate(f):
        pass # do something with the line
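
A minimal sketch of the numbering idea, assuming Python 3 and that line numbers should start at 1 (`enumerate` counts from 0 unless given a `start` value). The helper name `number_lines` is made up for illustration.

```python
def number_lines(lines):
    """Return (line_number, text) pairs, counting from 1 and dropping the newline."""
    return [(number, line.rstrip("\n")) for number, line in enumerate(lines, start=1)]


# Any iterable of lines works here, including an open file object.
sample = ["first line\n", "second line\n", "third line\n"]
for number, text in number_lines(sample):
    print(number, text)
```

An open file can be passed in place of the list, for example `number_lines(f)` inside the `with open(...)` block.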

finding this table

if line.startswith("Comparison of initial and final structures :"):
    table_begin=True
if table_begin and line.startswith("------"):
    table_end=True

copying the table exactly

Copy to where?

if table_begin and not table_end:
    pass # copy the line to wherever

extracting each row (or rather rows 1-6) into separate files

Open separate files for writing. I don't know what you expect to be done with them.


Rebecca_2 · Answer 1 · 2014-01-23T17:54:24+00:00

Hi Slate,

Please be more specific.

Can I ask what is ambiguous?
What I gave in my original post is a sample (albeit empty) table. All of my output files contain one occurrence of this table with the same formatting; the only difference is the numbers it contains, and since those are unknown and specific to each file, they are not useful information here.

Taking each point in turn:
1. Numbering lines in a file
I ask because this is potentially useful and I don't know how to do it.
For instance (as per another of my posts):
The output file will have a line which gives the final (lattice) energy. However, this is only useful if an energy minimum is found, which is indicated by the phrase '**** Optimisation achieved ****' appearing three lines above the final energy.
i.e.

     **** Optimisation achieved ****


      Final energy =    -385.41833439 eV

I was thinking about a way of extracting this data without the multiple if/elif statements I have been using. My idea: number the lines of the file, search for the line '**** Optimisation achieved ****' and, if it is found, return its line number and use that to extract the final (lattice) energy.
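
A rough sketch of that idea, assuming Python 3 and that the energy line sits exactly three lines below the marker, as in the excerpt above. The function name `final_energy` and its `offset` parameter are made up for illustration; if the spacing differs in the real files, `offset` needs adjusting.

```python
import re


def final_energy(lines, offset=3):
    """Return the final energy in eV as a float, or None if no minimum was found.

    `offset` is the assumed number of lines between the marker and the energy line.
    """
    lines = list(lines)
    for index, line in enumerate(lines):
        if "Optimisation achieved" in line:
            # Look at the line `offset` lines below the marker, if it exists.
            if index + offset < len(lines):
                match = re.search(r"Final energy\s*=\s*(-?\d+(?:\.\d+)?)",
                                  lines[index + offset])
                if match:
                    return float(match.group(1))
    return None


sample = [
    "     **** Optimisation achieved ****\n",
    "\n",
    "\n",
    "      Final energy =    -385.41833439 eV\n",
]
print(final_energy(sample))
```

Because the list index from `enumerate` already plays the role of a line number, no separate numbering pass is needed.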

Whilst I am at it, I know there is a feature where I can strip white space from the start and end of a line but, is there a way of removing blank lines entirely?
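
One possible way to drop blank lines, sketched under the assumption that "blank" means empty or whitespace-only. The helper name `drop_blank_lines` is made up for illustration.

```python
def drop_blank_lines(lines):
    """Keep only lines that contain something other than whitespace."""
    # line.strip() is an empty string (and therefore false) for blank lines.
    return [line for line in lines if line.strip()]


sample = ["Volume\n", "\n", "   \n", "a\n"]
print(drop_blank_lines(sample))
```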

2. Finding the table
3. Copying the table exactly
When I say copy, I mean into a new text file, 'table.txt', which of course needs creating.
The table gives me important information that I need. Thinking about it, though, copying it exactly may not be the most useful thing to do. Still, it would be useful to know how to extract a complete table from the file. Again, I was wondering whether numbering the lines of the file might help with this.

if table_begin and line.startswith("------"):
    table_end=True
if table_begin and not table_end:
    pass # copy the line to wherever

Joining together the bits of code you have posted - won't this simply copy the table headers and not the rows which make up the table (and thus contain the data)?
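
The closing rule of the table looks the same as the two rules around the header, so one way around this is to count the dashed rules and stop at the third. The following is only a sketch, assuming Python 3 and that every table has exactly three dashed rules as in the sample; the name `extract_table` and the numbers in the sample data are made up for illustration.

```python
TITLE = "Comparison of initial and final structures"


def extract_table(lines):
    """Return the table's lines, from the first dashed rule to the closing one."""
    table = []
    in_table = False
    rules_seen = 0
    for line in lines:
        stripped = line.strip()
        if not in_table:
            # Ignore everything until the title line is reached.
            if stripped.startswith(TITLE):
                in_table = True
            continue
        if not stripped:
            continue  # skip blank lines between the title and the table
        table.append(line)
        if stripped.startswith("------"):
            rules_seen += 1
            if rules_seen == 3:
                break  # the third rule closes the table
    return table


sample = [
    "some earlier output\n",
    "  Comparison of initial and final structures : \n",
    "\n",
    "--------\n",
    "  Parameter   Initial value\n",
    "--------\n",
    "    Volume       100.0\n",
    "    a            5.0\n",
    "--------\n",
    "later output\n",
]
print("".join(extract_table(sample)))
```

To copy the table into a new file, the returned list can be written out with `writelines`, for example `open('table.txt', 'w').writelines(table)` inside a `with` block.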

4. Extracting each row (or rather rows 1-6) into separate files
This is going to be far more useful to me than copying the whole table. As I said towards the end of the original post, whilst this may seem like a simple 'point and click' task for one file, I actually have several hundred output files, so doing it by hand would be time-consuming, not to mention tedious.
I literally want to split the table by each row into separate text files (or maybe even a csv file) - which I can then use in another program.
If I create 6 textfiles:
'a.txt', 'b.txt', 'c.txt', 'alpha.txt', 'beta.txt', 'gamma.txt'
I then extract the corresponding row from the table I am trying to find/copy along with the output filename. If I iterate this over all my output files, I can then build a potential energy surface etc.
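
A sketch of how that per-parameter split could look, assuming Python 3, that the parameter name is the first whitespace-separated field of its row, and that each row should be appended to '<parameter>.txt' prefixed by the name of the output file it came from. The names `pick_rows` and `append_rows` and the numbers in the sample are made up for illustration.

```python
import os

WANTED = ("a", "b", "c", "alpha", "beta", "gamma")


def pick_rows(table_lines, wanted=WANTED):
    """Map each wanted parameter name to its table row (newline stripped)."""
    rows = {}
    for line in table_lines:
        fields = line.split()
        # Rows such as "1 x" start with a number, so they are never picked.
        if fields and fields[0] in wanted and fields[0] not in rows:
            rows[fields[0]] = line.rstrip("\n")
    return rows


def append_rows(rows, source_name, directory="."):
    """Append each row, prefixed by the source file name, to '<parameter>.txt'."""
    for name, row in rows.items():
        path = os.path.join(directory, name + ".txt")
        with open(path, "a") as out:  # "a" creates the file if it is missing
            out.write(source_name + " " + row + "\n")


sample = [
    "    Volume       100.0\n",
    "    a            5.0\n",
    "    alpha        90.0\n",
    "      1 x        0.25\n",
]
print(pick_rows(sample))
```

Calling `append_rows(pick_rows(table_lines), fname)` once per output file would then build up the six text files row by row.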

The bit I am asking about here is the finding and copying of the table; what happens after that is not important here.

Hope this clarifies any ambiguity....

TrustyTony 888 ex-Moderator Team Colleague Featured Poster · Answer 2 · 2014-01-23T23:30:01+00:00

Finding the table

Copying the table exactly

with open('data.txt') as data:
    data_parts = data.read().split("""Comparison of initial and final structures : 

--------------------------------------------------------------------------------""")
print(data_parts[1])

slate 241 Posting Whiz in Training · Answer 3 · 2014-01-24T11:36:45+00:00

First of all:

Can I ask what is ambiguous?

The requirements are ambiguous, incomplete and vague.

Is there only one such table in the input file?
Does the input end at the end of the given sample?
Should we number the lines in the file? Every line, or specific lines? What should be done with the numbers?

And so on.

I have the impression you are better at writing essays than specifications.

Secondly:
In the first post your requirements seem to be the following.

There is an input file called output.txt which ends as the given sample.
Find the table given in the example
Extract the table without the first 3 lines and write it into a file named table.txt
All sentences with "or better" or "instead of how I am doing now".

This can be done with PyTony's code, the following way:

with open('output.txt') as fdata:
    data_parts = fdata.read().split("""Comparison of initial and final structures : 

--------------------------------------------------------------------------------""")

with open('table.txt',"w") as ftable:
    ftable.write(data_parts[1])

In the second post these requirements seemed to change into this:

There are input files called output1.txt, output2.txt and so on.
All input files end with the table given in the sample.
In every input file find this "table" in the file.
Find the first line that contains only "a" (ignoring leading and trailing whitespace) in the "table" and append this line to a file "a.txt". Create the file if necessary.

That is achieved by this code:

from glob import glob


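afile=None
for fname in glob("output*.txt"):
    with open(fname) as fdata:
        table_found=False
        for line in fdata:
            # The table starts at its title line.
            if line.startswith("Comparison of initial and final structures :"):
                table_found=True
            # Inside the table, take the first line that contains only "a".
            # This matches the empty sample table; a row that also holds
            # numbers would need a check on its first field instead.
            if table_found and line.strip()=="a":
                if not afile:
                    afile=open("a.txt","w")
                afile.write(line)
                break

if afile: afile.close()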