I am very new to programming but am hoping it will speed a few things up for me. However, I am struggling to work out how to write a Python 3 script that does the following, and am wondering if you could help me out.

I have a list of text files that are the output of a computational chemistry code.
At some point in each (after a varying number of iteration steps) there will be the lines:

**** Optimisation achieved ****

Final energy = -348.67740315 eV
Final Gnorm = 0.00037832

After some data, the file goes on to have the lines:

Total number of defects = 1

Total charge on defect = -4.00

Defect centre is at 1.0000 0.0000 0.0000 Frac

The file goes on and at some later point (again after a varying number of iteration steps) there will be the lines:

**** Optimisation achieved ****

Final defect energy = 64.41932012
Final defect Gnorm = 0.00000283

N.B. If optimisation is not achieved, the energies are still printed but are not of any interest to me.
N.B. The numbers here are taken from an example file (and are not the only numeric values within the file).

I know how to open/read each file within the directory. I also know how to make and write to a new file. My problem, however, is this:

How do I find and print the lines 'Final energy = ...', 'Total charge on defect = ...' and 'Final defect energy = ...', but only provided optimisation is achieved?

Hoping you can help.

Last Post by Gribouillis

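I would start with regular expressions and the built-in filter() (which in Python 3 replaces itertools.ifilter)

import re

# Beginnings of the lines we want to keep
wanted = (
    'Final energy =',
    'Total charge on defect =',
    'Final defect energy =',
    '**** Optimisation achieved ****',
)
# One pattern that matches a line starting with any of the wanted strings
regex = '^(?:{0})'.format('|'.join(re.escape(s) for s in wanted))
regex = re.compile(regex)
# Text mode, so that each line is a str and can be matched by a str pattern
with open('filename.txt') as infile:
    for line in filter(regex.match, infile):
        print(line, end='')  # the line already ends with a newline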

Edited by Gribouillis
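
The filter above keeps every 'Final energy' line, whether or not the optimisation succeeded. One possible way to add the "only if optimisation is achieved" condition is a small flag that is switched on by the marker line and used up by the next energy line. This is only a sketch under assumptions: the function name extract_results is made up for illustration, the charge line is kept unconditionally, lines are stripped in case the real files indent them, and the sample text is a trimmed excerpt in which the second marker is left out on purpose to imitate a failed defect optimisation (the real failure message is not shown in the thread).

```python
MARKER = '**** Optimisation achieved ****'
ENERGY_PREFIXES = ('Final energy =', 'Final defect energy =')
CHARGE_PREFIX = 'Total charge on defect ='


def extract_results(lines):
    """Return the wanted lines; energies are kept only after a success marker."""
    results = []
    achieved = False  # True once a marker has been seen and not yet used up
    for raw in lines:
        line = raw.strip()  # assumption: real lines may carry leading spaces
        if line.startswith(MARKER):
            achieved = True
        elif line.startswith(ENERGY_PREFIXES):
            if achieved:
                results.append(line)
            achieved = False  # each marker vouches for one energy line only
        elif line.startswith(CHARGE_PREFIX):
            results.append(line)  # assumption: the charge is always wanted
    return results


# Hypothetical excerpt: the second marker is missing, as in a failed run
sample = """\
**** Optimisation achieved ****

Final energy = -348.67740315 eV
Final Gnorm = 0.00037832

Total charge on defect = -4.00

Final defect energy = 64.41932012
Final defect Gnorm = 0.00000283
""".splitlines()

for line in extract_results(sample):
    print(line)
```

With this sample the defect energy is skipped because no marker precedes it. For a real file, the same function can be given the open file object, for example extract_results(infile) inside a with open(...) block.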


Hi @gribouillis, apologies, but I don't understand some of the syntax in the code you have shown. Would it be possible to talk me through it? Thanks.


The line that first assigns regex uses the string format() method to build a regular expression. For example:

>>> import re
>>> wanted = ('cat', 'dog', 'parrot')
>>> regex = '^(?:{0})'.format('|'.join(re.escape(s) for s in wanted))
>>> regex
'^(?:cat|dog|parrot)'

This regex can tell whether a string starts with any of the words cat, dog or parrot. The Python documentation has a tutorial on regular expressions (the Regular Expression HOWTO) and a reference for the format() method.
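
To see what the pieces do, here is a small sketch that tries the cat/dog/parrot pattern on two strings and then shows why re.escape() matters for the '**** Optimisation achieved ****' marker. The test strings and variable names are made up for illustration.

```python
import re

wanted = ('cat', 'dog', 'parrot')
regex = re.compile('^(?:{0})'.format('|'.join(re.escape(s) for s in wanted)))

# match() only succeeds at the start of the string
starts_with_dog = regex.match('dog food') is not None
hot_dog = regex.match('hot dog') is not None
print(starts_with_dog, hot_dog)

# '*' is a special character in a regex, so the marker must be escaped
marker = '**** Optimisation achieved ****'
marker_regex = re.compile('^(?:{0})'.format(re.escape(marker)))
print(marker_regex.match(marker) is not None)

# Without re.escape() the pattern does not even compile
try:
    re.compile('^(?:{0})'.format(marker))
    unescaped_ok = True
except re.error:
    unescaped_ok = False
print(unescaped_ok)
```

The '^' anchors the match at the start of the line and '(?:...)' groups the alternatives without capturing them, so the '|' choices apply only inside the group.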
