Confused about reading files

Question

Rebecca_2

10 Years Ago

I have a series of (~950KB) '.txt' output files from a computational chemistry program. Each file contains the line '****optimisation achieved****' at least once and, depending on the result of the calculation, possibly twice. Does the following code, in which I am trying to find specific lines and print them to a new file, differentiate between the two occurrences of that line?

import os

with open('results.txt', 'a') as writer:
    for file in os.listdir('.'):
        if file.endswith('.out'):
            print(file + ' ', end= ' ', file=writer)
            with open(file, 'r') as reader:
                for line in reader.readlines():
                    s=line.strip()               
                    if s=='**** Optimisation achieved ****':
                        opt='y'                        
                    elif s.startswith('Final energy ='):
                        if opt=='y':
                            print(s + ' ', end=' ', file=writer)            
                    elif s.startswith('Total number of defects'):                        
                        print(s + ' ', end=' ', file=writer)            
                    elif s.startswith('Total charge on defect'):                        
                        print(s + ' ', end=' ', file=writer)            
                    elif s.startswith('Defect centre'):
                        print(s+ ' ', end=' ', file=writer)
                    elif s.startswith('Fractional'):
                        if s!='Fractional coordinates of asymmetric unit :':
                            print(s + ' ', end=' ', file=writer)                    
                    elif s.startswith('Final defect energy'):
                        if opt=='y':
                            print(s, file=writer)

(Please be patient, I am relatively new to programming)

python


All 5 Replies

woooee 814 Nearly a Posting Maven

10 Years Ago

You can send a tuple to startswith

 ## note that opt is never set back to "" or "n"
 if s.startswith(('Total number of defects', 'Total charge on defect', 'Defect centre')):
     print(s + ' ', end=' ', file=writer)
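As a small illustration of the tuple form of `str.startswith`, here is a sketch that can be run on its own. The prefixes are the ones from the question's code; the helper name `matching_lines` and the sample lines are made up for the example and are not real program output.

```python
# Prefixes copied from the question's code; everything else is illustrative.
PREFIXES = ('Total number of defects', 'Total charge on defect', 'Defect centre')

def matching_lines(lines):
    """Return the stripped lines that start with any of PREFIXES."""
    stripped = (line.strip() for line in lines)
    # startswith accepts a tuple and is True if any one prefix matches
    return [s for s in stripped if s.startswith(PREFIXES)]

# Made-up sample lines, only to show which ones are kept
sample = [
    '  Total number of defects = 1\n',
    '  Some other line\n',
    '  Defect centre is at 0.0 0.0 0.0\n',
]
print(matching_lines(sample))
```

This replaces three separate `elif` branches that all do the same thing with a single test.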

Are you saying that you want to stop looking after the second '**** Optimisation achieved ****' line? If so, toggle the flag each time the line is seen:

if s=='**** Optimisation achieved ****':
    if opt=='y':
        opt='n'
    else:
        opt="y"
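To see what the toggle does over a whole file, here is a rough sketch under a few assumptions: the flag starts as 'n' for every file (the question's code never initialises it), and only the 'Final energy =' lines are collected. The function name `energies_while_flag_set` and the sample energies are invented for the example.

```python
MARKER = '**** Optimisation achieved ****'

def energies_while_flag_set(lines):
    """Collect 'Final energy =' lines seen while the toggled flag is 'y'."""
    opt = 'n'   # assumption: start with the flag off, once per file
    kept = []
    for line in lines:
        s = line.strip()
        if s == MARKER:
            opt = 'n' if opt == 'y' else 'y'   # flip the flag on every marker
        elif s.startswith('Final energy =') and opt == 'y':
            kept.append(s)
    return kept

# Made-up lines: one energy before any marker, one after each marker
sample = [
    'Final energy = -1.0\n',   # before any marker: ignored
    MARKER + '\n',
    'Final energy = -2.0\n',   # after the first marker: kept
    MARKER + '\n',
    'Final energy = -3.0\n',   # after the second marker: ignored again
]
print(energies_while_flag_set(sample))
```

With this scheme only the lines between the first and second marker are kept, so it fits only if that is what you want.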

Edited 10 Years Ago by woooee

Gribouillis commented: I forgot about the tuple ! +14

Gribouillis 1,391 Programming Explorer

10 Years Ago

Here is a complete (simplified) running example. Try it in the directory with the .out files.

#!/usr/bin/env python3
#-*-coding: utf8-*-
import os

# split the code into several functions to lighten it

def main():
    with open('results.txt', 'a') as writer:
        for file in os.listdir('.'):
            if not file.endswith('.out'):
                continue
            with open(file, 'r') as reader:
                handle_reader(reader, writer)

def handle_reader(reader, writer):
    print('reading file:', reader.name, file = writer)
    opt_cnt = 0
    for line in reader:
        s=line.strip()               
        if s=='**** Optimisation achieved ****':
            opt_cnt += 1 # <-- count those lines
            print('optimisation line number', opt_cnt, end ='\n', file = writer)
        else:
            pass

if __name__ == '__main__':
    main()

Edited 10 Years Ago by Gribouillis


Gribouillis 1,391 Programming Explorer Team Colleague · Answer 1 · 2013-09-14T19:29:15+00:00

I would rather count the optimisation achieved lines while reading:

import os

with open('results.txt', 'a') as writer:
    for file in os.listdir('.'):
        if file.endswith('.out'):
            print(file + ' ', end= ' ', file=writer)
            opt_cnt = 0 # <-- reset counter for each file
            with open(file, 'r') as reader:
                for line in reader.readlines():
                    s=line.strip()               
                    if s=='**** Optimisation achieved ****':
                        opt_cnt += 1 # <-- count those lines     
                    elif s.startswith('Final energy ='):
                        if opt_cnt >= 1: # <-- base decisions on the current value
                            print(s + ' ', end=' ', file=writer)
                    ...
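The counter idea can be tried without any real output files. The sketch below uses `io.StringIO` as a stand-in for an open file; the function name `energies_by_optimisation` and the energy values are made up. Iterating over a file object moves forward one line per step, so each marker line is seen exactly once and the counter tells you which occurrence an energy line follows.

```python
import io

MARKER = '**** Optimisation achieved ****'

def energies_by_optimisation(reader):
    """Pair each 'Final energy =' line with the number of markers seen so far."""
    opt_cnt = 0
    found = []
    for line in reader:   # the iterator advances; no line is read twice
        s = line.strip()
        if s == MARKER:
            opt_cnt += 1
        elif s.startswith('Final energy =') and opt_cnt >= 1:
            found.append((opt_cnt, s))
    return found

# Stand-in for an .out file with two optimisation markers (invented values)
fake_file = io.StringIO(
    '**** Optimisation achieved ****\n'
    'Final energy = -10.5\n'
    '**** Optimisation achieved ****\n'
    'Final energy = -11.2\n'
)
print(energies_by_optimisation(fake_file))
```

Testing `opt_cnt == 1` or `opt_cnt == 2` instead of `opt_cnt >= 1` would select the lines that follow one particular occurrence.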

Rebecca_2 · Answer 2 · 2013-09-15T14:37:54+00:00

sorry, but what prevents the code from only ever reaching the first '****optimisation achieved****' and counting that line repeatedly?

snippsat 661 Master Poster · Answer 3 · 2013-09-17T16:29:50+00:00

sorry, but what prevents the code from only ever reaching the first '****optimisation achieved****' and counting that line repeatedly?

Here is another way to do this.
Here I want to get back the file name and the count, and also the correct line numbers.

file_1.txt:

fire
fox
**** Optimisation achieved ****

file_2.txt:

**** Optimisation achieved ****
car
123
**** Optimisation achieved ****
**** Optimisation achieved ****
**** Optimisation achieved ****

file_3.txt:

**** Optimisation achieved ****
hello
world
**** Optimisation achieved ****

So a manual count would be:
file_1 has 1 "Optimisation" line, at line 3
file_2 has 4 "Optimisation" lines, at lines 1, 4, 5, 6
file_3 has 2 "Optimisation" lines, at lines 1, 4

Some code for this.

import re
from glob import glob

count = {}
line_numb = []
for files in sorted(glob('*.txt')):  # sorted so the line numbers pair up with the sorted counts below
    #print(files)
    with open(files) as f_in:
        for num, line in enumerate(f_in, 1):
            line = line.strip()
            if '**** Optimisation achieved ****' in line:
                count[f_in.name] = count.get(f_in.name, 0) + 1
                line_numb.append(num)
        line_numb.append(f_in.name)

line_numb = ' '.join(str(i) for i in line_numb)
# split on the file names to get one group of line numbers per file
line_numb = re.split(r'\w+\.txt', line_numb)
line_numb.pop()
opt_count = (sorted(count.items(), key=lambda x: x[0]))

print('-'*5)
print(line_numb)
print(opt_count)
print('-'*5)

with open('result.txt', 'w') as f_out:
    for line, count in zip(line_numb, opt_count):
        print('{} has "Optimisation" count of {}\n"Optimisation" occur at line nr: {}\n'.format(count[0], count[1], line.strip()))
        #f_out.write('{} has "Optimisation" count of {}\n"Optimisation" occur at line nr: {}\n'.format(count[0], count[1], line.strip()))


"""Output-->
-----
['3 ', ' 1 4 5 6 ', ' 1 4 ']
[('file_1.txt', 1), ('file_2.txt', 4), ('file_3.txt', 2)]
-----
file_1.txt has "Optimisation" count of 1
"Optimisation" occur at line nr: 3

file_2.txt has "Optimisation" count of 4
"Optimisation" occur at line nr: 1 4 5 6

file_3.txt has "Optimisation" count of 2
"Optimisation" occur at line nr: 1 4
"""

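A possible simplification, offered only as a sketch: keep a list of line numbers per file in a dict, so no joining and regex splitting is needed. The function name `marker_line_numbers` is made up, and the example recreates file_1.txt and file_3.txt from above in a temporary directory so it can run anywhere.

```python
import os
import tempfile

MARKER = '**** Optimisation achieved ****'

def marker_line_numbers(paths):
    """Return {file name: [1-based line numbers of lines containing MARKER]}."""
    result = {}
    for path in sorted(paths):
        with open(path) as f_in:
            result[os.path.basename(path)] = [
                num for num, line in enumerate(f_in, 1) if MARKER in line
            ]
    return result

# Recreate two of the sample files from the example in a temporary directory
samples = {
    'file_1.txt': 'fire\nfox\n' + MARKER + '\n',
    'file_3.txt': MARKER + '\nhello\nworld\n' + MARKER + '\n',
}
with tempfile.TemporaryDirectory() as tmp:
    paths = []
    for name, text in samples.items():
        path = os.path.join(tmp, name)
        with open(path, 'w') as f_out:
            f_out.write(text)
        paths.append(path)
    found = marker_line_numbers(paths)

for name, nums in found.items():
    print('{} has "Optimisation" count of {} at lines {}'.format(name, len(nums), nums))
```

The count is just the length of each list, so the count and the line numbers cannot get out of step with each other.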