Parsing and printing a file through the last iteration

Question

maddocspace 0 Newbie Poster

12 Years Ago

Hello there,

Sorry, Python parsing is not my main activity. My script does not print the last block of the output.

I should get:
*elset, elset=top_s1
1, 2, 3, 4
*surface, name=top, type=element
top_s1

If I add a line such as *surface, name=bot, type=element to the test2 file, then the script does print it. What can I change in order to fix this?

The zip file contains the input file.

#!/usr/bin/env python

import os
import sys
import re
import fileinput
import math

surfaceFound = False

f=open("test2.inp","r")
text = f.readlines()
f.close()

e1 = []; e2 = []; e3 = []; e4 = []; e5 = []; e6 = []
s1 = []; s2 = []; s3 = []; s4 = []; s5 = []; s6 = []
elsurf={}
set_name = ""
otherLine = []
finalSet=[]
final_block=[]
elsetName=""
lineTrik = []

for index, lines in enumerate(text):
    nline = lines.strip()
    lineTrik.append(lines)
    surface = "*SURFACE" in lines.strip()
    element = "TYPE = ELEMENT" in lines.strip()
    digit = lines.strip().split(",")[0].isdigit()
    under = lines.strip().startswith("_")
    stopParse = "*End Part" is lines.strip()
    HM = lines.strip().startswith("**HM")
    HW = lines.strip().startswith("**HW")
    S1 = lines.strip().endswith("S1")
    S2 = lines.strip().endswith("S2")
    S3 = lines.strip().endswith("S3")
    S4 = lines.strip().endswith("S4")
    S5 = lines.strip().endswith("S5")
    S6 = lines.strip().endswith("S6")
    if surface:
        starSurf = nline.split(",")[0].strip()
        nameSurf = nline.split(",")[1].split("=")[1].strip()
        typeSurf = nline.split(",")[2].strip()
        elsetName = nline.split(",")[1].split("=")[1].strip()
        surfaceFound = False
        for set_name in elsurf.keys():
            
            finalSet.append("*elset, name = %s_%s\n" % (card_name, set_name))
            for line in [elsurf[set_name][i*16:(i+1)*16] for i in range(0,(len(elsurf[set_name])/16)+1)]:
                if line:
                    eset = ",".join(line).strip()
                    finalSet.append("%s\n" % (eset))
            final_block.append("%s_%s, %s\n" % (card_name, set_name, set_name))
        elsurf={}
        (this_card_name, card_type) = (nameSurf, element)
        if this_card_name and element:
            surfaceFound = True
            card_name = elsetName
            surfType = typeSurf
            final_block.append("*surface, name=%s, %s\n" % (card_name, surfType))
    elif (S1 or S2 or S3 or S4 or S5 or S6) and under == False:
        surfaceFound = True
        value = nline.strip().split(",")[0].strip()
        key = nline.strip().split(",")[1].strip()
        items = elsurf.setdefault(key, [])
        items.append(value)


    elif under:
        underLine = nline
        final_block.append("%s\n" % underLine)
    elif stopParse:
        endParse = nline
        final_block.append("%s\n" % endParse)
    elif HM or HW:
        pass
    else:
        other = nline
        otherLine.append("%s\n" % other)
for i in otherLine:
    print i.strip()
for i in finalSet:
    print i.strip()
for i in final_block:
    print i.strip()

python

test2.inp_.zip (1.2 KB)

woooee 814 Nearly a Posting Maven

12 Years Ago

There is no way to make heads or tails of this code for those of us who don't know what you are trying to do. Some general suggestions:

1. Break this code into functions and test each function
2. instead of surface = "*SURFACE" in lines.strip()
   and then if surface:  (which means nothing to anyone reading the code)
   just use if "*SURFACE" in lines.strip(): as it is obvious what the test is
3. nameSurf and elsetName are the same, i.e. [1] (obviously you did not test this)
        starSurf  = nline.split(",")[0].strip()
        nameSurf  = nline.split(",")[1].split("=")[1].strip()
        typeSurf  = nline.split(",")[2].strip()
        elsetName = nline.split(",")[1].split("=")[1].strip()

   instead use: interim_list = nline.split(",") i.e. one split instead of many
        starSurf  = interim_list[0].strip()
        nameSurf  = interim_list[1].split("=")[1].strip()
        etc.
4. instead of
    S1 = lines.strip().endswith("S1")
    S2 = lines.strip().endswith("S2")
    etc. and
   elif (S1 or S2 or S3 or S4 or S5 or S6) and under == False:
   use
   ending = lines.strip()[-2:]
   elif (under==False) and ending in ("S1", "S2", "S3", "S4", "S5", "S6"):
5. and similarly replace
    HM = lines.strip().startswith("**HM")
    HW = lines.strip().startswith("**HW")
    elif HM or HW:
    with
    if lines.strip().startswith("**HM", "**HW"):
6. replace the many lines.strip() calls with a single
    line_strip = lines.strip() and use line_strip everywhere

TrustyTony 888 pyMod

12 Years Ago

Small typo I think:
if lines.strip().startswith("**HM", "**HW"):

should be

if lines.strip().startswith(("**HM", "**HW")):

And same way you can use:

elif not under and line_strip.endswith(("S1", "S2", "S3", "S4", "S5", "S6")):

You can also drop the not under test if you move this branch one position down, after

elif under:
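
The tuple form of startswith and endswith described above can be checked with a few lines. This is only a sketch: the sample lines and the classify helper are made up for illustration and are not taken from test2.inp. It also shows why the two-string form fails: the second positional argument of startswith is a start index, so passing a string there raises a TypeError.

```python
# Hypothetical sample lines, not taken from test2.inp
samples = ["**HM comment", "**HW comment", "6503, S1", "_top_S1, S1", "*End Part"]

def classify(line):
    """Return a label for one line, testing the underscore case first."""
    stripped = line.strip()
    if stripped.startswith(("**HM", "**HW")):      # tuple: any of these prefixes
        return "comment"
    elif stripped.startswith("_"):
        return "underscore"
    # reached only when the line does not start with "_",
    # so no separate "not under" test is needed here
    elif stripped.endswith(("S1", "S2", "S3", "S4", "S5", "S6")):
        return "face"
    return "other"

kinds = [classify(s) for s in samples]
print(kinds)

# Two separate strings: the second one is taken as the start index
try:
    "**HM x".startswith("**HM", "**HW")
except TypeError as exc:
    print("TypeError:", exc)
```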

woooee 814 Nearly a Posting Maven

12 Years Ago

Small typo I think:
if lines.strip().startswith("**HM", "**HW"):
should be
if lines.strip().startswith(("**HM", "**HW")):

It's good to have multiple pairs of eyes checking things.

woooee 814 Nearly a Posting Maven

12 Years Ago

This should be enough to start things off, but I'm not sure about what is to be done with the "S1", "S2", etc. output.

def process_group(group_list, fp_surface, fp_underline):
    result = test_for_underline(group_list, fp_underline)
    if not result:
        tup_of_Sx = ("S1", "S2", "S3", "S4", "S5", "S6")
        for rec in group_list:
            rec_split = rec.strip().split(",")
            ending=rec_split[-1].strip()
            if ending in tup_of_Sx:
                idx = tup_of_Sx.index(ending)
                output_list = ["         " for x in range(6)]
                output_list[idx]="%9s" % (rec_split[0].strip())
                fp_surface.write("%s\n" % ("".join(output_list)))
            else:   ## headings
                fp_surface.write("\n")
                fp_surface.write(rec)
                for Sx in tup_of_Sx:
                    fp_surface.write("%9s" % (Sx))
                fp_surface.write("\n")

def test_for_underline(group_list, fp_underline):
    """ see if there is a record that starts with an underline.
        If found, write to the underline file and return True
    """
    for rec in group_list:
        if rec.strip().startswith("_"):
            fp_underline.writelines(group_list)
            return True

    return False

surface_found = False
end_found = False
group_list = []

fp_not_surface = open("./test_not_surface", "w")  ## records before *SURFACE
fp_surface = open("./test_surface", "w")          ## records after *SURFACE
fp_underline = open("./test_underline", "w")

fname = "./test2.inp"
fp_in = open(fname, "r")
for rec in fp_in:
    if "end part" in rec.lower():
        end_found = True
    if not end_found:
        if "*SURFACE" in rec:     ## process previous group of records
            surface_found = True
            if len(group_list):
                process_group(group_list, fp_surface, fp_underline)
                group_list = []
        if surface_found:
            if not rec.strip().startswith(("**HM", "**HW")):  ## omit these
                group_list.append(rec)
        else:     ## records before the first "*SURFACE"
            fp_not_surface.write(rec)

if len(group_list):
    process_group(group_list, fp_surface, fp_underline)


fp_in.close()
fp_not_surface.close()
fp_surface.close()
fp_underline.close()

maddocspace 0 Newbie Poster · Answer 1 · 2011-08-26T01:37:02+00:00

pyTony,

Sorry my bad.

What the code should do is parse the attached file, then read and print (I should say write into a file) all the lines before it reaches the first "*SURFACE". From there it should read each line, group it under its key (s1, s2, s3, s4, s5, s6), and write a line on top of each group, as follows:

*elset, elset = namesurf_keys

Then, if there are other "*SURFACE" lines followed by lines starting with "_", print them as follows:
*SURFACE, (and whatever else is in the line)
_ElsetName, keys

Then stop once it reaches *END PART.

Thanks for all the suggestions. I will make all the modifications tomorrow and post the revised code.

Gribouillis 1,391 Programming Explorer Team Colleague · Answer 2 · 2011-08-26T08:14:46+00:00

Perhaps you could take your input file test2.inp and write another file by hand with your exact expected output so that we understand what your program is supposed to do.

maddocspace 0 Newbie Poster · Answer 3 · 2011-08-26T12:32:41+00:00

Thanks everyone for your contributions. I have a long way to go before I am able to code decently.

I attached what the result should be. The new parts are between lines of stars.

test2result.inp_.zip (1.59 KB)

TrustyTony 888 pyMod Team Colleague Featured Poster · Answer 4 · 2011-08-26T13:39:38+00:00

Here is my suggested cleanup of the original code. The output is identical, except that correcting the buggy test for end part ('is' instead of 'in') puts the end part line at the end of the output:

#!/usr/bin/env python

elsurf={}
other_line = []
final_set=[]
final_block=[]

for index, line in enumerate(open("test2.inp","r")):
    line_strip = line.strip()
    splitted = [tostrip.strip() for tostrip in line_strip.split(",")]
    
    if "*SURFACE" in line_strip:
        # output previous collected block
        for set_name in elsurf:
            final_set.append("*elset, name = %s_%s" % (card_name, set_name))
            # rows of at most 16 ids; // keeps the division an integer on any Python version
            for chunk in [elsurf[set_name][i*16:(i+1)*16] for i in range(0,(len(elsurf[set_name])//16)+1)]:
                if chunk:
                    final_set.append(",".join(chunk))
            final_block.append("%s_%s, %s" % (card_name, set_name, set_name))
        # resetting elsurf for this surface block
        elsurf={}
        this_card_name = splitted[1].split("=")[1]
        if this_card_name and "TYPE = ELEMENT" in line_strip:
            card_name = splitted[1].split("=")[1]
            surfType = splitted[2]
            final_block.append("*surface, name=%s, %s" % (card_name, surfType))

    elif line_strip.startswith("_"):
        final_block.append(line_strip)

    elif line_strip.endswith(("S1","S2","S3", "S4", "S5", "S6")):
        elsurf.setdefault(splitted[1], []).append(splitted[0])

    # makes difference in output compared to original, as 'in' fixed
    elif "*End Part" in line_strip:
        final_block.append(line_strip)

    elif not line_strip.startswith(("**HM","**HW")):
        other_line.append(line_strip)
        
print('\n'.join(other_line))
print('\n'.join(final_set))
print('\n'.join(final_block))
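
The 'is' versus 'in' correction in the code above comes down to identity versus content. A small sketch of the difference follows; the variable names and the second sample line are made up for illustration. The result of the 'is' comparison is an implementation detail, so no particular output is claimed for it.

```python
marker = "*End Part"
line_strip = "  *End Part\n".strip()   # built at run time, like a line read from a file

# `is` asks whether both names refer to the very same object in memory.
# For a string produced by strip() this is usually False in CPython, and the
# result is an implementation detail that should not be relied on.
print(line_strip is marker)

# `==` compares the text itself.
print(line_strip == marker)

# `in` is a substring test, so it also accepts extra text on the line.
print(marker in line_strip)
print(marker in "*End Part, name=PART-1")   # hypothetical longer line
```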

TrustyTony 888 pyMod Team Colleague Featured Poster · Answer 5 · 2011-08-26T14:13:28+00:00

A little cleaner still, eliminating this_card_name (the numbers are also grouped in a slightly different way here):

#!/usr/bin/env python

elsurf={}
other_line = []
final_set=[]
final_block=[]

for index, line in enumerate(open("test2.inp","r")):
    line_strip = line.strip()
    splitted = [tostrip.strip() for tostrip in line_strip.split(",")]
    
    if "*SURFACE" in line_strip:
        # output previous collected block
        for set_name in elsurf:
            final_set.append("*elset, name = %s_%s" % (card_name, set_name))
            numbers = []
            for number in elsurf[set_name]:
                numbers.append(number)
                if len(numbers) == 16:
                    final_set.append(",".join(numbers))
                    numbers = []
            if numbers:
                final_set.append(",".join(numbers))

            final_block.append("%s_%s, %s" % (card_name, set_name, set_name))
        # resetting elsurf for this surface block
        elsurf={}

        if '=' in splitted[1] and "TYPE = ELEMENT" in line_strip:
            card_name = splitted[1].split("=")[1]
            final_block.append("*surface, name=%s, %s" % (card_name, splitted[2]))

    elif line_strip.startswith("_"):
        final_block.append(line_strip)

    elif line_strip.endswith(("S1","S2","S3", "S4", "S5", "S6")):
        elsurf.setdefault(splitted[1], []).append(splitted[0])

    # makes difference in output compared to original, as 'in' fixed
    elif "*End Part" in line_strip:
        final_block.append(line_strip)

    elif not line_strip.startswith(("**HM","**HW")):
        other_line.append(line_strip)
        
print('\n'.join(other_line))
print('\n'.join(final_set))
print('\n'.join(final_block))

This only moves the 'end part' line to the end of the output. I have not yet checked it against your desired output.
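
The grouping of numbers into rows of 16 can also be written as a small helper that steps through the list in slices. This is a sketch, not part of the posted scripts: the name chunks and the made-up element ids are assumptions for illustration.

```python
def chunks(items, size=16):
    """Yield successive slices of at most `size` items."""
    for start in range(0, len(items), size):
        yield items[start:start + size]

numbers = [str(n) for n in range(1, 36)]          # 35 made-up element ids
rows = [",".join(chunk) for chunk in chunks(numbers)]
for row in rows:
    print(row)
```

Because range stops before len(items), no empty trailing row is produced, so the "if line:" guard of the first version is not needed.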

maddocspace 0 Newbie Poster · Answer 6 · 2011-08-26T14:45:22+00:00

Thanks again pyTony, I was working on cleaning up the code as suggested.

It is easier on the eye.

maddocspace 0 Newbie Poster · Answer 7 · 2011-09-01T14:56:51+00:00

Hi,

I haven't been able to make the script write the last surface as specified in the attached zip file, as below.

*elset, elset=top_s3
4972, 4975, 4987, 4990, 4999, 5002, 5011, 5014, 5023, 5026, 5035, 5386, 5399, 5400 
*surface, name=TOP, TYPE = ELEMENT
top_s3

I have tried to wrap the main if block in a while loop, but I get either an infinite loop or nothing.

Any pointers?

cheers

Gribouillis 1,391 Programming Explorer Team Colleague · Answer 8 · 2011-09-03T02:28:19+00:00

Here is a small program which does not completely write your output file, but it finds the surfaces of interest in your input file and collects the data for these surfaces. With a little more work, you should be able to produce your output.

from collections import defaultdict
import itertools
import re
name_regex = re.compile(r"NAME\s*=\s*(?P<name>\w+)")
s_regex = re.compile(r"^\s*(?P<number>\d+)\s*,\s*(?P<s>S\d)\s*$")

def find_surfaces(lines):
    n = len(lines)
    for lineno, line in enumerate(lines):
        if line.startswith("*SURFACE,"):
            match = name_regex.search(line)
            i = lineno + 1
            if i < n and s_regex.match(lines[i]):
                yield lineno, match.group("name")

def parse(filename):
    lines = open(filename).readlines()
    for lineno, name in find_surfaces(lines):
        print lineno, name
        D = defaultdict(list)
        for i in itertools.count(lineno+1):
            match = s_regex.match(lines[i])
            if match:
                number, s = match.group("number"), match.group("s")
                D[s].append(number)
            else:
                break
        print dict(D)
        print "next line is ", i


if __name__ == "__main__":
    parse("test2.inp")

""" my output -->
41 CONTACT_SURF_LWRSTR_TO_LWRCOVER
{'S2': ['203', '202', '200', '199', '198', '197', '196', '201', '195', '194', '192', '1593', '1592', '1589', '1590', '1587', '1591', '1586', '1585', '1588'], 'S1': ['6503', '6502', '6512', '6517', '6666', '6516', '6515', '6514', '6513', '6510', '6511', '8158985', '6507']}
next line is  75
75 CONTACT_SURF_LWRSTR_TO_RIBFEET
{'S1': ['6503', '6502', '6512', '6517', '6666', '6516', '6515', '6514', '6513', '6510', '6511', '8158985', '6507']}
next line is  89
116 TOP
{'S3': ['4972', '4975', '4987', '4990', '4999', '5002', '5011', '5014', '5023', '5026', '5035', '5386', '5399', '5400']}
next line is  131
"""

Edit: notice that lines are numbered from 0 in the program, so line 75 is line 76 in an ordinary editor...
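
To see what the two regular expressions pick out, and to get editor-style line numbers directly, they can be tried on a few lines with enumerate starting at 1. The sample lines below are made up in the style of the output shown above, not copied from test2.inp.

```python
import re

# Same patterns as in the program above
name_regex = re.compile(r"NAME\s*=\s*(?P<name>\w+)")
s_regex = re.compile(r"^\s*(?P<number>\d+)\s*,\s*(?P<s>S\d)\s*$")

# Hypothetical sample lines
sample = [
    "*SURFACE, NAME = TOP, TYPE = ELEMENT",
    "      4972, S3",
    "      4975, S3",
    "*End Part",
]

surface_name = name_regex.search(sample[0]).group("name")

found = []
for lineno, line in enumerate(sample, 1):    # count from 1, like an editor
    match = s_regex.match(line)
    if match:
        found.append((lineno, match.group("number"), match.group("s")))

print(surface_name)
print(found)
```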

maddocspace 0 Newbie Poster · Answer 9 · 2011-09-05T16:50:49+00:00

Gribouillis, thanks for your help. I will modify your script and try to get the output I need.

I was also looking for an explanation of why the script that I submitted, then edited, was not getting through the last iteration.

cheers
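
On why the last block is missing: in the script as posted, the *elset lines for a surface are generated only inside the branch that handles a "*SURFACE" line, that is, when the next surface header is read. The last surface in the file is never followed by another "*SURFACE" line, so its collected data stays in elsurf and is never written. That is also why adding an extra *surface line makes it appear. A minimal sketch of one fix is to flush once more after the loop, in the same way woooee's version calls process_group after its loop. The names flush and parse_surfaces and the sample lines are made up for illustration, lines starting with "_" are left out to keep it short, and the output format only approximates the desired file.

```python
S_FACES = ("S1", "S2", "S3", "S4", "S5", "S6")

def flush(card_name, elsurf, final_set, final_block):
    """Emit the collected element ids of one surface (does nothing if empty)."""
    for set_name in sorted(elsurf):
        ids = elsurf[set_name]
        final_set.append("*elset, elset=%s_%s" % (card_name, set_name))
        for start in range(0, len(ids), 16):          # 16 ids per row
            final_set.append(", ".join(ids[start:start + 16]))
        final_block.append("%s_%s, %s" % (card_name, set_name, set_name))

def parse_surfaces(lines):
    elsurf, final_set, final_block = {}, [], []
    card_name = None
    for raw in lines:
        line = raw.strip()
        parts = [p.strip() for p in line.split(",")]
        if line.lower().startswith("*end part"):
            break                                     # stop parsing here
        if "*SURFACE" in line:
            # a new header closes the previous surface
            flush(card_name, elsurf, final_set, final_block)
            elsurf = {}
            card_name = parts[1].split("=")[1].strip()
            final_block.append("*surface, name=%s, %s" % (card_name, parts[2]))
        elif card_name and line.endswith(S_FACES):
            elsurf.setdefault(parts[1], []).append(parts[0])
    # the step the original lacks: the last surface has no following header
    flush(card_name, elsurf, final_set, final_block)
    return final_set, final_block

# Hypothetical input with a single, final surface
sample = [
    "*SURFACE, NAME = TOP, TYPE = ELEMENT",
    "4972, S3",
    "4975, S3",
    "*End Part",
]
final_set, final_block = parse_surfaces(sample)
print("\n".join(final_set + final_block))
```

Without the flush call after the loop, final_set would stay empty for this input, which matches the behaviour described in the question.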
