I am trying to learn Python by converting my C++ homework to Python. Below I am converting a Cobol file of employees to a comma delimited file.

from:
00001JAMES ADAMSON 010104000014550324201021500067500040010011593

to:
ID#,Last,First,Territory#,Office#,Salary,SSN,Department#,JobClass
00001,ADAMSON,JAMES,01,01,40000.00,145503242,01,02

I want to use EOF as my while loop control but I am getting a syntax error. Am I heading down the right path? Is there a better way to write this code? Any help would be great! Also this is my first posting about code, so if my format can be improved in any way, help here would be great too. Thanx in advance!

Lanier

fName=open('H:/CSC/Programs/CobolTran/Empcobol.dat')
WriteToo=open('H:/CSC/Employee.dat','w')
Trash='Start'

while Trash != EOF
	ID=fName.read(5)
	FullName=fName.read(26)
	Territory=fName.read(2)
	Office=fName.read(2)
	Salary=fName.read(6)
	SSN=fName.read(9)
	Dep=fName.read(2)
	JClass=fName.read(2)
	Trash=fName.readline()
	WriteToo.write(ID ','FullName','Territory',')

Lanier,

When using file i/o commands I use the following method. I am not an expert though so there may be a better way.

file = open('file.txt','r')
data = file.readlines()  # read every line into a list
file.close()
while data:
    fname = data[0][5:10]  # characters at index 5 through 9 of the first line
    data = data[1:]  # chops off the first line of data

Hope this helps

David
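
The same idea can be sketched without re-slicing the list on every pass, by looping over the lines directly and slicing each one. The two sample lines and the name `fields` are made up for illustration; in the real program `data` would come from `readlines()`.

```python
# hypothetical sample data standing in for file.readlines()
data = ["00001JAMES ADAMSON\n", "00002MARY  BAKER  \n"]

fields = []
for line in data:
    # line[5:10] covers index 5 up to, but not including, index 10
    fields.append(line[5:10])

print(fields)
```

Note that a slice keeps any padding spaces, so call `.strip()` on the result if you do not want them.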

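You can also combine the open and the read. In CPython the file is then closed automatically once nothing refers to the file object any more (other Python implementations may close it later):
for line in open(fname, "r"):  ## One line at a time
    print line
## or
data = open(fname, "r").readlines()
for line in data:
    print line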
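
If you would rather not depend on when the file object gets cleaned up, a `with` block (Python 2.6 or later) closes the file as soon as the block ends. This is only a sketch: the temporary file and its two lines are invented so the example can run on its own, and `fname` stands for whatever path you are reading.

```python
import os
import tempfile

# create a small throwaway file so the sketch is self-contained
tmp = tempfile.NamedTemporaryFile(mode="w", suffix=".dat", delete=False)
tmp.write("first line\nsecond line\n")
tmp.close()
fname = tmp.name

lines = []
with open(fname, "r") as infile:
    for line in infile:  # one line at a time
        lines.append(line.rstrip("\n"))

# the file is closed as soon as the with block ends
print(infile.closed)
print(lines)

os.remove(fname)
```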

Actually the whole exercise is a great example of Python's slicing operator ...

# slicing operator seq[begin : end(exclusive) : step]
# step is optional
# defaults are index begin=0, index end=len(seq), step=1
"""
separate info in line into
ID#,Last,First,Territory#,Office#,Salary,SSN,Department#,JobClass
00001,ADAMSON,JAMES,01,01,40000.00,145503242,01,02
"""
 
line = "00001JAMES ADAMSON 010104000014550324201021500067500040010011593"
# extract the id using slicing
id = line[0:5]
# test
print id, type(id)
 
# slice out the id and extract name and rest of data
# have to do it this way, because firstname and lastname are not
# of fixed length, but terminated with a space character
no_id = line[5:].split()
first = no_id[0]
last = no_id[1]
rest = no_id[2]
# test
print first, last, rest
 
# assume that next 6 data items are fixed in length
territory = rest[0:2]
office = rest[2:4]
salary = "%0.2f" % float(rest[4:10])
print territory, office, salary, type(salary)
# keep going ...
ssn = rest[10:19]
department = rest[19:21]
jobclass = rest[21:23]
# test
print ssn, department, jobclass
"""
my output so far -->
00001 <type 'str'>
JAMES ADAMSON 010104000014550324201021500067500040010011593
01 01 40000.00 <type 'str'>
145503242 01 02
"""
 
# the 9 extracted data variables look ok
# now form a comma separated string
format = "%s,%s,%s,%s,%s,%s,%s,%s,%s"
# the requested column order is ID#,Last,First,... so last goes before first
data = format % (id,last,first,territory,office,salary,ssn,department,jobclass)
# test it
print data
"""
my output -->
00001,ADAMSON,JAMES,01,01,40000.00,145503242,01,02
"""

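If the name really does sit in a fixed 26-character column, as the read(26) call in the original question suggests, every field can be cut by position instead of with split(). The sketch below assumes exactly that. The widths come from the read() calls in the question, while `FIELDS`, `split_fixed` and the padded sample record are made up for illustration.

```python
# (name, width) pairs taken from the read() calls in the original question
FIELDS = [("id", 5), ("fullname", 26), ("territory", 2), ("office", 2),
          ("salary", 6), ("ssn", 9), ("department", 2), ("jobclass", 2)]

def split_fixed(line):
    """Cut a fixed-width record into a dict of stripped field strings."""
    record = {}
    start = 0
    for name, width in FIELDS:
        record[name] = line[start:start + width].strip()
        start += width
    return record

# hypothetical record: the name is padded with spaces to 26 characters
sample = ("00001" + "JAMES ADAMSON".ljust(26) + "0101" + "040000"
          + "145503242" + "0102")
rec = split_fixed(sample)
print(rec["fullname"])
print(rec["salary"])
```

Anything after the last listed field is simply ignored, which matches the Trash=fName.readline() step in the question.
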
Vega,
Thanx for the lesson in slicing. The process makes so much sense now. The only thing I am unsure of is how I would run a while loop to slice an unknown number of lines from the file?
Lanier

# You can use a for loop
for line in open(fname, "r"):           ## One line at a time
     print line
#
# or a while loop
fp=open(fname, "r")
line = fp.readline()
while line:
     print line
     line = fp.readline()
fp.close()

This will work for the entire file. If you mean that you only want, say, the first 100 lines, keep a counter (ctr here, set to 0 before the loop and incremented by 1 inside it) and change the while loop to
while (line) and (ctr < 100):
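
Here is a small runnable sketch of that counter loop. Python has no EOF constant to compare against; readline() returns an empty string at the end of the file, and an empty string counts as false, which is what ends the loop. The temporary file, its five lines, and the names `max_lines` and `collected` are made up for the example.

```python
import os
import tempfile

# throwaway file with five lines, standing in for the COBOL data file
tmp = tempfile.NamedTemporaryFile(mode="w", delete=False)
tmp.write("".join("record %d\n" % n for n in range(5)))
tmp.close()

max_lines = 3   # hypothetical limit; the post above uses 100
ctr = 0
collected = []

fp = open(tmp.name, "r")
line = fp.readline()
# readline() returns '' at end of file, and '' counts as false
while line and ctr < max_lines:
    collected.append(line.rstrip("\n"))
    ctr += 1
    line = fp.readline()
fp.close()
os.remove(tmp.name)

print(collected)
```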

Like woooee already pointed out, Python makes that very easy for you ...

# convert a COBOL data file to a csv file
 
def cobol2csv(line):
    """
    separate COBOL type info in line
    eg. 00001JAMES ADAMSON 010104000014550324201021500067500040010011593
    to a csv string
    ID#,Last,First,Territory#,Office#,Salary,SSN,Department#,JobClass
    eg. 00001,ADAMSON,JAMES,01,01,40000.00,145503242,01,02
    """
    # extract the id using slicing
    id = line[0:5]
    # slice out the id and extract name and rest of data
    # have to do it this way, because firstname and lastname are not
    # of fixed length, but terminated with a space character
    no_id = line[5:].split()
    first = no_id[0]
    last = no_id[1]
    rest = no_id[2]
    # assume that next 6 data items are fixed in length
    territory = rest[0:2]
    office = rest[2:4]
    salary = "%0.2f" % float(rest[4:10])
    ssn = rest[10:19]
    department = rest[19:21]
    jobclass = rest[21:23]
    # now form a comma separated string and return it
    # (last name before first name, to match the header in the docstring)
    format = "%s,%s,%s,%s,%s,%s,%s,%s,%s"
    data = format % (id,last,first,territory,office,salary,ssn,department,jobclass)
    return data
 
#infile = open(r'H:/CSC/Programs/CobolTran/Empcobol.dat')
#outfile = open(r'H:/CSC/Employee.dat','w')
# for test only ...
infile = open("empcobol.dat","r")
outfile = open("empcsv.dat", "w")
 
for cobol_line in infile:
    csv_line = cobol2csv(cobol_line) + '\n'  # add a new line char
    outfile.write(csv_line)
    # test
    print cobol_line
    print csv_line

# close both files so everything written is flushed to disk
infile.close()
outfile.close()
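
As an alternative to building the comma string by hand, the standard csv module can write the rows, and it also takes care of quoting if a field ever contains a comma. This is a sketch written for Python 3, not the code from the posts above: `HEADER`, `cobol_fields` and the in-memory `out` buffer are assumptions for illustration, and with a real output file you would open it with newline="".

```python
import csv
import io

HEADER = ["ID#", "Last", "First", "Territory#", "Office#",
          "Salary", "SSN", "Department#", "JobClass"]

def cobol_fields(line):
    """Return the nine fields of one record as a list of strings."""
    first, last, rest = line[5:].split()
    salary = "%0.2f" % float(rest[4:10])
    return [line[0:5], last, first, rest[0:2], rest[2:4],
            salary, rest[10:19], rest[19:21], rest[21:23]]

records = ["00001JAMES ADAMSON 010104000014550324201021500067500040010011593\n"]

out = io.StringIO()          # in-memory stand-in for the output file
writer = csv.writer(out)
writer.writerow(HEADER)      # header line first
for record in records:
    writer.writerow(cobol_fields(record))

rows = out.getvalue().splitlines()
print(rows[1])
```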