I have a file of census data that I need to read and then manipulate in several ways. The file looks like this:

CountyName
Population Housing Location

Adams County
34340 15175 W
Attala County
19661 8639 E
Benton County
8026 3456 N
Bolivar County
40633 14939 N
Calhoun County
15069 6902

At this point I am only trying to read the data in. Once that is complete, I was thinking about using a dictionary with the county name as the key, and then playing with different ways to format the output, e.g. printing all the (N)orth counties or totaling the population.

Below is some code I wrote to read the data and format it like this:

CountyName Population Housing Location

# ifstream to get data for census.
infile=open("G:/CSC/Programs/MSCEN/Misscen.dat")
lineodd=infile.readline()
while 1:
   lineeven=infile.readline()
   line=lineodd+" "+lineeven
   lineodd=infile.readline()
   print line
infile.close()

I have succeeded at creating what looks like an infinite loop. So the question is: how do I make the loop stop after I get all the data in, and is my plan of attack for the rest of the program going to work, or am I way off course?

This was a program I had to write for a C++ course I took, and I am using it to teach myself Python. Thanks in advance!!

Lanier

As luck would have it, I am learning Python in my bio course and came up with this. Hope it helps.

Python is a very elegant, higher-level language, so you don't want to think exactly the way you do in C++.

First I had to create a test data file:

# write the test data file
# alternating lines of ...
# county name
# population housing location

data = """Adams County
34340 15175 W
Attala County
19661 8639 E
Benton County
8026 3456 N
Bolivar County
40633 14939 N
Calhoun County
15069 6902 S"""

fout = open("county_test.txt", "w")
fout.write(data)
fout.close()

print "data file county_test.txt has been written"

Now I can read the data file back in and process it:

# create a dictionary from the data file
# dictionary pair = county_name: [population, housing, location]
# then process the county data

fin = open("county_test.txt", "r")

county_dic = {}

for item in fin:
    # strip off trailing newline char
    item = item.rstrip()
    if not item:
        # skip blank lines, item[0] would raise an IndexError
        continue
    if item[0].isalpha():
        county_name = item
    else:
        # create a list of the county's values
        county_values = item.split()
        # form the dictionary
        county_dic[county_name] = county_values

fin.close()

print county_dic
    
"""
my output (prettied up a little) =
{
'Bolivar County': ['40633', '14939', 'N'],
'Adams County': ['34340', '15175', 'W'],
'Attala County': ['19661', '8639', 'E'],
'Calhoun County': ['15069', '6902', 'S'],
'Benton County': ['8026', '3456', 'N']
}
"""

def add_population(county_dic):
    """add population of all counties in data file"""
    population_sum = 0
    for key, val in county_dic.iteritems():
        # list element at index zero is population
        population_sum += int(val[0])
    return population_sum
        

print "Total population of all counties in data file:"
print add_population(county_dic)

"""
my output =
Total population of all counties in data file:
117729
"""

You can use the loop that iterates over the dictionary for most of your other data processing.
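As a rough sketch of that idea: the county_dic literal below just repeats the test data from above, and counties_in() and population_of() are helper names I made up for illustration. I use print() with parentheses and .items() instead of iteritems() so that it should run on both Python 2 and Python 3. The len() check is there because the last record in the original listing has no location.

```python
# same shape as the dictionary built above:
# county_name: [population, housing, location]
county_dic = {
    'Adams County': ['34340', '15175', 'W'],
    'Attala County': ['19661', '8639', 'E'],
    'Benton County': ['8026', '3456', 'N'],
    'Bolivar County': ['40633', '14939', 'N'],
    'Calhoun County': ['15069', '6902', 'S'],
}

def counties_in(county_dic, location):
    """return a sorted list of county names with the given location"""
    names = []
    for key, val in county_dic.items():
        # list element at index two is location (may be missing)
        if len(val) > 2 and val[2] == location:
            names.append(key)
    return sorted(names)

def population_of(county_dic, names):
    """add population of the named counties"""
    return sum(int(county_dic[name][0]) for name in names)

north = counties_in(county_dic, 'N')
north_population = population_of(county_dic, north)

print("North counties: %s" % ", ".join(north))
print("Population of north counties: %d" % north_population)
```

With the test data this gives Benton County and Bolivar County, with a combined population of 48659.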

I am assuming that the records alternate: a county line followed by a population/housing/location line. This is how to use a readline() loop (untested code):

def process_county(r_list):
   print "County =", r_list[0]
   print "Population Housing Location =", r_list[1]

filename="county_test.txt"   ## assumed name, use your own data file
fp=open(filename, "r")
r_list=[]
rec=fp.readline()     ##county rec
while rec:            ## readline() returns "" at end of file
   r_list.append(rec.strip())
   rec=fp.readline()     ## population/housing/location rec
   r_list.append(rec.strip())
   process_county(r_list)
   r_list=[]
   rec=fp.readline()     ## county rec
fp.close()
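To spell out why the original "while 1:" loop never stops: readline() does not raise an error at the end of the file, it just keeps returning an empty string, so the loop needs an explicit exit. Here is a minimal sketch of the original loop with that test added. The function name read_county_lines() and the throwaway temp file are my own, just so the example is self-contained.

```python
import os
import tempfile

def read_county_lines(path):
    """return one combined 'county values' string per pair of lines"""
    combined = []
    with open(path) as infile:
        while True:
            lineodd = infile.readline()
            if not lineodd:
                # readline() returns "" at end of file, so stop here
                break
            lineeven = infile.readline()
            combined.append(lineodd.strip() + " " + lineeven.strip())
    return combined

# write a small throwaway data file just for this demonstration
sample = "Adams County\n34340 15175 W\nAttala County\n19661 8639 E\n"
fd, path = tempfile.mkstemp(suffix=".txt")
os.close(fd)
with open(path, "w") as fout:
    fout.write(sample)

lines = read_county_lines(path)
os.remove(path)

for line in lines:
    print(line)
```

A blank line in the middle of the file is "\n", not "", so it would not end the loop early, but it would throw the odd/even pairing off.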

Z,
That is just the info I was looking for and the method I was trying to use. Is there somewhere I can read about the for () in () loop in detail? I have found some info on it, but nothing really clear. Thanks for the help.

Lanier

The 'for loop' is the standard iteration loop in Python. Here are some simple examples:

# iterate through the characters of a string
for c in 'string':
    print c

"""
my result --->
s
t
r
i
n
g
"""


# iterate through the elements of a list
for n in [1, 2, 3]:
    print n

"""
my result --->
1
2
3
"""

# or since range(1, 4) --> [1, 2, 3]
for n in range(1, 4):
    print n

"""
my result --->
1
2
3
"""

# to get key, val pair of a dictionary use iteritems()
for key, val in {'Bob': 49, 'Fred': 35, 'Bill': 19}.iteritems():
    print "%s is %d years old" % (key, val)

"""
my result --->
Bob is 49 years old
Bill is 19 years old
Fred is 35 years old
"""
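Note that the dictionary result above did not come out in the order the pairs were typed in. If the order matters, one option is to loop over the sorted keys instead. A small sketch, with ages and lines as made-up variable names:

```python
ages = {'Bob': 49, 'Fred': 35, 'Bill': 19}

lines = []
# sorted() returns a list of the dictionary's keys in alphabetical order
for name in sorted(ages):
    lines.append("%s is %d years old" % (name, ages[name]))

for line in lines:
    print(line)
```

This should print Bill first, then Bob, then Fred, regardless of how the dictionary stores them internally.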

Thanks sneek!! Once again I just needed it broken down to my level, which is pretty low :)
