I've got a CSV file with a bunch of stock information that I need to whittle down to a specific format for another program to read. The first line of the CSV file contains the header; the file has 29 keys and I need to get down to 11. After that I have to run a test on one of the keys' values, removing the whole row if the test matches. Finally, each remaining row has to be joined together as a single string with an EOL and written to a file.

This is the order I'm attempting to do things in:

  1. Read in the CSV file as a dictionary
    a. CSVreader = csv.DictReader(open('sampledata.csv', 'rb'))
  2. Remove all the excess keys and their values from the dict
    a. del CSVreader[key] doesn't work...?
  3. Perform a test on the values of one of the keys and remove the entire row if it matches
    a. if ExeShare == 0, delete the row from the dict
  4. Print the leftover rows in a specific ordered format with NO commas
    a. Each row's data is joined together as one long string
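
A minimal end-to-end sketch of those four steps (Python 3 syntax; the file contents and column names like ExeShares are made up for illustration) might look like:

```python
import csv

# tiny stand-in for the real 29-column file (names and values invented)
with open('sampledata.csv', 'w') as f:
    f.write('Date,Account,Symbol,ExeShares,ExePrice\n'
            '20100104,ACC1,IBM,100,122.50\n'
            '20100104,ACC1,MSFT,0,30.10\n')

KEEP = ['Date', 'Account', 'Symbol', 'ExeShares']   # the keys to keep

lines = []
with open('sampledata.csv', newline='') as fin:
    for row in csv.DictReader(fin):                 # 1. each row is a dict
        row = {k: row[k] for k in KEEP}             # 2. drop the excess keys
        if int(row['ExeShares']) == 0:              # 3. skip rows that fail the test
            continue
        lines.append(''.join(row[k] for k in KEEP)) # 4. join with no commas

with open('output.txt', 'w') as fout:               # one row per line
    fout.write('\n'.join(lines) + '\n')

print(lines)   # ['20100104ACC1IBM100']
```

Note that the keys are removed from each row dictionary, not from the reader itself.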

I'm assuming that because the first line contains the header, DictReader will correctly name all the keys, but whenever I try to del a key it throws "DictReader instance has no attribute '__delitem__'".

This is my first time working with the CSV module and I can't get past step 2, so any and all help would be most appreciated!

This short example may help you:

# experimenting with Python's csv module
# used to process Comma Separated Values
# the type of files generated by most common spreadsheet programs
# read the rows as dictionaries and remove one of the keys

import csv

# the csv test data
# here the header line is used for the keys
data = """\
name,age,pay,weight
ted,23,33000,188
mary,37,37000,241
bob,26,41000,167
"""

# create the test file
fout = open("mydata.csv", "w")
fout.write(data)
fout.close()

# test it ...
# read the csv data file
# note that csv.DictReader is an iterator that consumes the file,
# so it has to be recreated before the data can be read again
dic_read = csv.DictReader(open("mydata.csv", "rb"))
# show each row 
# note that each row/line is a dictionary 
# the header line has been converted to the keys
for line in dic_read:
    print line

# refresh the generator
dic_read = csv.DictReader(open("mydata.csv", "rb"))
# create a list of processed dictionaries 
dict_list = []
for line in dic_read:
    # remove/delete key 'pay'
    del line['pay']
    dict_list.append(line)

# show the list of modified dictionaries
for line in dict_list:
    print line

my output -->
{'pay': '33000', 'age': '23', 'name': 'ted', 'weight': '188'}
{'pay': '37000', 'age': '37', 'name': 'mary', 'weight': '241'}
{'pay': '41000', 'age': '26', 'name': 'bob', 'weight': '167'}
{'age': '23', 'name': 'ted', 'weight': '188'}
{'age': '37', 'name': 'mary', 'weight': '241'}
{'age': '26', 'name': 'bob', 'weight': '167'}
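
To cover the remaining step, joining each modified row into a single comma-free string and writing it out, the example could be extended along these lines (Python 3 syntax; the field order and output filename are invented):

```python
# continue from a list of row dictionaries like the ones above
dict_list = [
    {'age': '23', 'name': 'ted', 'weight': '188'},
    {'age': '37', 'name': 'mary', 'weight': '241'},
]

# dictionaries are unordered, so fix the output order explicitly
field_order = ['name', 'age', 'weight']

out_lines = [''.join(row[k] for k in field_order) for row in dict_list]
with open('joined.txt', 'w') as fout:
    fout.write('\n'.join(out_lines) + '\n')   # one joined row per line

print(out_lines)   # ['ted23188', 'mary37241']
```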


Thanks for the help, I ended up not using the csv module and came up with something like this:

FileIn = open('example_data.csv','rb')
titles = FileIn.next().strip().split(',')

for row in FileIn:
    values = row.strip().split(',')
    data = dict(zip(titles, values))

# Make sure shares were executed
    if int(data[' ExeShares']) > 0:
        Date = data[' Date']        
        Account = data[' Account']
        Symbol = data[' Symbol']
        Route = data[' Route']
        ExeShares = data[' ExeShares']
        ExePrice = data[' ExePrice']
        print Date, Account, Symbol, Route, ExeShares, ExePrice

Any thoughts?

Sure, you can write it the long way, but remember that Python's modules are well tested and highly optimized.
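
For instance, the same filtering can be done with csv.DictReader, which even copes with the spaces after the commas in that header via skipinitialspace=True. A sketch in Python 3 syntax (the sample data is invented to match the snippet above):

```python
import csv

# made-up sample resembling the data in the snippet above
with open('example_data.csv', 'w') as f:
    f.write('Date, Account, Symbol, Route, ExeShares, ExePrice\n'
            '20100104, ACC1, IBM, ARCA, 100, 122.50\n'
            '20100104, ACC1, MSFT, ARCA, 0, 30.10\n')

wanted = ['Date', 'Account', 'Symbol', 'Route', 'ExeShares', 'ExePrice']

rows = []
with open('example_data.csv', newline='') as fin:
    # skipinitialspace strips the blank after each comma, so the
    # keys come out as 'Account' rather than ' Account'
    for data in csv.DictReader(fin, skipinitialspace=True):
        if int(data['ExeShares']) > 0:        # make sure shares were executed
            rows.append(' '.join(data[k] for k in wanted))

print(rows)   # ['20100104 ACC1 IBM ARCA 100 122.50']
```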