OK, so I'm working on a script that reads a large file and then splits it into separate files based on the field called PC Number. What I want to do is check whether the output file already exists: if it does, just append the new data; if it doesn't, write a new file with headers and everything. If there is a simple way to do this, I would greatly appreciate knowing how, because so far it has me really confused. Below is the code that I'm using to read and write the data.

import csv, datetime, os, shutil, glob, os.path

newrow={'PC Number': '', 'INSTALL NEW LUXURY SIGN - YES/NO': '',  'Status': '',  'Street Address': '',  'City': '', 'State': '',  'Zip': '',  'County': '', 'MLS#': '',  'Price': '',  'Agent Name': '',  'Office Name': '', 'Region': '',  'Office Address 1': '',  'Office Address 2': '',  'Office City': '',  'Office State': '',  'Office Zip': '',  'Office Phone': '',  'Number Manager': '',  'Name Manager': '',  'Email Address': '',  'Admin Name': '',  'Admin Email Address': '', 'Listing Expiration Date': ''}
new_field_names = newrow.keys()
dt = datetime.datetime.now().strftime("%m_%d_%y_%H_%M_%S")

os.chdir("/Users/HatterX/Desktop/Unprocessed Trello")
for FILE in glob.glob("LuxuryListings-040714*"):
    with open(FILE, 'r') as f1:
        cf1 = csv.DictReader(f1, fieldnames=('PC Number', 'INSTALL NEW LUXURY SIGN - YES/NO',  'Status',  'Street Address',  'City',  'State',  'Zip',  'County',  'MLS#',  'Price',  'Agent Name',  'Office Name',  'Region',  'Office Address 1',  'Office Address 2',  'Office City',  'Office State',  'Office Zip',  'Office Phone',  'Number Manager',  'Name Manager',  'Email Address',  'Admin Name',  'Admin Email Address', 'Listing Expiration Date' ))
        #cf2.writeheader()
        for row in cf1:
            with open('pc_Numbers_'+row['PC Number'].strip()+'.csv', 'ab+') as fid:
                cf2 = csv.DictWriter(fid, new_field_names)
                cf2.writerow(row)

OK, I'm trying to get it to write the headers when it processes the first line; then, when it loops through a second time, it should see that the file exists and just add the new information without printing the headers again.

You only need to keep a set of the PC Numbers already seen:

seen = set()
# ...

for row in ...:
    n = row['PC Number'].strip()
    mode = 'ab+' if n in seen else 'wb'
    with open('pc_Numbers_' + n +'.csv', mode) as fid:
        cf2 = csv.DictWriter(fid, new_field_names)
        if n not in seen:
            cf2.writeheader()
            seen.add(n)
        cf2.writerow(row)

However, it may be inefficient to open a file for every row. You could instead store a dictionary of open files and their DictWriter objects, with the PC Numbers as keys, so that each output file is opened only once.
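
The dictionary idea could look roughly like the sketch below. It is only one possible way to do it, and it makes a few assumptions that are not in the thread: it targets Python 3 (text mode with newline='' rather than the 'ab+'/'wb' modes above), the function name split_by_key and its parameters are made up for illustration, and it decides whether to write the header by checking whether the file already exists and is non-empty, so reruns append instead of overwriting.

```python
import csv
import os
from contextlib import ExitStack


def split_by_key(rows, field_names, key="PC Number", out_dir=".",
                 prefix="pc_Numbers_"):
    """Write each row to a per-key CSV file, opening every file only once.

    Hypothetical helper: the name and parameters are illustrative only.
    Returns the sorted list of key values that were written.
    """
    writers = {}  # key value -> csv.DictWriter bound to that value's file
    with ExitStack() as stack:  # closes every opened file when the block ends
        for row in rows:
            n = row[key].strip()
            if n not in writers:
                path = os.path.join(out_dir, prefix + n + ".csv")
                # A header is needed only if the file is missing or empty.
                need_header = (not os.path.exists(path)
                               or os.path.getsize(path) == 0)
                fid = stack.enter_context(open(path, "a", newline=""))
                writers[n] = csv.DictWriter(fid, field_names)
                if need_header:
                    writers[n].writeheader()
            writers[n].writerow(row)
    return sorted(writers)
```

In the original script this would be called as something like split_by_key(cf1, new_field_names) inside the with open(FILE) block. One caveat: every distinct PC Number keeps a file open until the end, so a very large number of distinct values could run into the operating system's limit on open files.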

Edited 2 Years Ago by Gribouillis
