OK, so I'm working on a script that reads a large file and splits it into separate files based on the field called PC Number. What I want to do is check whether the output file already exists: if it does, the script should just append the new data, but if it doesn't, it should write a new file with headers and everything. If there is a simple way to do this, I would greatly appreciate knowing how, because so far it has me really confused. Below is the code that I'm using to read and write the data.

import csv, datetime, os, shutil, glob, os.path

# Output columns, in the order they should appear in each file.
# A list keeps this order; dict key order is not guaranteed on older Pythons.
new_field_names = [
    'PC Number', 'INSTALL NEW LUXURY SIGN - YES/NO', 'Status', 'Street Address',
    'City', 'State', 'Zip', 'County', 'MLS#', 'Price', 'Agent Name',
    'Office Name', 'Region', 'Office Address 1', 'Office Address 2',
    'Office City', 'Office State', 'Office Zip', 'Office Phone',
    'Number Manager', 'Name Manager', 'Email Address', 'Admin Name',
    'Admin Email Address', 'Listing Expiration Date']
dt = datetime.datetime.now().strftime("%m_%d_%y_%H_%M_%S")

os.chdir("/Users/HatterX/Desktop/Unprocessed Trello")
for FILE in glob.glob("LuxuryListings-040714*"):
    with open(FILE, 'r') as f1:
        cf1 = csv.DictReader(f1, fieldnames=('PC Number', 'INSTALL NEW LUXURY SIGN - YES/NO',  'Status',  'Street Address',  'City',  'State',  'Zip',  'County',  'MLS#',  'Price',  'Agent Name',  'Office Name',  'Region',  'Office Address 1',  'Office Address 2',  'Office City',  'Office State',  'Office Zip',  'Office Phone',  'Number Manager',  'Name Manager',  'Email Address',  'Admin Name',  'Admin Email Address', 'Listing Expiration Date' ))
        #cf2.writeheader()
        for row in cf1:
            with open('pc_Numbers_'+row['PC Number'].strip()+'.csv', 'ab+') as fid:
                cf2 = csv.DictWriter(fid, new_field_names)
                cf2.writerow(row)

All 4 Replies

Use os.path.exists() or os.path.isfile() depending on what you want to do.
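
One way to apply this to the header problem, as a rough sketch: check whether the output file exists before opening it in append mode, and write the header only when it does not. The helper name `append_row` and the shortened `FIELD_NAMES` list are made up for illustration, and the code assumes Python 3, where the csv module wants a text-mode file opened with `newline=''` rather than the `'ab+'` mode used above.

```python
import csv
import os

# Hypothetical, shortened column list for illustration; use the full list in practice.
FIELD_NAMES = ['PC Number', 'Status', 'City']


def append_row(row, out_dir='.'):
    """Append one row to its per-PC-Number file, writing the header only for a new file."""
    name = 'pc_Numbers_' + row['PC Number'].strip() + '.csv'
    path = os.path.join(out_dir, name)
    # Check before opening: opening in append mode creates the file.
    is_new = not os.path.isfile(path)
    with open(path, 'a', newline='') as fid:
        writer = csv.DictWriter(fid, FIELD_NAMES)
        if is_new:
            writer.writeheader()
        writer.writerow(row)
    return path
```

Because the check looks at the disk rather than at what the current run has seen, files left over from an earlier run are appended to without a second header.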

OK, I'm trying to get it to write the headers when it processes the first line; then, when it loops through a second time, it should see that the file exists and just add the new information without writing the headers again.

You only need to keep the set of already seen PC Numbers:

seen = set()
# ...

for row in ...:
    n = row['PC Number'].strip()
    mode = 'ab+' if n in seen else 'wb'
    with open('pc_Numbers_' + n +'.csv', mode) as fid:
        cf2 = csv.DictWriter(fid, new_field_names)
        if n not in seen:
            cf2.writeheader()
            seen.add(n)
        cf2.writerow(row)

However, it may be inefficient to open a file for every row. You could store a dictionary of open files and DictWriter objects, with the PC numbers as keys.
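
A minimal sketch of that dictionary idea, under a few assumptions: Python 3 (text mode with `newline=''`), a hypothetical helper named `split_rows`, and a shortened `FIELD_NAMES` list standing in for the real columns. The header check here uses `os.path.isfile` plus a size test, which is one possible choice and not necessarily what the reply had in mind.

```python
import csv
import os

# Hypothetical, shortened column list for illustration; use the full list in practice.
FIELD_NAMES = ['PC Number', 'Status', 'City']


def split_rows(rows, out_dir='.'):
    """Write each row to its per-PC-Number file, opening every output file only once."""
    files = {}    # PC number -> open file object
    writers = {}  # PC number -> csv.DictWriter for that file
    try:
        for row in rows:
            n = row['PC Number'].strip()
            if n not in writers:
                path = os.path.join(out_dir, 'pc_Numbers_' + n + '.csv')
                # Write the header only if the file is new or empty.
                is_new = not os.path.isfile(path) or os.path.getsize(path) == 0
                files[n] = open(path, 'a', newline='')
                writers[n] = csv.DictWriter(files[n], FIELD_NAMES)
                if is_new:
                    writers[n].writeheader()
            writers[n].writerow(row)
    finally:
        # Close every file, even if a row raised an error.
        for fid in files.values():
            fid.close()
    return sorted(writers)
```

Passing the `csv.DictReader` from the original script as `rows` should work, since it yields one dictionary per line. Keep in mind that every distinct PC number holds a file open until the end, so a very large number of them could run into the operating system's limit on open files.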

Thank you for your help. It solved the issue I was having.
