I'm working on code that loops through a folder, breaks each file name into specific parts, then reads certain parts of the broken-up name and writes them to a CSV file. The files are named as follows: test_PAQT_B2H.csv, test_PAQT_B4.csv, and test_PINI_B1H.csv. When the folder has just one file type, such as only AQT files, it works fine, but when there are INI files in it as well, the results CSV gets the INI file written multiple times, which messes up the readability of the data. I just want to know how to get it to read each file only once.

import datetime,glob,os,csv,fnmatch,StringIO,smtplib,argparse,math,re

parser = argparse.ArgumentParser(description='Search art folders.')
parser.add_argument('-b', help='The base path', required=False, dest='basePath', metavar='Base directory path',default='/home/hatterx/Desktop/beds')
parser.add_argument('-o', help='File Output Location', required=False, dest ='fileOutput', metavar='File Output', default='/home/hatterx/Desktop/bedsused')
args = parser.parse_args()

outputCount= args.fileOutput
DT = datetime.datetime.now().strftime("%Y_%m_%d")
dt = datetime.datetime.now().strftime("%Y/%m/%d %I:%M:%S%p")

def fileBreak(pathname):
    filepresent = os.path.isfile(args.fileOutput+'/filecount_'+DT+'.csv')   
    newrow={'Date':'', 'Total Files':'', 'Total Beds':'', 'Total SQFT':'', 'AQT Files':'', 'INI Files':'','AQT Beds':'','INI Beds':'','AQT Total SQFT':'','INI Total SQFT':'', 'AQT Half Beds':'','INI Half Beds':''}
    new_field_names = newrow.keys()

    filecount = {}
    bedcount = {}
    halfbedcount = {}
    sqftFactor = {"AQT":64, "INI":50, "n/a":10}

    for filename in os.listdir(pathname):
        print filename
        Extbreak = re.split('[.]', filename)[0]
        Printbreak = re.split('_p', Extbreak, flags=re.I)[1]
        Typebreak = re.split('_b', Printbreak, flags=re.I)[0]
        Bedbreak = re.split('_b', Extbreak, flags=re.I)[1]
        Halfsearch = re.search('h', Bedbreak, flags=re.I)
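        if Halfsearch:
            # Half bed: strip the trailing H to get the bed number
            Numbreak = re.split('h', Bedbreak, flags=re.I)[0]
            #print int(Numbreak)*.5
        else:
            # Full bed: the split leaves the number unchanged
            Numbreak = re.split('h', Bedbreak, flags=re.I)[0]
            #print Numbreak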

        if Typebreak not in filecount:
            filecount[Typebreak] = 0

        if Typebreak not in bedcount:
            bedcount[Typebreak] = 0

        if Typebreak not in halfbedcount:
            halfbedcount[Typebreak] = 0

        filecount[Typebreak] = filecount[Typebreak]+1
        if Halfsearch:
            halfbedcount[Typebreak] = halfbedcount[Typebreak] + int(Numbreak)*.5
        bedcount[Typebreak] = bedcount[Typebreak] + int(Numbreak)
        for type in filecount:
            print dt, type, str(filecount[type]), str(bedcount[type] - halfbedcount[type]), str(sqftFactor[type] * bedcount[type]-(sqftFactor[type]*halfbedcount[type]))
            with open(args.fileOutput+'/filecount.csv','ab') as f:
                # NOTE: the loop bodies were missing from the post; writing each
                # item as its own row is an assumption.
                data = [filename]
                writer = csv.writer(f)
                for item in data:
                    writer.writerow([item])
                data = [dt]
                writer = csv.writer(f)
                for item in data:
                    writer.writerow([item])
                data = [type]
                writer = csv.writer(f)
                for item in data:
                    writer.writerow([item])
                data = [type+" files: "+str(filecount[type])]
                writer = csv.writer(f)
                for item in data:
                    writer.writerow([item])
                data = [type+" bed count: "+str(bedcount[type] - halfbedcount[type])]
                writer = csv.writer(f)
                for item in data:
                    writer.writerow([item])
                data = [type+" SQFT: "+str(sqftFactor[type] * bedcount[type]-(sqftFactor[type]*halfbedcount[type]))]
                writer = csv.writer(f)
                for item in data:
                    writer.writerow([item])

Your code is very difficult to understand. I think you can do a lot with a single regular expression, like in this example

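# -*- coding: utf-8 -*-
"""
Created on Fri Jul 25 19:23:53 2014
python 2 or 3
@author: Gribouillis
"""
import re

# NOTE: the pattern body was lost from the post; this one is a
# reconstruction that reproduces the groups shown in the output below.
pattern = re.compile(
    r"^(?P<prefix>[^_]+)_P(?P<type>[^_]+)_B(?P<bed>(?P<num>\d+)(?P<half>H?))\.csv$",
    flags = re.I
)

if __name__ == "__main__":
    # File names taken from the question
    for filename in [
        "test_PAQT_B2H.csv",
        "test_PAQT_B4.csv",
        "test_PINI_B1H.csv",
    ]:
        match = pattern.match(filename)
        print(match.groupdict())

""" my output --->
{'prefix': 'test', 'num': '2', 'type': 'AQT', 'bed': '2H', 'half': 'H'}
{'prefix': 'test', 'num': '4', 'type': 'AQT', 'bed': '4', 'half': ''}
{'prefix': 'test', 'num': '1', 'type': 'INI', 'bed': '1H', 'half': 'H'}
"""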



Thank you. I have hit a new problem that falls into this same question. I changed the 'ab' to 'wb', and now it only writes the last item it processes, so the file ends up with just one field holding the information for INI, where I need it to have both AQT and INI.


The strange thing is that you open the file within a loop. It means that the same file is opened repeatedly by the program. Normally you open the output file once and loop to write a series of records.
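
A minimal sketch of that pattern, assuming Python 3. The write_records helper and the sample records are hypothetical, and an in-memory buffer stands in for the real output file so the example runs anywhere:

```python
import csv
import io

def write_records(out, records):
    """Write every record through a single csv.writer on one open file."""
    writer = csv.writer(out)
    for record in records:
        writer.writerow(record)

# Hypothetical records, one per file type
records = [("AQT", 2, 5.0), ("INI", 1, 0.5)]

# In a real script the file is opened once, outside any loop, e.g.:
#     with open("filecount.csv", "w", newline="") as f:
#         write_records(f, records)
buffer = io.StringIO()
write_records(buffer, records)
print(buffer.getvalue())
```

Because the file is opened once, 'w' mode no longer throws away earlier records: every row goes through the same open handle.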


OK, so how do I fix this? I just had it print the output as one data line, and it printed over 90 lines; it was just the files being looped over again and again until it finished.


OK, I figured it out. I moved it above the for statement, and it does everything I want it to do now.
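
For anyone who lands here with the same symptom, this is roughly the shape that avoids the repeated rows: tally every file first, then build one summary row per type after the loop has finished. It is only a sketch, assuming Python 3 and the file name layout from the question. The summarize function and the NAME pattern are made-up names, the half-bed arithmetic follows the original script (a half bed counts as half its number), and falling back to the "n/a" factor for unknown types is an assumption.

```python
import re
from collections import defaultdict

# Assumed file name layout: <prefix>_P<type>_B<number>[H].csv
NAME = re.compile(r"^[^_]+_P(?P<type>[^_]+)_B(?P<num>\d+)(?P<half>H?)\.csv$", re.I)

# Square feet per bed, taken from the original script
SQFT_FACTOR = {"AQT": 64, "INI": 50, "n/a": 10}

def summarize(filenames):
    """Return one (type, file count, bed count, sqft) row per file type."""
    files = defaultdict(int)
    beds = defaultdict(float)
    for name in filenames:
        m = NAME.match(name)
        if m is None:
            continue  # skip names that do not follow the layout
        kind = m.group("type").upper()
        num = int(m.group("num"))
        files[kind] += 1
        # A trailing H marks a half bed, which counts as half its number
        beds[kind] += num * 0.5 if m.group("half") else num
    # The summary is built once, after every file has been counted
    return [
        (kind, files[kind], beds[kind],
         SQFT_FACTOR.get(kind, SQFT_FACTOR["n/a"]) * beds[kind])
        for kind in sorted(files)
    ]

rows = summarize(["test_PAQT_B2H.csv", "test_PAQT_B4.csv", "test_PINI_B1H.csv"])
for row in rows:
    print(row)
```

Each type appears exactly once in rows however many files there are, because nothing is reported from inside the per-file loop.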
