I'm really stuck here, and would appreciate some help. I have a function that reads from a CSV file that can get quite large, and it blocks some other routines from running in their allocated time slice. I need to make this code “non-blocking”. I think there might be a way to accomplish this using the fcntl module, but I'm not sure how to implement it. The code is as follows:

def csv_parser(self):
    # read the csv and append each field to self.csvData as a float
    try:
        openfile = open(g.FS_TMP_TREND, "r")
        reader = csv.reader(openfile, dialect='excel', delimiter="|")
        for row in reader:
            for i in range(12):
                if row[i] == '':
                    v = 0
                else:
                    v = float(row[i])
                self.csvData[i].append(v)
            self.csvWriteIndex += 1
        openfile.close()
    except Exception:
        err = error_dialog(_('There was a problem loading csv data'))
        err.show()
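(For what it's worth, fcntl mainly covers file locking and non-blocking flags on pipes and sockets; it won't make reads of a regular file non-blocking. A more common approach is to do the parsing in a background thread and hand finished rows over a queue. A minimal sketch, where the path, delimiter, and 12-field layout are just placeholders matching the code above:)

```python
import csv
import queue
import threading

def parse_csv_async(path, out_queue, n_fields=12):
    """Parse `path` in a worker thread, putting one list of floats per
    row onto `out_queue`, followed by None as an end-of-data sentinel."""
    def worker():
        with open(path, newline="") as f:
            reader = csv.reader(f, dialect="excel", delimiter="|")
            for row in reader:
                # '' becomes 0.0; everything else is parsed as a float
                out_queue.put([float(v) if v else 0.0 for v in row[:n_fields]])
        out_queue.put(None)  # signal that parsing is finished
    t = threading.Thread(target=worker, daemon=True)
    t.start()
    return t
```

The main loop can then poll the queue with `out_queue.get_nowait()` in its own time slice instead of stalling on the file read.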

thank you for any help you can offer


if row[i] == '':
    v = 0
else:
    v = float(row[i])
self.csvData[i].append(v)

In this code, it could be the append statement that is taking the time, since it is a large file. Store "i" and "v" in a dictionary until the end. After the loop is finished, instead of appending, it would probably be faster to loop through self.csvData and append to a second list with the changes added. I don't know the internals of lists, but appending to the first list may mean that data has to be moved when something additional is inserted, which takes time; writing to a second list, or a file, just adds it on at the end. Using a dictionary might be faster still, or multiple lists with a smaller amount of data per list. It would be simple to time this function, with just two calls to datetime.datetime.now() before and after, to see if it is hanging the program.
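The two-calls-to-now() timing idea can be sketched as a small helper (the wrapper name is just a stand-in):

```python
import datetime

def timed(fn, *args, **kwargs):
    """Run fn(*args, **kwargs) and print the elapsed wall-clock time."""
    start = datetime.datetime.now()
    result = fn(*args, **kwargs)
    elapsed = datetime.datetime.now() - start
    print("%s took %.3fs" % (fn.__name__, elapsed.total_seconds()))
    return result
```

In newer Python, `time.perf_counter()` gives a higher-resolution clock for this, but the datetime approach described above works the same way.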

As far as non-blocking goes, on Linux you can nice the process to a higher number (see man nice), which means it runs at a lower priority. Blocking is more of an OS-level matter, unless you are talking about blocking something within the same program.
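The same effect is available from inside the program via os.nice on POSIX systems; a small sketch (the increment of 10 is arbitrary, and os.nice does not exist on Windows):

```python
import os

def lower_priority(increment=10):
    """Raise this process's nice value, lowering its CPU priority.
    Returns the new niceness, or None where os.nice is unavailable."""
    if hasattr(os, "nice"):  # POSIX only
        return os.nice(increment)
    return None
```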

Edit: What is self.csvData.append(v)? Do you mean
self.csvData = v?
And can you just write each line out as it is encountered?

for row in reader:
    output_row = []     ## empty list for each line
    for i in range(12):
        if row[i] == '':
            v = 0
        else:
            v = float(row[i])
        output_row.append(v)
    ## row is finished, so write it to the file
    writer.writerow(output_row)  ## or whatever you are using

It appears you are right: the large delay we are experiencing comes from the nested "for" loops and not from the file read. That is very puzzling, and concerning, to us, because this program is part of a real-time system and this task is set to a very low priority, so it should not block the higher-priority tasks. We just assumed it was something to do with OS operations in the read function conflicting with the real-time tasks. Oops, I guess that's what I get for assuming.
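Since the nested loops turned out to be the hot spot, it may be worth noting that the inner loop over the 12 fields can be collapsed into a single list comprehension, which cuts the per-row interpreter overhead; a hedged sketch of the same conversion (the helper name is hypothetical):

```python
def parse_row(row, n_fields=12):
    """Convert the first n_fields of a CSV row to floats, '' -> 0.0."""
    return [float(v) if v else 0.0 for v in row[:n_fields]]
```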

thank you for your help
Chad
