Hi Folks,
I have a set of CSV files that I read row by row with csv.DictReader. This works fine 99% of the time, but occasionally one of the fields in a record contains an extra newline character. For example, here is the format of such a file:
field A~field B~field C~field D~field E~field F~field G
field A~field B~field C~field D~field E~field F~field G
field A~field B~field C~field D~field E~field F~field G
field A~field B~field C~field D~field E~field F~field G
field A~field B~field C~field D~field E~fieldF~field G
field A~field B~field C~field D~field E~field F~field G
field A~field B~field C~field D~field E~field F~field G
field A~field B~field C~field D~field E~field F~field G
...
...
The Python code I use to read a CSV file is:
import csv
fields = ["A", "B", "C", "D", "E", "F", "G"]
delim = "~"
lineReader = csv.DictReader(open('./input/26.dat', 'rb'), delimiter=delim, fieldnames=fields)
fileRows = []
for row in lineReader:
    fileRows.append(row)
This works great for most of the CSV files I read, but not for 'bad' files like the example above. The error I get when reading a file in this format is:
  File "<stdin>", line 1, in <module>
  File "/usr/lib/python2.6/csv.py", line 104, in next
    row = self.reader.next()
_csv.Error: new-line character seen in unquoted field - do you need to open the file in universal-newline mode?
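In case it helps anyone reproduce this without my data files, here is a minimal sketch. It rests on assumptions: the sample records are made up, the helper name `read_rows` is only for illustration, and I am guessing that the stray character is a bare carriage return ("\r") inside field F. It is written for Python 3, whereas the traceback above comes from Python 2.6, so the exact wording of the error may differ.

```python
import csv

fields = ["A", "B", "C", "D", "E", "F", "G"]

# Hypothetical sample records, not taken from the real files.
good = "a~b~c~d~e~f~g\n"
# Assumption: the "extra newline" is a bare carriage return inside field F.
bad = "a~b~c~d~e~f\rf~g\n"


def read_rows(lines):
    """Parse '~'-delimited lines into a list of dicts, like the loop above."""
    reader = csv.DictReader(lines, delimiter="~", fieldnames=fields)
    return [row for row in reader]


# Well-formed records parse without complaint.
rows = read_rows([good, good])

# A record with a stray carriage return in an unquoted field raises csv.Error.
try:
    read_rows([good, bad])
    error = None
except csv.Error as exc:
    error = str(exc)

print(len(rows))
print(error)
```

If the guess about the carriage return is right, the first call returns two rows and the second one fails in the same way as my real files do.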
I've tried googling the above error, but I can't find anything specific to my scenario. Any suggestions?