This article has been dead for over three months
You
I've seen a lot of posts over time relating to importing and exporting CSV. Most of the answers involve using TextReader.readline and String.split, which will not work with any but the simplest data. When you are writing for users who have the expectation that your application will work with anything they export from Excel, you will have to cover all the CSV bases.
I hope that this post will give you a clear idea of what you're involving yourself in when you decide you need to read a CSV file.
The characteristics of CSV data are:The comma character is used to separate the fields. (This is obvious, but I thought I'd start with a couple of obvious points to warm up)
The last field on the line is ended by the end-of-line character or the physical end of the file.
Any field that contains commas, double-quotes or control characters will be surrounded by double-quote marks.
Any double-quote marks within the field will be "escaped" with an extra double-quote character. For instance, the value Paul "Chuck" Norris would be converted to "Paul ""Chuck"" Norris".
After the opening double-quote ANY character is valid. Including line breaks. This is why you can't use readline.
I have a parser class that works nicely and I'll post that on this thread in a little while, along with some notes on using it.
Cheers all!
My answer to the CSV import problem is the CsvParser class. The code will follow shortly, but using the class is illustrated with a little example. This code takes a CSV input file, which we will assume has a header row, with some specific header values assumed to be present. The data is assumed to follow in subsequent rows and the values are assumed to be in the columns corresponding to the headers.
Dim parser As New CsvParser(new IO.StreamReader(txtFilePath.Text))
dim hdr as List(Of String) = parser.parseToList
dim idxName as Integer = hdr.IndexOf("Name")
dim idxEmail as Integer = hdr.IndexOf("Email")
While Not parser.eof
dim value as List(Of String) = parser.parseToList
dim name as String = value(idxName)
dim email as String = value(idxEmail)
' Do something with the values...
End While
parser.close()
And that's it. The class itself follows:
Public Class CsvParser
Private Const CR As Integer = Asc(vbCr)
Private Const LF As Integer = Asc(vbLf)
Private src As TextReader
Private ch As Integer
Public Sub New(ByVal reader As TextReader)
src = reader
advance()
End Sub
Public Sub close()
src.Close()
End Sub
Protected Sub advance()
ch = src.Read
End Sub
Public ReadOnly Property eof() As Boolean
Get
Return (ch < 0)
End Get
End Property
Public ReadOnly Property eol() As Boolean
Get
Return (ch = CR) OrElse (ch = LF) OrElse eof
End Get
End Property
Public Function parseToList() As List(Of String)
Dim fld As New List(Of String)
While Not eol
fld.Add(nextField)
End While
If Not eof Then
Dim nextCh As Integer = src.Read
If ch = CR And nextCh = LF Then
advance()
Else
ch = nextCh
End If
End If
Return fld
End Function
Protected Function nextField() As String
Dim firstCh As Char = Chr(ch)
Dim fld As String
Select Case firstCh
Case "0"c To "9"c
fld = numericField()
Case """"
fld = quotedField()
Case Else
fld = unquotedField()
End Select
Return fld
End Function
Protected Function numericField() As String
Dim fld As String = ""
Dim theCh As Char = Chr(ch)
While Not eol AndAlso theCh <> ","c
fld += theCh
advance()
theCh = Chr(ch)
End While
If theCh = ","c Then advance()
Return fld
End Function
Protected Function unquotedField() As String
Return numericField()
End Function
Protected Function quotedField() As String
Dim fld = ""
Dim stillGoing As Boolean = True
advance() ' Skip the opening quote
Dim thisCh As Char = Chr(ch)
Dim follCh As Char = Chr(src.Peek)
While stillGoing
If thisCh = """"c And follCh = """"c Then ' escaped double-quote
fld += """"
advance() ' Skip the extra quote mark
ElseIf thisCh = """"c Then
stillGoing = False
Else
fld += thisCh
End If
advance()
thisCh = Chr(ch)
follCh = Chr(src.Peek)
End While
If thisCh = ","c Then advance() ' Skip the delimiter
Return fld
End Function
End Class