Hey everyone. I'm trying to figure out how to use StreamReader and File.ReadAllLines in VB.NET to search a text file for keywords and pull out the information that follows them. To elaborate, I have a PDF which I convert to text, and once converted, the text file has a block that reads as follows:

                                                                             DATE: May 21, 2015
                                        DAILY REPORT
LOCATION NO: 12345                                                           SHOW ID: CRT-0115-0113
USER: Rod Serling                                                           INQUIRY NO: 2015-900-0000

Now, the text file continues on with a description, but this is the only block I'm concerned about. My idea is to have the StreamReader read the entire file until the end while looking for the keywords "DATE:", "LOCATION NO:", "USER:", "SHOW ID:", and "INQUIRY NO:". When it hits one of those, it pulls the information that follows it into a corresponding textbox. My trouble is that, being new to VB.NET, I'm having a hard time working out the logic for this.

I understand that I need to initialize the IO.StreamReader and have it read the entire file. I also have to declare a variable to use for finding the keywords. Something along the lines of:

Private Sub Button1_Click(sender As Object, e As EventArgs) Handles Button1.Click
    Dim Reader As New IO.StreamReader("C:\" & TheFileName.Text & ".txt") 'This will allow me to type in name of file in txtbox
    Dim line As String 'Holds the line currently being read
    Do
        line = Reader.ReadLine()
        Dim K As Integer = line.IndexOf("Insert my keywords here")
        'And it all falls apart on me on what to do after this.
    Loop Until Reader.EndOfStream
    Reader.Close()
End Sub

Any help or pointers to tutorials on this would be greatly appreciated.

You need a conditional block (an If statement). You'll also probably find that a While loop works better than a Do loop.

Private Sub Button1_Click(sender As Object, e As EventArgs) Handles Button1.Click
    Dim Reader As New IO.StreamReader("C:\" & TheFileName.Text & ".txt") 'This will allow me to type in name of file in txtbox
    While Not Reader.EndOfStream
        Dim line As String = Reader.ReadLine()
        If line.Contains("DATE:") Then 'Contains is case-sensitive, so match the file's "DATE:"
            'Once you find the first line of data that you
            'want, successive ReadLine operations will get
            'the rest.

            Exit While
        End If
    End While
End Sub
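
One way to fill in the If block, sketched under the assumption that the value sits on the same line as its keyword, as it does in the sample block. ValueAfter is a hypothetical helper name, not something from the framework.

```vbnet
Imports System

Module KeywordDemo
    ' Hypothetical helper: returns the text that follows keyword on the line,
    ' or Nothing when the keyword is absent.
    Function ValueAfter(line As String, keyword As String) As String
        Dim k As Integer = line.IndexOf(keyword)          ' 0-based, -1 if not found
        If k < 0 Then Return Nothing
        Return line.Substring(k + keyword.Length).Trim()  ' everything after the keyword
    End Function

    Sub Main()
        Dim line As String = "                      DATE: May 21, 2015"
        Console.WriteLine(ValueAfter(line, "DATE:"))      ' prints: May 21, 2015
    End Sub
End Module
```

Inside the While loop, the helper would be called on the line that was just read and the result assigned to a textbox's Text property. Calling ReadLine a second time would move on to the next line of the file instead.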

Thanks for the help, tinstaafl. I won't lie, after reading your advice, I spent an hour trying to figure out why the straightforward test line

        Dim Extractor As New IO.StreamReader(BrowseTextbox.Text) 'Pulls the location of file to extract
        While Not Extractor.EndOfStream 'While loop, if theres more lines keep extracting
            Dim line As String = Extractor.ReadLine()
            If line.Contains("DATE:") Then
                DateTextBox.Text = Extractor.ReadLine()
                Extractor.Close()

wasn't producing anything. Then, after looking at the StreamReader Class API for the 7th time, it finally stood out to me: "Returns the data as a string". I felt extremely silly at that moment. After expanding the code, I was able to get the text into the required text boxes, but I still need to figure out how to convert the strings back into types like Integer, and how to handle values that combine numbers with letters, like the date "May 21, 2015". I'm thinking I have to use TryParse to get this going, or am I way off?

Integer and DateTime both have a TryParse method which will work for this. DateTime would probably be more suitable for the date, since it only needs one parse operation to get all the parts of the date.
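
A small sketch of both TryParse calls on values from the sample block. Using CultureInfo.InvariantCulture is an assumption made here so that the English month name parses regardless of the machine's regional settings.

```vbnet
Imports System
Imports System.Globalization

Module ParseDemo
    Sub Main()
        Dim parsedDate As DateTime
        ' Assumption: invariant culture, which understands English month names
        If DateTime.TryParse("May 21, 2015", CultureInfo.InvariantCulture,
                             DateTimeStyles.None, parsedDate) Then
            Console.WriteLine(parsedDate.Year)    ' 2015
        End If

        Dim locationNo As Integer
        If Integer.TryParse("12345", locationNo) Then
            Console.WriteLine(locationNo + 1)     ' 12346
        End If

        ' Mixed text such as a show ID is not a number, so TryParse returns False
        Console.WriteLine(Integer.TryParse("CRT-0115-0113", locationNo))
    End Sub
End Module
```

If the goal is only to display the values, no conversion is needed, because a textbox's Text property is itself a String.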

I prefer to avoid using StreamReader unless the input file is exceedingly large. You are scanning for a series of keywords so I suggest you loop through the keywords rather than looping through the lines of text. You could do something like

Dim lines() As String = System.IO.File.ReadAllLines("D:\temp\test.txt")
Dim keywords() As String = {"DATE: ", "LOCATION NO: ", "SHOW ID: "}

For Each key In keywords
    Dim line() As String = Filter(lines, key)
    'extract tag
Next

Filter returns all lines containing the given string. To extract the tag (data) portion, you first extract everything following the keyword (notice that I included the trailing blank in each keyword). Then you have to throw away everything following the data. The end of the data is marked by either the end of the line or a run of two or more blanks. A single blank will not do as a marker, because values like "May 21, 2015" contain blanks themselves. You can do the first extraction by

Dim tag As String = line(0).Substring(InStr(line(0), key) + key.Length - 1)

You can find the end of the tag by looking for the first run of two blanks, falling back to the end of the string when there is none (InStr returns 0 when nothing is found)

Dim endtag As Integer = InStr(tag, "  ") - 1
If endtag < 0 Then endtag = tag.Length

and to extract the complete tag you can do

tag = tag.Substring(0, endtag)

so all together it is

Dim lines() As String = System.IO.File.ReadAllLines("D:\temp\test.txt")
Dim keywords() As String = {"DATE: ", "LOCATION NO: ", "SHOW ID: "}

For Each key In keywords
    Dim line() As String = Filter(lines, key)
    If line.Length = 0 Then Continue For           'keyword not in the file
    Dim tag As String = line(0).Substring(InStr(line(0), key) + key.Length - 1)
    Dim endtag As Integer = InStr(tag, "  ") - 1   '0-based position of the first double blank
    If endtag < 0 Then endtag = tag.Length         'none found: data runs to the end of the line
    tag = tag.Substring(0, endtag)
Next

Just add some code before the Next to do what you want with the given tag.
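
One thing that code before the Next could do is store each tag so that it survives the loop. The sketch below keeps the values in a Dictionary keyed by keyword. The in-memory lines stand in for the file, and treating a run of two blanks as the end of a value is an assumption about the layout of the converted PDF.

```vbnet
Imports System
Imports System.Collections.Generic
Imports Microsoft.VisualBasic

Module TagDemo
    Sub Main()
        ' Stand-in for File.ReadAllLines; spacing shortened from the sample block
        Dim lines() As String = {
            "            DATE: May 21, 2015",
            "LOCATION NO: 12345          SHOW ID: CRT-0115-0113"
        }
        Dim keywords() As String = {"DATE: ", "LOCATION NO: ", "SHOW ID: "}
        Dim tags As New Dictionary(Of String, String)

        For Each key As String In keywords
            Dim hits() As String = Filter(lines, key)
            If hits.Length = 0 Then Continue For        ' keyword not present
            Dim rest As String = hits(0).Substring(hits(0).IndexOf(key) + key.Length)
            ' Assumption: columns are separated by at least two blanks
            Dim cut As Integer = rest.IndexOf("  ")
            If cut >= 0 Then rest = rest.Substring(0, cut)
            tags(key.Trim()) = rest.Trim()
        Next

        For Each pair As KeyValuePair(Of String, String) In tags
            Console.WriteLine(pair.Key & " " & pair.Value)
        Next
    End Sub
End Module
```

After the loop, a lookup such as tags("SHOW ID:") returns the stored value without rescanning the file.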

Thanks for the input and explanation, Reverend Jim. It works like it's supposed to and I now understand why, which lets me build on this in the future. But I am struggling with the logic for moving the information into text boxes. Since this all happens within a For Each loop, "tag" is replaced by the value that follows the next keyword, as it's supposed to be. What I can't work out is how to say in VB.NET code, within the loop, "hey, before you move on to the next keyword, drop your current result in this textbox I've designated".

My attempts have led to results where the last value is broken up and dropped into the text boxes one character at a time. So if the last thing pulled was "Jimmy", it would drop J in the first text box and carry on until Jimmy is spelled out across all the text boxes. Cool, but not what I'm trying to accomplish.

Here's a little trick. Name your controls so that you can calculate the name from the keyword. For example

DATE:            txtDATE
LOCATION NO:     txtLOCA
SHOW ID:         txtSHOW
INQUIRY NO:      txtINQU
USER:            txtUSER

Then you can dynamically reference the control in the loop by Me.Controls(ctrlName)

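For Each key In keywords
    Dim line() As String = Filter(lines, key)
    If line.Length = 0 Then Continue For           'keyword not in the file
    Dim tag As String = line(0).Substring(InStr(line(0), key) + key.Length - 1)
    Dim endtag As Integer = InStr(tag, "  ") - 1   'first run of two blanks ends the data
    If endtag < 0 Then endtag = tag.Length         'none found: data runs to the end of the line
    tag = tag.Substring(0, endtag)
    'Controls(name) returns Nothing when no direct child of the form has that name
    Dim tbx As TextBox = TryCast(Me.Controls("txt" & key.Substring(0, 4)), TextBox)
    If tbx IsNot Nothing Then tbx.Text = tag
Next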
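
One thing to watch with this lookup: the Controls indexer only sees controls placed directly on the form. If a textbox lives inside a Panel or GroupBox, Controls.Find with searchAllChildren set to True can locate it. The sketch below builds a throwaway form in code to show the difference. It assumes a project that references Windows Forms, and the panel nesting is made up for illustration.

```vbnet
Imports System
Imports System.Windows.Forms

Module ControlLookupDemo
    Sub Main()
        ' Build a throwaway form in code; in a real project the designer does this
        Dim frm As New Form()
        Dim pnl As New Panel()
        Dim txtDate As New TextBox() With {.Name = "txtDATE"}
        pnl.Controls.Add(txtDate)            ' nested one level below the form
        frm.Controls.Add(pnl)

        Dim key As String = "DATE: "
        Dim ctrlName As String = "txt" & key.Substring(0, 4)

        ' The indexer only checks direct children, so nothing is found here
        Console.WriteLine(frm.Controls(ctrlName) Is Nothing)

        ' Find with searchAllChildren:=True walks nested containers as well
        Dim found() As Control = frm.Controls.Find(ctrlName, True)
        If found.Length > 0 Then
            DirectCast(found(0), TextBox).Text = "May 21, 2015"
        End If
        Console.WriteLine(txtDate.Text)
    End Sub
End Module
```

The keywords array also needs an entry for every textbox that should be filled, so "USER: " and "INQUIRY NO: " would have to be added to it for txtUSER and txtINQU to receive a value.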

Thanks for the help Reverend Jim. Still getting a few hiccups in getting all the textboxes to populate but you have put me on the right track.
