I tried using a solution to a very similar question here, but it doesn't work for large files like the one I'm using.
http://www.daniweb.com/software-development/vbnet/threads/320160/visual-basic-reading-text-file-into-array

The code from that question:

    Dim OpenAnswerFile As New OpenFileDialog
    Dim strFileName() As String '// String Array.
    Dim tempStr As String = "" '// temp String for result.
    If OpenAnswerFile.ShowDialog = DialogResult.OK Then '// check if OK was pressed.
        strFileName = IO.File.ReadAllLines(OpenAnswerFile.FileName) '// add each line as String Array.
        For Each myLine In strFileName '// loop thru Arrays.
            tempStr &= myLine & vbNewLine '// add Array and new line.
        Next
        MsgBox(tempStr) '// display result.
    End If

This works perfectly for small files but when I try to use it with a 350,000+ line dictionary the program just hangs... and hangs...

I changed
MsgBox(tempStr)
to
outputRichTxtBox.Text = tempStr

which of course made no difference at all.

I'm trying to speed up access to my dictionary file so I can run multiple searches against it quickly. Currently I read the text file directly, which works, but a single search takes 3-5 seconds because of the hard drive read speed, so if I want to do potentially hundreds of searches it needs to be a lot faster.

Edit
Dictionary file is 1 word per line .txt file
cow
cat
frog
moose
etc
etc
x350,000 more words long

All 11 Replies

Line 7 looks like it's trying to put all 350,000 words into a single string. You might be exceeding the number of characters that can be put into a single string (link). Another problem would be memory management: each time a new word is added to the string, VB.NET has to allocate a larger string, copy the original string to the new memory location, append the new text, and then discard the original string. That's a whole lot of work for 350,000 words.
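To illustrate the copying cost described above, here is a minimal sketch that builds the same text with a StringBuilder, which appends into a growing buffer instead of allocating a new string on every pass. The module and function names (JoinDemo, JoinWithBuilder) and the sample words are made up for this example; it is not code from the linked thread.

```vbnet
Imports System
Imports System.Text
Imports Microsoft.VisualBasic

Module JoinDemo
    ' Appends every line plus a line break to one StringBuilder,
    ' then converts the buffer to a String once at the end.
    Public Function JoinWithBuilder(lines As String()) As String
        Dim sb As New StringBuilder()
        For Each line As String In lines
            sb.Append(line).Append(vbNewLine)
        Next
        Return sb.ToString()
    End Function

    Sub Main()
        ' Hypothetical sample data standing in for the dictionary file.
        Dim lines As String() = {"cow", "cat", "frog", "moose"}
        Console.WriteLine(JoinWithBuilder(lines))
    End Sub
End Module
```

Even with a fast join, showing 350,000 lines in a MsgBox or text box is likely to be slow on its own, so this only addresses the concatenation part.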

You can read the entire file by

Dim allLines() As String = System.IO.File.ReadAllLines(filename)
Dim alltext As String = System.IO.File.ReadAllText(filename)

The first line reads the file into an array (one line per entry) while the second reads the entire file into one string (embedded vbCrLf), which is apparently what you were trying to do with the loop. You can get the array from the text as well by

Dim allLines() As String = alltext.Split({vbCrLf}, StringSplitOptions.None) ' String() separator, so the whole CR+LF pair is the delimiter

I suggest you take a little more care naming your variables. strFileName is highly misleading as it does not contain a file name. It contains the file contents. If you sort your dictionary file (it is not sorted based on your example) then you can use a binary search once you have read the file into the array. If you leave it unsorted then you have to do a linear search which is much slower.

commented: Excellent! Worked like a charm! +0

I'm working with 16 GB of RAM and the text file is <4 MB, so I don't think memory is the problem... overloading the string size limit could be the problem... I had not thought of that, but I would have thought I'd get an error message rather than an endless program stall.

Can you suggest another method that I can try?

I simply need the dictionary stored in memory (a 4 MB text file should not be that difficult to handle...)

I need to be able to search the dictionary for matches (which should be pretty fast if it's all in RAM)

If it would help, the dictionary text file I'm using to test handling long word lists is here:
http://www.word-play.us/files/Single_Dictionary.7z

Edit
missed Jim's post. Thank you. I will give that a try when I get home and report back on how it works out.
The example code is directly from an example here, not mine. You'd have to take up those naming conventions with the author on the linked thread :)

Can you suggest another method that I can try?

If you read the words directly into a combobox set to simple dropdown style, whatever the user types is automatically searched for and the closest matches are scrolled to. It's a simple matter of using the Datasource property.

ComboBox1.DataSource = IO.File.ReadAllLines("Single_Dictionary.TXT").OrderBy(Function(x) x).ToList()

My system only has 4GB of RAM and it loads and searches just fine. Mind you it does take a little time to load, but the searching is real time.

Reverend Jim,
Your suggestion works beautifully. What used to take me 3-5 seconds now executes instantly with this button code.

    Private Sub searchBtn_Click(ByVal sender As System.Object, ByVal e As System.EventArgs) Handles searchBtn.Click
        Dim dictionary() As String = System.IO.File.ReadAllLines(filePathTxtBox.Text)

        For Each item In dictionary
            If item = "zebra" Then '//example dictionary search for the word zebra
                outputTxtBox.Text += item
            End If
        Next
    End Sub

It doesn't even require noticeable time to load it all into memory... it's as if it's just there! :)

If you sort your dictionary file then you can use a binary search once you have read the file into the array. If you leave it unsorted then you have to do a linear search which is much slower.

Hmm... I don't know how to do a binary search. I assume the code I used above is a linear search? I could be wrong because I honestly don't know. If I can refine this any further, well, faster is better because, as I said, I will be applying hundreds or thousands of searches and it should be as fast as possible.

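One way to use BinarySearch is to convert the array to a sorted list (Array.BinarySearch also works directly on a sorted array). The list must be sorted on the whole word, with the same comparer that the search uses. It would look something like this:

Dim dictionary As List(Of String) = System.IO.File.ReadAllLines(filePathTxtBox.Text).OrderBy(Function(x) x, StringComparer.Ordinal).ToList()
Dim index As Integer = dictionary.BinarySearch("zebra", StringComparer.Ordinal)
If index >= 0 Then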
    outputTxtBox.Text = dictionary(index)
Else
    outputTxtBox.Text = "Search Failed"
End If
commented: This answered my question and is a good thing to know how to do. Thank you for the answer! +0
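As a variation on the list-based search above, a sorted array can be searched directly. The sketch below is only an illustration: the module and function names (SortedLookup, BuildSortedWords, ContainsWord) and the sample words are assumptions, while Array.Sort and Array.BinarySearch are standard .NET calls. The important detail is that the sort and the search use the same comparer.

```vbnet
Imports System

Module SortedLookup
    ' Returns a sorted copy of the words, ordered with an ordinal comparer.
    Public Function BuildSortedWords(words As String()) As String()
        Dim sorted As String() = CType(words.Clone(), String())
        Array.Sort(sorted, StringComparer.Ordinal)
        Return sorted
    End Function

    ' Binary search over the sorted array; a non-negative index means the word was found.
    Public Function ContainsWord(sortedWords As String(), word As String) As Boolean
        Return Array.BinarySearch(sortedWords, word, StringComparer.Ordinal) >= 0
    End Function

    Sub Main()
        ' Hypothetical sample data standing in for the dictionary file.
        Dim words As String() = BuildSortedWords({"moose", "cow", "zebra", "cat", "frog"})
        Console.WriteLine(ContainsWord(words, "zebra"))
        Console.WriteLine(ContainsWord(words, "yak"))
    End Sub
End Module
```

Sorting once and reusing the sorted array matters here: a binary search looks at roughly 19 entries for 350,000 words, whereas the For Each loop may look at all of them.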

Even faster, store your dictionary in an actual dictionary as follows

Public Class Form1

    Private dic As New Dictionary(Of String, String)

    Private Sub Form1_Load(sender As System.Object, e As System.EventArgs) Handles MyBase.Load
        For Each line As String In System.IO.File.ReadAllLines(filename)
            dic.Add(line, "")
        Next
    End Sub

The load time is extremely small and to check if a word is in the dictionary you just do

If dic.ContainsKey(word)

No actual searching (on your part) has to be done. One caution: when I tried this with your word list I found a duplicate word (animate), and Add throws an exception when the key is already present. A dictionary entry takes two parts, a key and a value. In this case you don't need to supply a value; you are only interested in the key. Keep in mind that you'll probably want to do

If dic.ContainsKey(word.ToLower())

because the keys are case sensitive.
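Since only the keys are used here, a HashSet(Of String) is a possible alternative to a dictionary with empty values. This is a different container from the one suggested above, shown only as a sketch: the module and function names (WordSet, BuildWordSet) and the sample words are assumptions. Passing StringComparer.OrdinalIgnoreCase makes lookups case-insensitive without calling ToLower, and duplicate words are silently ignored instead of raising an exception.

```vbnet
Imports System
Imports System.Collections.Generic

Module WordSet
    ' Builds a case-insensitive set of words; repeated words are stored once.
    Public Function BuildWordSet(words As IEnumerable(Of String)) As HashSet(Of String)
        Return New HashSet(Of String)(words, StringComparer.OrdinalIgnoreCase)
    End Function

    Sub Main()
        ' Hypothetical sample data, including a duplicate entry.
        Dim wordSet As HashSet(Of String) = BuildWordSet({"cow", "cat", "animate", "animate"})
        Console.WriteLine(wordSet.Count)
        Console.WriteLine(wordSet.Contains("CAT"))
    End Sub
End Module
```

In the form code, the set could be filled from System.IO.File.ReadAllLines in the same place the dictionary is loaded.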

commented: Answered my question AND helped fix my dictionary file! Thanks! +0

wow excellent. Not only have you answered my question perfectly but you also helped me fix my dictionary file! :)

    Private dic As New Dictionary(Of String, String)
    Private Sub searchBtn_Click(ByVal sender As System.Object, ByVal e As System.EventArgs) Handles searchBtn.Click
        Dim searchWord As String = "ThAnKs"

        If dic.Count = 0 Then 'load the file only on the first click
            For Each line As String In System.IO.File.ReadAllLines(filePathTxtBox.Text)
                dic(line.ToLower()) = "" 'ensures entries are all lowercase; the indexer does not throw on a duplicate key
            Next
        End If

        If dic.ContainsKey(searchWord.ToLower()) Then
            outputTxtBox.Text += searchWord.ToLower() + vbNewLine
        End If

    End Sub

If I also wanted to store a value with the key, how would I print that?

for example:

        For Each line As String In System.IO.File.ReadAllLines(filePathTxtBox.Text)
            dic.Add(line.ToLower, StrReverse(line.ToLower))
        Next

        If dic.ContainsValue("tac") Then
            outputTxtBox.Text += '// how to print out the associated key "cat"?
        End If

Edit: assume I did not know it was a simple string reverse. I just want to know how to access the key associated with a particular value and vice versa.

If, for example, you wanted to store a value with the key you would do something like

Dim word As String = "cow"
Dim def As String = "a four legged herbivore"

dic.Add(word,def)

Then you can display the value as

MsgBox(dic(word))

You can use different types for key and value by modifying the declaration. For example, you could have an integer key and a string array as the value by

Dim dic As New Dictionary(Of Integer, String())

You could also declare your own object (class) and use that as the value.

ok I see... Value does not have to be unique like the Key does. That was confusing me.

So, I can get the value by calling the key

MsgBox(dic(word))

but there is no way to get the key by calling the value.

You can iterate over the keys by

For Each k as String In dic.Keys

and the values by

For Each v as String In dic.Values

and, of course, you can use the ContainsKey and ContainsValue to determine the presence of a particular key or value, however, you cannot determine if a value is unique without scanning the entire collection of values.
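For the earlier question about finding the key that belongs to a value, one option is to scan the key/value pairs and collect every key whose value matches. The sketch below assumes a Dictionary(Of String, String) like the one in this thread; the module and function names (ReverseLookup, KeysForValue) are made up for the example. Because values need not be unique, it returns a list rather than a single key, and the scan visits every entry.

```vbnet
Imports System
Imports System.Collections.Generic

Module ReverseLookup
    ' Returns all keys whose value equals target (possibly an empty list).
    Public Function KeysForValue(dic As Dictionary(Of String, String), target As String) As List(Of String)
        Dim matches As New List(Of String)
        For Each pair As KeyValuePair(Of String, String) In dic
            If pair.Value = target Then
                matches.Add(pair.Key)
            End If
        Next
        Return matches
    End Function

    Sub Main()
        ' Hypothetical sample data: each word mapped to its reverse.
        Dim dic As New Dictionary(Of String, String)
        dic("cat") = "tac"
        dic("cow") = "woc"
        For Each key As String In KeysForValue(dic, "tac")
            Console.WriteLine(key)
        Next
    End Sub
End Module
```

If reverse lookups are frequent, a second dictionary keyed by the value would avoid the scan, at the cost of extra memory and of deciding what to do when two keys share a value.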
