I would like to create a spell-check facility. I have a huge flat file of words, and the file has no CR/LF in it. How should I do this? I think I want to read the file, say, 500 characters at a time, and store the data in a variable, appending each 500 characters to what I have already got. Should the target variable be a String? Will that be big enough? What else is there: isn't there a 'huge' or 'memo' type, or something like that?

Bad idea. Your best bet is to work with the file line by line. If it has no CR/LF or newline, then how are the words separated, by spaces? If so, I would make a mini-program that adds a vbNewLine after every word and re-saves the file, one word per line. That just makes it easier to work with in the spell-check program. If you want to deal with the file without newlines, then I suggest reading from it one character at a time. Check whether the character is a space (or whatever the words are delimited by); when it is, the characters read so far make up the first word, and so on.
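
The mini-program suggested above could be sketched roughly as follows. This is only an illustration, not anyone's actual program: ConvertWordFile, srcPath, dstPath and the 4096-character chunk size are made-up names and values, and it assumes an ANSI text file whose words are separated by spaces. It reads the file in chunks rather than one character at a time, and carries a word that straddles a chunk boundary over to the next chunk.

```vb
' Sketch only: ConvertWordFile, srcPath and dstPath are hypothetical names.
' Assumes an ANSI text file whose words are separated by spaces.
Sub ConvertWordFile(ByVal srcPath As String, ByVal dstPath As String)
    Const CHUNK_SIZE As Long = 4096
    Dim fIn As Integer, fOut As Integer
    Dim remaining As Long, buf As String
    Dim carry As String, parts() As String, i As Long

    fIn = FreeFile
    Open srcPath For Binary Access Read As #fIn
    fOut = FreeFile
    Open dstPath For Output As #fOut

    remaining = LOF(fIn)
    Do While remaining > 0
        ' Size the buffer first; in Binary mode Get fills exactly Len(buf) characters.
        If remaining < CHUNK_SIZE Then
            buf = Space$(remaining)
        Else
            buf = Space$(CHUNK_SIZE)
        End If
        Get #fIn, , buf
        remaining = remaining - Len(buf)

        ' Prepend the partial word left over from the previous chunk.
        parts = Split(carry & buf, " ")
        ' Every element except the last is a complete word.
        For i = 0 To UBound(parts) - 1
            If Len(parts(i)) > 0 Then Print #fOut, parts(i)
        Next i
        ' The last element may be cut off mid-word, so hold it back.
        carry = parts(UBound(parts))
    Loop

    ' Whatever is left is the final word of the file.
    If Len(carry) > 0 Then Print #fOut, carry

    Close #fOut
    Close #fIn
End Sub
```

Calling ConvertWordFile with the path of the original file and the path of a new file should leave one word per line in the new file, ready for Line Input.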

Thanks for the quick reply. Yes, that makes sense. But how can I read the initial file to put a newline between each word? This assumes that I can read the file bit by bit. But if I can do this to create a newline-delimited file, then I could just read in the initial file. Yes, the words are separated by spaces. If I do read the new file line by line (word by word), what should I do with the results? Make it a huge string, or some other variable type?

No, no: test the word you are checking directly against the file. I mean, it depends entirely on how big the file is. A .txt file would need an absurd number of words to be too big to load into a variable. You have to take into consideration the type of machine that this will be run on. My system is extremely well equipped, and I wouldn't worry about loading all or most of the file into an array and looping through it. However, if it will be run on a machine that is below average, or even average, loading the whole file into a variable would be really slow, and would probably slow the machine down a whole lot. It may even make the program appear to be frozen. These are all things you need to take into consideration.

I would build the mini-program, just based on efficiency and layout. The code in the spell-check program would be a lot cleaner with each entry in the file on its own line. Either way, you are going to have to test each word that the user has typed against each word in the file. If you want to load the entire file into memory, then I would do this (assuming the file is one long line):

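Dim f As Integer, tmp As String, words() As String
f = FreeFile                       ' next unused file number
Open xfile For Input As #f
     Line Input #f, tmp            ' the file is one long line, so this reads all of it
     words = Split(tmp, " ")       ' one array element per word
Close #f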

And that will make an array (called words) that contains all the words in the file....
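
Once the words array exists, a plain loop over it works, but one possible way to make repeated lookups faster is to copy the words into a Scripting.Dictionary (from the Microsoft Scripting Runtime, created late bound here). This is a sketch under that assumption; BuildIndex and IsKnownWord are hypothetical helper names, and treating lookups as case-insensitive is a choice made for the example.

```vb
' Sketch only: BuildIndex and IsKnownWord are hypothetical helper names.
' Uses the Scripting.Dictionary object (Microsoft Scripting Runtime), late bound.
Function BuildIndex(words() As String) As Object
    Dim dict As Object, i As Long
    Set dict = CreateObject("Scripting.Dictionary")
    ' Must be set while the dictionary is still empty.
    dict.CompareMode = vbTextCompare       ' ignore case: "Apple" matches "apple"
    For i = LBound(words) To UBound(words)
        ' Assigning to a key adds it if missing, so duplicates collapse.
        If Len(words(i)) > 0 Then dict(words(i)) = True
    Next i
    Set BuildIndex = dict
End Function

Function IsKnownWord(ByVal dict As Object, ByVal word As String) As Boolean
    IsKnownWord = dict.Exists(word)
End Function
```

The index would be built once after the Split, for example with Set dict = BuildIndex(words), and each typed word then checked with IsKnownWord(dict, someWord).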

Let's face it, you're going to have your hands full just dealing with the spellchecker. You are going to have to see if the word is in the dictionary file already; if it's not, you can assume it's spelled wrong. In that case, you then have to find the word that has the most similar spelling (starts with the same letters, etc.) and has most of the same characters. This is not an easy task, and the last thing you want to have to do is try to remember string positions (where in the file was the space to the last word I was at?) and deal with the mess of parsing apart a huge file. I've attached a program that will take the word list you have and put it into a file, one word per line. It's fully commented, so you can fiddle with it if you want to.
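
For the "most similar spelling" step, one common measure (not necessarily what the poster had in mind) is the Levenshtein edit distance: the number of single-character insertions, deletions and substitutions needed to turn one word into another. A minimal sketch follows; EditDistance is a hypothetical helper name, and the comparison is case-sensitive as written.

```vb
' Sketch only: EditDistance is a hypothetical helper name.
' Returns the Levenshtein distance between two strings.
Function EditDistance(ByVal a As String, ByVal b As String) As Long
    Dim d() As Long, i As Long, j As Long, cost As Long, best As Long
    ReDim d(0 To Len(a), 0 To Len(b))

    ' Turning a prefix into the empty string takes one deletion per character.
    For i = 0 To Len(a)
        d(i, 0) = i
    Next i
    For j = 0 To Len(b)
        d(0, j) = j
    Next j

    For i = 1 To Len(a)
        For j = 1 To Len(b)
            If Mid$(a, i, 1) = Mid$(b, j, 1) Then cost = 0 Else cost = 1
            best = d(i - 1, j) + 1                                  ' deletion
            If d(i, j - 1) + 1 < best Then best = d(i, j - 1) + 1   ' insertion
            If d(i - 1, j - 1) + cost < best Then best = d(i - 1, j - 1) + cost   ' substitution
            d(i, j) = best
        Next j
    Next i

    EditDistance = d(Len(a), Len(b))
End Function
```

A suggestion list could then be built by keeping the dictionary words with the smallest distance to the misspelled word, although running this against every word in a large list will be slow.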

Thank you for your detailed response. I didn't think I could just grab the entire file in one statement. If I can do that, I could just do the following...
bolFound = InStr(" " & strAllWords & " ", " " & strThisWord & " ") > 0   ' pad both ends so the first and last words can match too

I would be happy with not being able to find words that are close.

Right, I have a hard time thinking that loading the entire file into memory is a good idea. At least if each word is on its own line, you can say:

Open xfile For Input As #1
Do Until EOF(1)
     Line Input #1, tmpvar
     If tmpvar = wordinquestion Then
         ' Word is good (it is in the dictionary)
         Exit Do   ' no need to read the rest of the file
     End If
Loop
Close #1

Let me know what you come up with though.... it will be interesting to see how it turns out.

I agree. Not sure what else I can do, but I'll work on it. I'll tell you if I come up with anything.

Regards
Robert
