Apr 26th, 2006

Huge flat file

I would like to create a spell-check facility. I have a huge flat file of words with no CR/LF in it. How should I read it? My plan is to read the file about 500 characters at a time and append each chunk to a variable that accumulates everything read so far. Should the target variable be a String? Will that be big enough? Or is there a 'huge' or 'memo' type, or something like that?
Reputation Points: 10
Solved Threads: 0
Newbie Poster
robertlees
20 posts
since Sep 2005
Apr 26th, 2006

Re: Huge flat file

Bad idea. Your best bet is to work with the file line by line. If it has no CR/LF or newline, how are the words separated? By spaces? If so, I would write a mini-program that adds a vbNewLine after every word and re-saves the file, one word per line. That just makes the file easier to work with in the spell-check program. If you would rather deal with it without newlines, I suggest reading the file one character at a time: check whether each character is a space (or whatever delimits the words), and when you hit one, everything read up to that point is the first word, and so on.
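
A minimal sketch of such a mini-program, combining both suggestions: it reads the source file one character at a time and writes each space-delimited word on its own line. The procedure name and both parameters are made up for illustration, and it assumes the words are separated by spaces only.

```vb
' Hypothetical helper: copies a space-delimited word list to a new file,
' one word per line. Both paths are supplied by the caller.
Public Sub SplitWordsToLines(ByVal srcPath As String, ByVal dstPath As String)
    Dim inFile As Integer, outFile As Integer
    Dim ch As String, word As String

    inFile = FreeFile
    Open srcPath For Input As #inFile
    outFile = FreeFile
    Open dstPath For Output As #outFile

    Do Until EOF(inFile)
        ch = Input$(1, #inFile)              ' read a single character
        If ch = " " Then
            ' A space ends the current word; skip empty runs of spaces
            If Len(word) > 0 Then Print #outFile, word
            word = ""
        Else
            word = word & ch
        End If
    Loop
    If Len(word) > 0 Then Print #outFile, word   ' flush the final word

    Close #outFile
    Close #inFile
End Sub
```

Reading one character per call is simple but slow on a very large file, which is fine for a one-off conversion.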
Team Colleague
Reputation Points: 361
Solved Threads: 214
Taboo Programmer
Comatose
2,413 posts
since Dec 2004
Apr 26th, 2006

Re: Huge flat file

Thanks for the quick reply. Yes, that makes sense. But how do I read the initial file in order to put a newline between each word? That assumes I can read the file bit by bit, and if I can do that to create a newline-delimited file, I could just read the initial file directly. Yes, the words are separated by spaces. If I do read the new file word by word, what should I do with the results? Build one huge string, or use some other variable type?
robertlees
Apr 26th, 2006

Re: Huge flat file

No, no. Test the word you are checking against the entries as you read them from the file. How much you can load at once depends entirely on how big the file is; a .txt file would need an absurd number of words to be too big to load into a variable. You also have to consider the type of machine this will run on. My system is extremely well equipped, and I wouldn't worry about loading all or most of the file into an array and looping through it. On an average or below-average machine, however, loading the whole file into a variable could be really slow and would probably slow the whole machine down. It may even make the program appear frozen. These are all things you need to take into consideration.

I would build the mini-program anyway, for efficiency and layout. The spell-check code would be a lot cleaner with each entry in the file on its own line. Either way, you are going to have to test each word the user has typed against each word in the file. If you want to load the entire file into memory, I would do this (assuming the file is one long line):
    Dim tmp As String
    Dim words() As String
    Open xfile For Input As #1      ' xfile holds the path to the word list
    Line Input #1, tmp              ' no newlines, so this reads the whole file
    Close #1
    words = Split(tmp, " ")         ' one array element per word
And that will make an array (called words) that contains all the words in the file....
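
Once the words array exists, checking a typed word is a loop over it. A minimal sketch, assuming a case-insensitive match is wanted; the function name is hypothetical and not from the thread:

```vb
' Hypothetical helper: True when target matches any element of words(),
' ignoring case.
Public Function IsKnownWord(words() As String, ByVal target As String) As Boolean
    Dim i As Long
    For i = LBound(words) To UBound(words)
        If StrComp(words(i), target, vbTextCompare) = 0 Then
            IsKnownWord = True
            Exit Function               ' stop at the first match
        End If
    Next i
    ' Falls through with the default return value, False
End Function
```

A linear scan like this touches every entry for a misspelled word, so on a large list it may be worth sorting the file once and searching it more cleverly.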

Let's face it, you're going to have your hands full just dealing with the spellchecker. First you have to see whether the word is already in the dictionary file; if it's not, you can assume it's spelled wrong. Then you have to find the dictionary word with the most similar spelling (one that starts with the same letters, has most of the same characters, and so on). This is not an easy task, and the last thing you want is to have to remember string positions (where in the file was the space after the last word I read?) and deal with the mess of parsing apart a huge file. I've attached a program that will take the word list you have and write it to a file, one word per line. It's fully commented, so you can fiddle with it if you want to.
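
For the "most similar spelling" step, one common measure (swapped in here as an illustration, not something proposed in the thread) is the Levenshtein edit distance: the number of single-character insertions, deletions, and substitutions needed to turn one word into another. A sketch, with a hypothetical function name:

```vb
' Hypothetical helper: Levenshtein edit distance between a and b.
' A smaller result means the two spellings are closer.
Public Function EditDistance(ByVal a As String, ByVal b As String) As Long
    Dim d() As Long
    Dim i As Long, j As Long, cost As Long, best As Long

    ReDim d(0 To Len(a), 0 To Len(b))
    For i = 0 To Len(a): d(i, 0) = i: Next i    ' delete i characters
    For j = 0 To Len(b): d(0, j) = j: Next j    ' insert j characters

    For i = 1 To Len(a)
        For j = 1 To Len(b)
            If Mid$(a, i, 1) = Mid$(b, j, 1) Then cost = 0 Else cost = 1
            best = d(i - 1, j) + 1                                      ' deletion
            If d(i, j - 1) + 1 < best Then best = d(i, j - 1) + 1       ' insertion
            If d(i - 1, j - 1) + cost < best Then best = d(i - 1, j - 1) + cost ' substitution
            d(i, j) = best
        Next j
    Next i

    EditDistance = d(Len(a), Len(b))
End Function
```

A suggestion feature could compute this against each dictionary word and offer the entries with the lowest distance, though doing so for every word in a huge list is slow.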
Attached file: string2flat.zip (2.0 KB)
Last edited by Comatose; Apr 26th, 2006 at 9:34 pm.
Comatose
Apr 26th, 2006

Re: Huge flat file

Thank you for your detailed response. I didn't think I could just grab the entire file in one statement. If I can do that, I could just do the following:
    bolFound = InStr(strAllWords, " " & strThisWord & " ") > 0

I would be happy to do without suggestions for words that are close.
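
A sketch of that whole-file approach, under a few assumptions: the helper names and parameters are hypothetical, the file is plain ANSI text with words separated by single spaces, and the machine has enough memory for it. As far as I know, a VB6 variable-length String can hold roughly 2 billion characters, so memory rather than the type is the practical limit. The list is padded with a space at each end so the first and last words can be found too.

```vb
' Hypothetical helper: reads the whole word list into one String and pads it
' with a space at each end so the first and last words can be matched.
Public Function LoadWordList(ByVal filePath As String) As String
    Dim f As Integer
    Dim buf As String

    f = FreeFile
    Open filePath For Binary Access Read As #f
    buf = Space$(LOF(f))        ' size the buffer to the file length
    Get #f, , buf               ' fill the buffer in a single read
    Close #f

    LoadWordList = " " & buf & " "
End Function

' Hypothetical helper: True when word appears as a whole, space-delimited
' entry in the padded list. ByRef avoids copying the large string per call.
Public Function ContainsWord(ByRef paddedWords As String, ByVal word As String) As Boolean
    ContainsWord = InStr(1, paddedWords, " " & word & " ", vbTextCompare) > 0
End Function
```

Load the list once at startup, for example with strAllWords = LoadWordList(xfile), and then call ContainsWord(strAllWords, strThisWord) for each word. The vbTextCompare argument makes the match case-insensitive at some cost in speed.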
robertlees
Apr 26th, 2006

Re: Huge flat file

Right, but I have a hard time believing that loading the entire file into memory is a good idea. At least if each word is on its own line, you can say:
    Dim found As Boolean
    Open xfile For Input As #1
    Do Until EOF(1)
        Line Input #1, tmpvar
        If tmpvar = wordinquestion Then
            found = True    ' word is good (in dictionary)
            Exit Do         ' no need to read any further
        End If
    Loop
    Close #1
Let me know what you come up with though.... it will be interesting to see how it turns out.
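
Putting the line-by-line check together with the earlier point that every word the user typed has to be tested, a rough sketch might look like this. Both function names are hypothetical, the dictionary file is assumed to hold one word per line, and punctuation in the user's text is not handled.

```vb
' Hypothetical helper: scans a one-word-per-line file for target, ignoring case.
Public Function WordInFile(ByVal dictPath As String, ByVal target As String) As Boolean
    Dim f As Integer, entry As String

    f = FreeFile
    Open dictPath For Input As #f
    Do Until EOF(f)
        Line Input #f, entry
        If StrComp(entry, target, vbTextCompare) = 0 Then
            WordInFile = True
            Exit Do                     ' found it; stop reading
        End If
    Loop
    Close #f
End Function

' Hypothetical helper: returns the words of userText that are not in the
' dictionary file, separated by single spaces.
Public Function UnknownWords(ByVal dictPath As String, ByVal userText As String) As String
    Dim parts() As String
    Dim i As Long
    Dim result As String

    parts = Split(userText, " ")
    For i = LBound(parts) To UBound(parts)
        If Len(parts(i)) > 0 Then       ' skip gaps left by repeated spaces
            If Not WordInFile(dictPath, parts(i)) Then
                result = result & parts(i) & " "
            End If
        End If
    Next i
    UnknownWords = Trim$(result)
End Function
```

This keeps memory use low but reopens and rescans the file for every word, so it trades speed for memory; which side matters more depends on the machine, as discussed above.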
Comatose
Apr 26th, 2006

Re: Huge flat file

I agree. I'm not sure what else I can do, but I'll work on it and let you know if I come up with anything.

Regards
Robert
robertlees

© 2011 DaniWeb® LLC