Huge flat file
Join Date: Sep 2005
Posts: 20
I would like to create a spell check facility. I have a huge flat file of words. This file has no CR/LF. How should I do this? I think I want to read this file, say, 500 characters at a time. I'll store this data in a variable, appending each 500 characters to what I have already got. Should the target variable be a String? Will that be big enough? Or isn't there a 'huge' or 'memo' type, or something like that?
Bad idea. Your best bet is to work with the file line by line. If it has no CR/LF or newline, then how are the words separated? By spaces? If so, I would make a mini-program to add a vbNewLine after every word and re-save the file, one word per line. It just makes it easier to work with in the spell check program. If you want to deal with it with no newlines, then I suggest reading from the file one character at a time. Check whether the character is a space (or whatever the words are delimited by); when it is, the characters read so far make up the first word, and so on for the rest.
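As a rough illustration of the mini-program idea above, here is a minimal sketch that reads the file one character at a time and writes one word per line. The names `SplitWordsToLines`, `inPath` and `outPath` are made up for this example, and it assumes the words are separated by plain spaces in an ordinary ANSI text file.

```vb
' Hypothetical helper (not from the thread): copies a space-delimited
' word file to a new file with one word per line.
Public Sub SplitWordsToLines(ByVal inPath As String, ByVal outPath As String)
    Dim inNum As Integer, outNum As Integer
    Dim ch As String, word As String

    inNum = FreeFile
    Open inPath For Input As #inNum
    outNum = FreeFile
    Open outPath For Output As #outNum

    Do Until EOF(inNum)
        ch = Input(1, #inNum)               ' read exactly one character
        If ch = " " Then
            ' A space ends the current word; skip empty words
            ' caused by consecutive spaces.
            If Len(word) > 0 Then Print #outNum, word
            word = ""
        Else
            word = word & ch
        End If
    Loop
    ' Write the last word, which has no trailing space.
    If Len(word) > 0 Then Print #outNum, word

    Close #outNum
    Close #inNum
End Sub
```

You would call it once, for example `SplitWordsToLines "words.txt", "words_by_line.txt"` (both file names are placeholders), and then point the spell checker at the new file. Reading a character at a time is slow on a very large file, but it only has to run once.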
Thanks for the quick reply. Yes, that makes sense. But how can I read the initial file to put a newline between each word? This assumes that I can read the file bit by bit. But if I can do this to create a newline-delimited file, then I could just as well read in the initial file. Yes, the words are separated by spaces. If I do read the new file line by line (word by word), what should I do with the results? Make it a huge string, or some other variable type?
No, no. Do your test for whatever word you are checking directly against the file. It depends entirely on how big the file is: a .txt file would need an absurd number of words to be too big to load into a variable. You also have to take into consideration the type of machine this will run on. My system is extremely well equipped, and I wouldn't worry about loading all or most of the file into an array and looping through it. However, on a machine that's average or below average, loading the whole file into a variable could be really slow, and would probably slow the machine down a whole lot. It may even make the program appear to be frozen. These are all things you need to take into consideration.
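Since the advice above hinges on how big the file is, one way to make that decision in code is to check the size first with `FileLen`. This is only a sketch: the function name `SmallEnoughToLoad` and the 5,000,000-byte default are arbitrary choices for illustration, not figures from this thread.

```vb
' Hypothetical helper: True if the file is no larger than a limit we pick
' ourselves. The default limit is an arbitrary assumption; tune it for the
' machines the program will actually run on.
Public Function SmallEnoughToLoad(ByVal path As String, _
        Optional ByVal maxBytes As Long = 5000000) As Boolean
    ' FileLen returns the size of the file in bytes without opening it.
    SmallEnoughToLoad = (FileLen(path) <= maxBytes)
End Function
```

If it returns True you could load the file into an array in one go; otherwise fall back to checking against the file line by line.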
I would build the mini-program, just based on efficiency and layout. The code in the spell check program would be a lot cleaner with each entry in the file on its own line. Either way, you are going to have to test each word that the user has typed against each word in the file. If you want to load the entire file into memory, then I would do this (assuming the file is one long line):

```vb
Dim tmp As String
Dim words() As String

Open xfile For Input As #1
Line Input #1, tmp          ' no CR/LF in the file, so this reads all of it
words = Split(tmp, " ")     ' one array element per word
Close #1
```

And that will make an array (called words) that contains all the words in the file.

Let's face it, you're going to have your hands full just dealing with the spellchecker. You are going to have to see if the word is in the dictionary file already; if it's not, you can assume it's spelled wrong. Then you have to find the word that has the most similar spelling (starts with the same letters, etc.) and has most of the same characters. This is not an easy task, and the last thing you want to have to do is try to remember string positions (where in the file was the space after the last word I was at?) and deal with the mess of parsing apart a huge file. I've attached a program that will take the word list you have and put it into a file, one word per line. It's fully commented, so you can fiddle with it if you want to.
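For the "most similar spelling" step described above, one common technique (swapped in here for illustration, not something proposed in this thread) is Levenshtein edit distance: count the single-character insertions, deletions and substitutions needed to turn one word into another, and suggest the dictionary word with the smallest count. The names `EditDistance`, `Min3` and `ClosestWord` are hypothetical, and the sketch assumes the dictionary is already in a String array.

```vb
' Hypothetical helper: Levenshtein edit distance between two strings.
Public Function EditDistance(ByVal a As String, ByVal b As String) As Long
    Dim i As Long, j As Long, cost As Long
    Dim d() As Long

    ' d(i, j) = distance between the first i chars of a and first j of b.
    ReDim d(0 To Len(a), 0 To Len(b))
    For i = 0 To Len(a)
        d(i, 0) = i
    Next i
    For j = 0 To Len(b)
        d(0, j) = j
    Next j

    For i = 1 To Len(a)
        For j = 1 To Len(b)
            If Mid$(a, i, 1) = Mid$(b, j, 1) Then cost = 0 Else cost = 1
            ' Cheapest of: deletion, insertion, substitution.
            d(i, j) = Min3(d(i - 1, j) + 1, d(i, j - 1) + 1, _
                           d(i - 1, j - 1) + cost)
        Next j
    Next i
    EditDistance = d(Len(a), Len(b))
End Function

Private Function Min3(ByVal x As Long, ByVal y As Long, ByVal z As Long) As Long
    Min3 = x
    If y < Min3 Then Min3 = y
    If z < Min3 Then Min3 = z
End Function

' Hypothetical helper: returns the dictionary word with the smallest edit
' distance to what the user typed (the word itself if it is in the list).
Public Function ClosestWord(ByVal typed As String, words() As String) As String
    Dim i As Long, dist As Long, best As Long

    best = -1
    For i = LBound(words) To UBound(words)
        dist = EditDistance(LCase$(typed), LCase$(words(i)))
        If best = -1 Or dist < best Then
            best = dist
            ClosestWord = words(i)
        End If
        If dist = 0 Then Exit For           ' exact match, stop looking
    Next i
End Function
```

This compares the typed word against every entry, so it gets slow on a big dictionary. Restricting the search to words that start with the same letter, as suggested above, would be one way to cut it down.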
Last edited by Comatose; Apr 26th, 2006 at 9:34 pm.
Right, I have a hard time thinking that loading the entire file into memory is a good idea. At least if each word is on its own line, you can say:

```vb
Open xfile For Input As #1
Do Until EOF(1)
    Line Input #1, tmpvar
    If tmpvar = wordinquestion Then
        ' Word is good (in dictionary)
        Exit Do             ' no need to read the rest of the file
    End If
Loop
Close #1
```

Let me know what you come up with, though. It will be interesting to see how it turns out.