Join Date: Sep 2005
Posts: 20
Reputation: robertlees is an unknown quantity at this point
Solved Threads: 0
robertlees
Newbie Poster

Huge flat file

 
  #1
Apr 26th, 2006
I would like to create a spell-check facility. I have a huge flat file of words with no CR/LF line breaks. How should I do this? I think I want to read the file, say, 500 characters at a time, appending each 500-character chunk to a variable that holds what I have read so far. Should the target variable be a String? Will that be big enough? Or is there a 'huge' or 'memo' type, or something like that?
Join Date: Dec 2004
Posts: 2,413
Reputation: Comatose is a jewel in the rough
Solved Threads: 211
Team Colleague
Comatose
Taboo Programmer

Re: Huge flat file

  #2
Apr 26th, 2006
Bad idea. Your best bet is to work with the file line by line. If it has no CR/LF or newline, how are the words separated? By spaces? If so, I would write a mini-program that adds a vbNewLine after every word and re-saves the file, one word per line. That just makes it easier to work with in the spell-check program. If you would rather deal with the file as it is, with no newlines, then I suggest reading it one character at a time: check whether each character is a space (or whatever delimits the words), and when you hit one, everything read up to that point is the first word, and so on.
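As a rough illustration of the character-by-character approach, here is a minimal VB6 sketch that copies a space-delimited word file into a new file with one word per line. The procedure name SplitWordsToLines and its parameter names are made up for this example, and it assumes the source file is plain ANSI text with single-byte characters and no CR/LF anywhere in it.

```vb
' Sketch only: SplitWordsToLines and its parameter names are hypothetical.
' Assumes the source file is ANSI text whose words are separated by spaces.
Public Sub SplitWordsToLines(ByVal strSource As String, ByVal strTarget As String)
    Dim intIn As Integer
    Dim intOut As Integer
    Dim strChar As String
    Dim strWord As String

    intIn = FreeFile
    Open strSource For Input As #intIn
    intOut = FreeFile
    Open strTarget For Output As #intOut

    Do Until EOF(intIn)
        strChar = Input(1, #intIn)      ' read exactly one character
        If strChar = " " Then
            ' A space ends the current word; repeated spaces yield no empty lines.
            If Len(strWord) > 0 Then Print #intOut, strWord
            strWord = ""
        Else
            strWord = strWord & strChar
        End If
    Loop

    ' Write the final word, which has no space after it.
    If Len(strWord) > 0 Then Print #intOut, strWord

    Close #intOut
    Close #intIn
End Sub
```

Reading one character per call is simple but slow on a large file; it is shown here only because it mirrors the description above.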
robertlees

Re: Huge flat file

  #3
Apr 26th, 2006
Thanks for the quick reply. Yes, that makes sense. But how can I read the initial file in order to put a newline between each word? That assumes I can read the file bit by bit. And if I can do that to create a newline-delimited file, then I could just as well read the initial file directly. Yes, the words are separated by spaces. If I do read the new file word by word, what should I do with the results? Build one huge string, or use some other variable type?
Comatose

Re: Huge flat file

  #4
Apr 26th, 2006
No, no. Test the word you are checking as you read through the file, rather than collecting everything first. It depends entirely on how big the file is: a .txt file would need an absurd number of words to be too big to load into a variable. You also have to consider the type of machine this will run on. My system is extremely well equipped, and I wouldn't worry about loading all or most of the file into an array and looping through it. On a less-than-average, or even average, machine, however, loading the whole file into a variable could be really slow and would probably slow the machine down a lot. It may even make the program appear to be frozen. These are all things you need to take into consideration.

I would build the mini-program, purely for efficiency and layout. The code in the spell-check program would be a lot cleaner with each entry in the file on its own line. Either way, you are going to have to test each word the user has typed against each word in the file. If you want to load the entire file into memory, then I would do this (assuming the file is one long line):
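Visual Basic 4 / 5 / 6 Syntax
    ' xfile holds the path to the word list (assumed to be one long line)
    Dim tmp As String
    Dim words() As String
    Open xfile For Input As #1
    Line Input #1, tmp           ' reads the whole file, since it has no newline
    words = Split(tmp, " ")      ' Split needs VB6; one array element per word
    Close #1
And that will make an array (called words) that contains all the words in the file.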
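Once the words are in an array, checking a typed word is a loop over that array. The following is a small sketch of one way to do it; the function name IsInWordList is hypothetical, and the case-insensitive comparison is an assumption about what a spell checker would want.

```vb
' Sketch only: IsInWordList is a hypothetical helper name.
' Does a linear search of an array built with Split.
Public Function IsInWordList(ByRef words() As String, ByVal strWord As String) As Boolean
    Dim i As Long

    For i = LBound(words) To UBound(words)
        ' vbTextCompare makes the comparison case-insensitive.
        If StrComp(words(i), strWord, vbTextCompare) = 0 Then
            IsInWordList = True
            Exit Function
        End If
    Next i
    ' Falls through with the default return value, False.
End Function
```

A linear search touches every entry in the worst case, so on a very large list it may be worth sorting the words and using a binary search instead.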

Let's face it, you're going to have your hands full just dealing with the spell checker. You will have to see whether the word is already in the dictionary file; if it's not, you can assume it's spelled wrong. Then you have to find the word with the most similar spelling (starts with the same letters, has most of the same characters, etc.). This is not an easy task, and the last thing you want is to have to remember string positions (where in the file was the space after the last word I was at?) and deal with the mess of parsing apart a huge file. I've attached a program that takes the word list you have and puts it into a file, one word per line. It's fully commented, so you can fiddle with it if you want to.
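One common way to score "most similar spelling" is the Levenshtein edit distance, which counts the single-character insertions, deletions, and substitutions needed to turn one word into another. The thread does not name an algorithm, so this is only an illustrative sketch; EditDistance is a hypothetical function name, and the comparison is case-sensitive under the default Option Compare Binary.

```vb
' Sketch only: EditDistance is a hypothetical helper name.
' Returns the Levenshtein distance between s and t (smaller = more similar).
Public Function EditDistance(ByVal s As String, ByVal t As String) As Long
    Dim d() As Long
    Dim i As Long, j As Long
    Dim cost As Long
    Dim best As Long

    ReDim d(0 To Len(s), 0 To Len(t))

    ' Turning a prefix into the empty string takes one edit per character.
    For i = 0 To Len(s)
        d(i, 0) = i
    Next i
    For j = 0 To Len(t)
        d(0, j) = j
    Next j

    For i = 1 To Len(s)
        For j = 1 To Len(t)
            If Mid$(s, i, 1) = Mid$(t, j, 1) Then cost = 0 Else cost = 1
            best = d(i - 1, j) + 1                                    ' deletion
            If d(i, j - 1) + 1 < best Then best = d(i, j - 1) + 1     ' insertion
            If d(i - 1, j - 1) + cost < best Then best = d(i - 1, j - 1) + cost  ' substitution
            d(i, j) = best
        Next j
    Next i

    EditDistance = d(Len(s), Len(t))
End Function
```

For example, EditDistance("kitten", "sitting") returns 3. A suggestion feature could compute this against each dictionary word and offer the ones with the lowest score, though doing so for every word in a huge list will be slow.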
Last edited by Comatose; Apr 26th, 2006 at 9:34 pm.
Attached file: string2flat.zip (2.0 KB)
robertlees

Re: Huge flat file

  #5
Apr 26th, 2006
Thank you for your detailed response. I didn't think I could just grab the entire file in one statement. If I can do that, I could just do the following...
bolFound = (InStr(strAllWords, " " & strThisWord & " ") > 0)

I would be happy without being able to suggest words that are close.
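A fuller sketch of this whole-file InStr idea follows. A VB6 variable-length String can hold roughly two billion characters, so the String type itself should not be the limit; available memory is the more likely constraint. The names LoadWordList and IsKnownWord are hypothetical, and the code assumes an ANSI file with words separated by single spaces. The loaded text is padded with a space at each end, because otherwise the first and last words in the file would never match a search for " word ".

```vb
' Sketch only: LoadWordList and IsKnownWord are hypothetical names.
' Assumes an ANSI text file whose words are separated by single spaces.
Public Function LoadWordList(ByVal strPath As String) As String
    Dim intFile As Integer
    Dim strAll As String

    intFile = FreeFile
    Open strPath For Binary Access Read As #intFile
    strAll = Space$(LOF(intFile))   ' size the buffer to the file length
    Get #intFile, , strAll          ' read the entire file in one call
    Close #intFile

    ' Pad both ends so the first and last words are also space-delimited.
    LoadWordList = " " & strAll & " "
End Function

Public Function IsKnownWord(ByRef strAllWords As String, ByVal strThisWord As String) As Boolean
    ' ByRef avoids copying the large string on every call.
    ' InStr returns a position (0 when not found), so compare it with 0.
    IsKnownWord = (InStr(1, strAllWords, " " & strThisWord & " ", vbTextCompare) > 0)
End Function
```

Load the list once at startup, keep the result in a module-level variable, and call IsKnownWord for each word the user types.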
Comatose

Re: Huge flat file

  #6
Apr 26th, 2006
Right, though I have a hard time thinking that loading the entire file into memory is a good idea. At least if each word is on its own line, you can say:
Visual Basic 4 / 5 / 6 Syntax
    Open xfile For Input As #1
    Do Until EOF(1)
        Line Input #1, tmpvar
        If tmpvar = wordinquestion Then
            ' Word is good (in dictionary)
            Exit Do    ' no need to read the rest of the file
        End If
    Loop
    Close #1
Let me know what you come up with, though. It will be interesting to see how it turns out.
robertlees

Re: Huge flat file

  #7
Apr 26th, 2006
I agree. Not sure what else I can do, but I'll work on it. I'll tell you if I come up with anything.

Regards
Robert

©2003 - 2009 DaniWeb® LLC