>>I managed to read the whole file in one string
No, you didn't. fgets() will not read the entire file into a single character array, because it stops reading at the first '\n' (the character that terminates a line) or when the buffer is full, whichever comes first. So all your program will do is read the first line of the file, which is the first word if the file has one word per line.
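If you really do want the whole file in one string, fread() is the usual tool rather than fgets(). Here is a minimal sketch of that idea, assuming a 32-bit or 64-bit compiler and a file small enough to fit in memory. The name read_whole_file is made up for illustration, and measuring the size with fseek()/ftell() on a binary stream works on common platforms but is not strictly guaranteed by the C standard.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <assert.h>

/* Hypothetical helper (not from the original thread): returns a malloc'd,
   NUL-terminated copy of the file contents, or NULL on failure.
   The caller must free() the result. */
char *read_whole_file(const char *path, size_t *out_len)
{
    FILE *fp;
    long size;
    char *buf;
    size_t got;

    assert(path != NULL);

    fp = fopen(path, "rb");
    if (fp == NULL)
        return NULL;

    /* Find the file size by seeking to the end. */
    if (fseek(fp, 0L, SEEK_END) != 0 || (size = ftell(fp)) < 0) {
        fclose(fp);
        return NULL;
    }
    rewind(fp);

    /* One extra byte for the terminating '\0'. */
    buf = malloc((size_t)size + 1);
    if (buf == NULL) {
        fclose(fp);
        return NULL;
    }

    /* fread() does not stop at '\n', unlike fgets(). */
    got = fread(buf, 1, (size_t)size, fp);
    fclose(fp);

    buf[got] = '\0';
    if (out_len != NULL)
        *out_len = got;
    return buf;
}
```

Call it as char *text = read_whole_file("names54k.txt", &n); and free(text) when you are done. You still have to split the string into words before you can look anything up.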
>>Is there a way to keep this dictionary in a variable in C
Use a linked list of words. AFAIK there is no limit to the number of words that can be stored in memory that way, except of course the amount of available RAM and hard drive swap space. Hopefully, you are not foolish enough to use a 16-bit compiler such as Turbo C.
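A minimal sketch of the linked-list idea, for illustration only. The names word_node, list_push, list_contains and list_free are hypothetical, and new words are pushed on the front of the list, so they end up in reverse file order.

```c
#include <stdlib.h>
#include <string.h>
#include <assert.h>

/* One node per word; each word gets its own malloc'd copy. */
struct word_node {
    char *word;
    struct word_node *next;
};

/* Push a copy of word on the front of the list.
   Returns 1 on success, 0 if memory ran out. */
int list_push(struct word_node **head, const char *word)
{
    struct word_node *n;

    assert(head != NULL && word != NULL);

    n = malloc(sizeof *n);
    if (n == NULL)
        return 0;
    n->word = malloc(strlen(word) + 1);
    if (n->word == NULL) {
        free(n);
        return 0;
    }
    strcpy(n->word, word);
    n->next = *head;
    *head = n;
    return 1;
}

/* Linear search: returns 1 if word is in the list, 0 otherwise. */
int list_contains(const struct word_node *head, const char *word)
{
    for (; head != NULL; head = head->next)
        if (strcmp(head->word, word) == 0)
            return 1;
    return 0;
}

/* Release every node and its word. */
void list_free(struct word_node *head)
{
    while (head != NULL) {
        struct word_node *next = head->next;
        free(head->word);
        free(head);
        head = next;
    }
}
```

Start with struct word_node *head = NULL; and call list_push(&head, buff) for each line you read. Note that every lookup walks the list from the front.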
Ancient Dragon
Achieved Level 70
32,275 posts since Aug 2005
Reputation Points: 5,852
Solved Threads: 2,591
Skill Endorsements: 70
Hmm... keeping the dictionary in an external file seems like the best approach to me. Correct me if I'm wrong.
Regarding your first idea, it should work with a long int. Correct me if I'm wrong again.
creeps
Junior Poster in Training
82 posts since Jul 2010
Reputation Points: 85
Solved Threads: 8
Skill Endorsements: 0
>Hmm... keeping the dictionary in an external file seems like the best approach to me. Correct me if I'm wrong.
How do you suggest looking up a word? Search the file each time? Unless it's a properly designed external database, that's going to be very inefficient. The words will likely all fit in memory, so an internal data structure makes more sense here.
>But to solve this problem he can have all the strings on one line.
Once again, looking up a word is tedious and potentially inefficient. You'd have to parse out the words every time, or store them in a separate data structure. But if you're using a separate data structure anyway, just use that as the primary storage medium.
>Use a linked list of words.
I'd use a balanced binary search tree or a chained hash table. Since this is a dictionary, we can expect lookup to be the primary operation. Searching a linked list of 50k+ words may not be the best approach. Of course, you can try amortizing the performance with tricks like moving the most recent match to the front, but I think a data structure better suited to searching is the superior option.
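For what it's worth, here is a minimal sketch of the chained hash table option. Everything in it is an assumption made for illustration: the names (ht_insert, ht_contains, ht_free), the fixed bucket count, and the choice of the djb2-style string hash. A real dictionary would tune the bucket count to the number of words.

```c
#include <stdlib.h>
#include <string.h>
#include <assert.h>

#define NBUCKETS 8192u  /* assumed size; pick one to suit the word count */

struct ht_node {
    char *word;
    struct ht_node *next;   /* next word in the same bucket's chain */
};

static struct ht_node *buckets[NBUCKETS];

/* djb2-style hash (h = h * 33 + c), reduced to a bucket index. */
static unsigned long ht_hash(const char *s)
{
    unsigned long h = 5381;

    while (*s != '\0')
        h = h * 33 + (unsigned char)*s++;
    return h % NBUCKETS;
}

/* Returns 1 if word is stored, 0 otherwise. Only one chain is searched. */
int ht_contains(const char *word)
{
    const struct ht_node *n;

    assert(word != NULL);
    for (n = buckets[ht_hash(word)]; n != NULL; n = n->next)
        if (strcmp(n->word, word) == 0)
            return 1;
    return 0;
}

/* Returns 1 if word is stored afterwards, 0 if memory ran out. */
int ht_insert(const char *word)
{
    unsigned long b;
    struct ht_node *n;

    assert(word != NULL);
    if (ht_contains(word))
        return 1;               /* already present, nothing to do */

    n = malloc(sizeof *n);
    if (n == NULL)
        return 0;
    n->word = malloc(strlen(word) + 1);
    if (n->word == NULL) {
        free(n);
        return 0;
    }
    strcpy(n->word, word);

    b = ht_hash(word);
    n->next = buckets[b];       /* chain the new node at the bucket's head */
    buckets[b] = n;
    return 1;
}

/* Release every chain and leave the table empty. */
void ht_free(void)
{
    unsigned long b;

    for (b = 0; b < NBUCKETS; b++) {
        struct ht_node *n = buckets[b];
        while (n != NULL) {
            struct ht_node *next = n->next;
            free(n->word);
            free(n);
            n = next;
        }
        buckets[b] = NULL;
    }
}
```

With 50k+ words spread over a few thousand buckets, a lookup compares against a handful of words instead of walking the whole list.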
Narue
Bad Cop
15,460 posts since Sep 2004
Reputation Points: 6,483
Solved Threads: 1,408
Skill Endorsements: 55
I just tried this for fun in Turbo C, using an array of char pointers:
char *names[SIZE];
then getting the length of each name, and malloc'ing the memory for it, and putting it into the names[i++] position.
Using short names, TC was limited to less than 5k names (4,200 average).
Which means that if the OP is trying to do this with TC, it won't work, no matter which internal data structure is chosen.
Which brings me back to creeps' suggestion of simply keeping the names on a HD.
Although HDs can keep a lot of data in their cache, a better idea would be to use a virtual drive in memory, like a RAM disk. I used one of these last year to avoid disk thrashing while running a big project, and it worked out very well.
It is a shame to keep using a 16 bit compiler for work like this, when you have a 32 or 64 bit OS, with Gigs of RAM available, however.
Adak
Posting Virtuoso
1,641 posts since Jun 2008
Reputation Points: 456
Solved Threads: 196
Skill Endorsements: 7
@Adak: I'm pretty sure you probably know this, but for those who don't: The reason it failed is that Turbo C is a 16-bit compiler and the programs it produces are limited to 640 Meg RAM, minus the amount needed for the operating system and other drivers. It actually winds up somewhere between 450-540 meg.
Ancient Dragon
Oh, I know it well.
I put that up to help the OP see that he can't store 54k names in ANY internal data structure, if he's using turbo C. Not gonna happen. ;)
That leaves external storage as the only viable option.
Adak
>@Adak: I'm pretty sure you probably know this, but for those who don't: The reason it failed is that Turbo C is a 16-bit compiler and the programs it produces are limited to 640 Meg RAM, minus the amount needed for the operating system and other drivers. It actually winds up somewhere between 450-540 meg.
AD, you meant kilobytes, right?
mvmalderen
Posting Maven
2,612 posts since Feb 2009
Reputation Points: 2,221
Solved Threads: 281
Skill Endorsements: 36
We both did.
After I fire up the Turbo C IDE, I show just 403K available - yeeeee! ;)
Adak
If your Borland is like my Turbo C from Borland, then you can't do it. I don't care how many data structs you use. You don't get more memory because you use more data structures. ;)
Global memory comes out of the data segment, and local arrays (like the one in my test below) come off the stack. Both are limited. Try the heap, it's bigger. (malloc uses the heap for its memory source, not the stack.)
The way to go with it is to use a newer compiler. MS Visual Express, Pelles C, gcc, and Code::Blocks with MinGW are all compilers that will allow you to enjoy larger memory access.
I love Turbo C, but for large amounts of memory - phffffftttt! :(
This is the program I wrote to test it:
/* Tries to load all 54,000 names from the names54k.txt file into an
   array of pointers, where the memory for each name is malloc'd.
   This was done on Turbo C, ver. 1.01.
*/
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define SIZE 4000
/* SIZE will vary depending on the length of the words
   that are being saved from the file.
*/

int main(void) {
    unsigned int i, j;
    size_t len;
    int status = 0;
    char *pstr[SIZE];
    char buff[40] = "";
    FILE *fp;

    printf("\n\n");
    fp = fopen("names54k.txt", "r");
    if (fp == NULL) {
        printf("\nError opening names file");
        return 1;
    }

    i = 0;
    /* Stop at SIZE so we never write past the end of pstr[]. */
    while (i < SIZE && fgets(buff, sizeof(buff), fp)) {
        len = strlen(buff);
        /* Strip the trailing newline, if fgets() stored one. */
        if (len > 0 && buff[len - 1] == '\n')
            buff[--len] = '\0';
        /* len + 1 leaves room for the terminating '\0'. */
        if ((pstr[i] = malloc(len + 1)) == NULL) {
            printf("\nError allocating memory: i==%u", i);
            status = 1;
            break;
        }
        strcpy(pstr[i], buff);
        printf("\n%s", pstr[i]);
        ++i;
    }
    fclose(fp);

    /* Only pstr[0] through pstr[i-1] were allocated. */
    for (j = 0; j < i; j++)
        free(pstr[j]);

    printf("\n\n\t\t\t press enter when ready");
    getchar();
    return status;
}
Adak