I need to save each word of a text file into an array of strings.
I have this code. How do I convert it so the array is allocated dynamically?

 #include <stdio.h>
  #include <conio.h> 
 #define MAX_CHARS 20

 typedef char string[MAX_CHARS+1];  // leave one space for '\0'

main(){
int i;
string array[4];

 FILE *data;
 data = fopen("ard.txt","r");
 for(i = 0; i < 4; i++)
    { fscanf(data, "%s", array[i]); } // no need for & with %s
       printf("%s",array[3]);
getch();
 }

I note that you have both C and C++ tags on this. I assume you are using a C compiler. If C++, then consider using a vector of C++-style strings. vector and C++-style strings are both dynamic, but you are not responsible for keeping track of the memory.

Assuming this is in fact C, malloc and realloc from stdlib.h are what you need. Something like this, expanding on your code.

 #include <stdio.h>
 #include <stdlib.h>
 #define MAX_CHARS 20

typedef char string[MAX_CHARS+1];  // leave one space for '\0'

int main()
{
   int i, num_strings = 0, array_size = 4;
   FILE *data;
   string* array = (string*) malloc(array_size * sizeof(string));
   if(array == NULL)
   {
       return 1; // error allocating memory
   }

   if((data = fopen("ard.txt","r")) == NULL) return 1; // error opening file
   while(fscanf(data, "%s", array[num_strings]) == 1)
   {
       num_strings++;
       if(num_strings >= array_size) 
       {
           array_size *= 2;
           array = (string*) realloc(array, array_size * sizeof(string));
           if(array == NULL)
           {
               return 1; // error allocating memory
           }
       }           
   }
   for(i = 0; i < num_strings; i++)
   {
       printf("%s\n", array[i]);
   }
   free(array); // dispose of the memory reserved for array since we don't need it anymore
   getchar();
   return 0;
}

Note that I got rid of getch() from conio.h and replaced it with getchar() from stdio.h. conio.h is a non-standard, DOS-era header that many compilers do not provide.

Hey man, this is what I needed and I appreciate it very much, but I need a brief explanation of the code.

array_size is the number of elements in the array, which I made 4 to match your code. Each element of the array is 21 bytes (20 + 1), as in your code. So the total size is 4 times 21 = 84 bytes once the malloc call has executed, the same as in your code. The only difference is that with malloc it is 84 bytes of dynamically allocated memory, whereas in your code it was an 84-byte fixed-size local array (automatic storage, typically on the stack). Or I'll say AT LEAST 84 bytes. Very often, malloc will reserve more memory than you ask for. If something goes wrong and malloc cannot give us at least 84 bytes, it returns NULL. That's our cue to abort the program with a failure code (i.e. the "return 1" right after the malloc call, instead of returning 0). If everything goes well, malloc returns the address of a block of at least 84 bytes, which we cast to string* because that's the type of the elements stored there. We now have room to store 4 words, as in your program, so it's the same so far.

Note that the fscanf call does nothing to confirm that the word being read into the array element will fit in a 21-byte buffer. I kept that as-is because that's the way it is in your original code and your question did not relate to that. Consider giving fscanf a field width ("%20s"), or using fgets in place of fscanf, so you can specify a buffer length if necessary.

http://www.cplusplus.com/reference/cstdio/fgets/
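
If you stay with fscanf, a field width does the same job. Here is a minimal sketch, not part of the code above: split_words is a hypothetical helper, and it uses sscanf on an in-memory string only so the example stands alone (fscanf accepts the same "%20s" format). The width 20 is written out by hand and has to match MAX_CHARS.

```c
#include <stdio.h>

#define MAX_CHARS 20

/* Copies up to max_words whitespace-delimited words from text into words[].
   Returns the number of words stored. Hypothetical helper for illustration. */
size_t split_words(const char *text, char words[][MAX_CHARS + 1], size_t max_words)
{
    size_t count = 0;
    int consumed = 0;

    /* "%20s" stores at most 20 characters plus the terminating '\0';
       "%n" records how many characters of text were consumed so far. */
    while (count < max_words &&
           sscanf(text, "%20s%n", words[count], &consumed) == 1)
    {
        text += consumed;   /* continue after the word just read */
        count++;
    }
    return count;
}
```

A word longer than 20 characters is not lost with this approach; it is split across two array elements, which may or may not be what your project wants.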

OK, back to the code. So I replaced your for-loop with a while loop that breaks out of the loop when there are no more words to read (i.e. fscanf does not return 1). Again, to match your original code, I made the original array size the same as yours (4). So the first four trips through the loop result in the words being read into array[0], array[1], array[2], array[3], just like your code. num_strings keeps track of how many words get read in. The fifth word needs to be read into array[4], which is beyond the guaranteed length of our buffer.

So in the if statement inside the loop, when num_strings is 4, it is compared to array_size, which is also 4. To read in another word, we need more space. I doubled array_size just before the realloc call, but you can triple it or whatever you want, as long as you request more space. Most of the examples I have seen double the space requested in cases like this, so that's what I did. After the doubling, array_size is 8, so I am requesting 8 times 21 bytes, or 168 bytes total for my new array.

I call realloc to request that space. realloc will first try to extend my existing block from 84 bytes to 168 bytes in place. If it is able to do that, it does not create a new buffer; it returns a pointer to the same block, which is now guaranteed to be at least 168 bytes, and I assign that to the array variable. If it was NOT able to extend my original buffer, it gives me a NEW buffer, copies the contents of the old buffer into it, releases the old one, and returns a pointer to the new buffer. So regardless of how it does it, array now points to a buffer that is at least big enough to hold 8 words (168 bytes total) and contains the words already read in. If for whatever reason realloc was unable to do that, it returns a NULL pointer. We check for that, and if array is NULL, we abort with an error code.

array_size is now 8, so we read the next four words into the array, so num_strings is now 8, the same as array_size. We have run out of room in this array, so we try to double the size again with realloc. Then we read in the next 8 words. Then double the array size again with realloc. Etcetera till we are done with the file, or if it is a huge file, run out of memory and abort.
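
One refinement worth knowing about, which the code above gets away without because it exits right away on failure: when realloc fails, the old block is still allocated, so assigning the result straight to array loses the only pointer to it. A sketch of the usual pattern, with grow_array as a made-up helper name:

```c
#include <stdlib.h>

#define MAX_CHARS 20
typedef char string[MAX_CHARS + 1];

/* Doubles the capacity of *array. Returns 1 on success, 0 on failure.
   On failure *array and *capacity are left unchanged, so the caller
   can still use or free the old block. Hypothetical helper. */
int grow_array(string **array, size_t *capacity)
{
    size_t new_capacity = *capacity * 2;
    string *tmp = realloc(*array, new_capacity * sizeof(string));

    if (tmp == NULL)
        return 0;            /* old block is untouched */

    *array = tmp;            /* only overwrite the pointer once we know it worked */
    *capacity = new_capacity;
    return 1;
}
```

In a program that keeps running after an allocation failure, this lets you print what you have read so far and then free it.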

Generally, behind the scenes, malloc and realloc round each request up to whatever size is convenient for the allocator to hand out (the exact granularity depends on the implementation). You don't need to care about any of that since malloc and realloc handle it for you.

And as mentioned, if you are using C++ instead of C, C++'s vector will handle all that for you. In C, you are responsible for that yourself.

When you are done with the memory, it's always good to "free" the memory so it can be reused elsewhere, so that's the free(array) call near the end.

Well, I have another question.
Now I need to know whether the strings in the array are numbers or words, but when I test them with isdigit() and isalpha()
I do not get the result I want. This is what I tried:

             for(i=0;i<num_strings;i++)
             {
            if(isdigit(array[i]))
        {
         printf("%s number\n",array[i]);
         }
     else  if(isalpha(array[i]))
        {
         printf("%s number\n",array[i]);
         }
         }

My text file is this:

           jose     fernado jose
        1234
        leonardo
       gonzalez    <=   <

        777     ,isl        ,   .    {   }

The output is this, but it is wrong because the symbol "," is not a number:

, number <----------------this is the output

I would like this:

jose .............word
gonzalez ............word
1234................number

For this thread, I only commented on how to do what you asked (i.e. read the file into a dynamic array of strings delimited by white space (tabs, spaces, newlines)), which is what fscanf with "%s" does.

As to your overall project, I posted in your previous thread about the need to get the algorithm/process/strategy and "rules" established first, before coding. I believe that the strategy of reading everything from the file into an array of strings delimited by white space before doing any parsing will probably not work for your project. I'll reiterate my point from your last thread. The coding is generally done LAST. It doesn't appear that you have picked an algorithm yet.

What WILL work for your project is something that I DON'T know because I don't know the exact spec of your project, I don't know what assumptions you need to make, and I don't know your coding and algorithm experience. Generally when creating a lexer/parser, one reads data in a character at a time and one has to make a decision whether there is an error and whether the character is part of a new token or is a continuation of the last token based on the spec. Depending on the "language", very often a stack is involved. But not always.
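
To make the character-at-a-time idea concrete, here is a rough sketch. The rules in it are invented purely for illustration (a word is a run of letters, a number is a run of digits, any other non-space character is a token by itself); they are NOT your project's rules, and next_token and token_kind are hypothetical names.

```c
#include <ctype.h>
#include <stddef.h>

typedef enum { TOK_WORD, TOK_NUMBER, TOK_SYMBOL, TOK_END } token_kind;

/* Reads one token starting at *cursor, reports where it begins and how long
   it is, and advances *cursor past it. The token rules are assumptions made
   for illustration only. */
token_kind next_token(const char **cursor, const char **start, size_t *length)
{
    const char *p = *cursor;
    token_kind kind;

    while (isspace((unsigned char)*p))      /* skip delimiters */
        p++;

    *start = p;
    if (*p == '\0') {
        kind = TOK_END;
    } else if (isalpha((unsigned char)*p)) {
        while (isalpha((unsigned char)*p))  /* continuation of a word */
            p++;
        kind = TOK_WORD;
    } else if (isdigit((unsigned char)*p)) {
        while (isdigit((unsigned char)*p))  /* continuation of a number */
            p++;
        kind = TOK_NUMBER;
    } else {
        p++;                                /* any other character stands alone */
        kind = TOK_SYMBOL;
    }

    *length = (size_t)(p - *start);
    *cursor = p;
    return kind;
}
```

Because the function hands back where the token starts and how long it is, the caller can print the token text next to its kind, for example with printf("%.*s ----- number\n", (int)length, start).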

No one can tell you how to parse the input file you list because no one but you knows what the rules are for what a "word" is versus a "number" versus an "identifier" versus a "keyword" or an "operator" or what an error is. You have C and C++ tags on this thread, but the input file clearly will not parse as a C or a C++ program, so I can't assume anything about what the language to be parsed is, what a legal "word" or a legal "number" is, or what might delimit the different tokens. "Number" is a bit ambiguous. "unsigned int" and "int" and "double" will all have different parsing rules, which you have not provided. And I still don't know whether you are WRITING this in C or C++. C++ comes with vectors and stacks and strings and a bunch of stuff that C doesn't have that makes a lot of this much easier. Use them if you can.

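You are also calling isdigit and isalpha on whole strings. They are designed to take a single character, so array[i] (a pointer to the first character) is the wrong kind of argument; array[i][0] would be a character.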

As for your problem regarding displaying a non-number as a number, note that the two printf statements in your snippet are identical. "number" is also printed when the else-if test with isalpha is the one that succeeds.

Nope, it does not work. I think it is because array[i] is a string and isdigit takes an int, so how do I do this? How do I check every string in the array? I need help to complete my homework.

How do I do this?

I'll repeat. No one can tell you how to do "this" because no one but you knows what "this" is. By "this", I mean EXACTLY what are the rules for what a "number" is, what are the rules for what a "word" is, etc., etc.? I've spelled out what you need to provide to get help in previous posts in this and other threads. The fact is that even if I wanted to do the whole project for you, I could not because I don't know the project spec. For example, you give an input file with letters, digits, commas, periods, less-than signs, an equals sign and braces. What EXACTLY is the output supposed to be for this? How about abcd5? Is that a "word"? How about -12.4? Is that a "number"? How about "12,34"? Is that a single token that is neither a number nor a word, or is that three different tokens: first "12", a number, then the comma, which is an operator, then "34", which is another number? Etcetera. You still have not answered whether you are writing this in C or C++.

At any rate, regarding how to use isdigit and isalpha correctly, if a "number" is anything that is all digits and a "word" is anything that is all letters and anything else is neither, you need to iterate through the entire string, character by character, as below...

 #include <stdio.h>
 #include <stdlib.h>
 #include <string.h>
 #include <ctype.h>
 #include <stdbool.h>  // bool, true, false (needed when compiling as C)
 #define MAX_CHARS 20

typedef char string[MAX_CHARS+1];  // leave one space for '\0'

int main()
{
   int i, num_strings = 0, array_size = 4;
   FILE *data;
   string* array = (string*) malloc(array_size * sizeof(string));
   if(array == NULL)
   {
       return 1; // error allocating memory
   }

   if((data = fopen("ard.txt","r")) == NULL) return 1; // error opening file
   while(fscanf(data, "%s", array[num_strings]) == 1)
   {
       num_strings++;
       if(num_strings >= array_size) 
       {
           array_size *= 2;
           array = (string*) realloc(array, array_size * sizeof(string));
           if(array == NULL)
           {
               return 1; // error allocating memory
           }
       }           
   }
   for(i = 0; i < num_strings; i++)
   {
       bool isNumber = false, isWord = false;
       int j, stringLength = strlen(array[i]);
       printf("%s ------- ", array[i]);
       if(isdigit(array[i][0]))
       {
           isNumber = true;
           for(j = 1; j < stringLength; j++)
           {
               if(!isdigit(array[i][j]))
               {
                   isNumber = false;
                   break;
               }
           }
       }
       else if(isalpha(array[i][0]))
       {
           isWord = true;
           for(j = 1; j < stringLength; j++)
           {
               if(!isalpha(array[i][j]))
               {
                   isWord = false;
                   break;
               }
           }
       }
       if(isNumber)
           printf("number\n");
       else if(isWord)
           printf("word\n");
       else
           printf("neither number nor word\n");
   }
   free(array); // dispose of the memory reserved for array since we don't need it anymore
   getchar();
   return 0;
}
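
One caveat about the code above: isdigit and isalpha expect a value in the range of unsigned char (or EOF), so a plain char holding a byte above 127 can misbehave on platforms where char is signed. Below is a sketch of the same all-digits / all-letters test pulled into a function with the cast added. classify and the enum names are my own, not from any library.

```c
#include <ctype.h>

typedef enum { KIND_NEITHER, KIND_NUMBER, KIND_WORD } kind;

/* Returns KIND_NUMBER if s is all digits, KIND_WORD if s is all letters,
   and KIND_NEITHER otherwise (including the empty string). */
kind classify(const char *s)
{
    int all_digits = 1, all_letters = 1;

    if (*s == '\0')
        return KIND_NEITHER;

    for (; *s != '\0'; s++) {
        unsigned char c = (unsigned char)*s;  /* safe argument for <ctype.h> */
        if (!isdigit(c))
            all_digits = 0;
        if (!isalpha(c))
            all_letters = 0;
    }

    if (all_digits)
        return KIND_NUMBER;
    if (all_letters)
        return KIND_WORD;
    return KIND_NEITHER;
}
```

The printing loop then only needs to call classify(array[i]) and pick the label from the result.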

Hey again. Excuse me, I'd like to know whether I'm on the right path. I'm writing a scanner that reads from a file, but in my tests, when I look for the double-quote character ("), the pointer I use to walk through the string holding the file contents ends up on the character to the right of the quotation mark and not on the quote I'm looking for.

 char c;
 int state=0;
 int main()

  {
   char text[1000];
 char *a=text;
 FILE *fp=fopen("ard.txt", "r");
int i=0;
while(!feof(fp)){
 text[i++] = fgetc(fp);}
text[i]='\0';

 while(*a!='\0')
 {
switch(state)
{
 case 0: if(*a=='"') state=1;
 a++;
 break;

case 1:

    printf("%c comilla\n",*a);
    state=0;
    break;
 }
 }
}

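Looks like you are incrementing a regardless of whether *a is a quotation mark. If *a is a quotation mark and you want state 1 to print it, don't increment a after you find it. State 1 then has to increment a itself after printing, otherwise the loop will keep finding the same quotation mark forever. Try an if-else...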

 case 0: if(*a=='"') state=1;
 else a++;
 break;
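
Here is how that if-else might look inside the whole loop, as a sketch with state 1 consuming the quotation mark after printing it. report_quotes and the found counter are additions of mine so the loop can be called and checked; they are not in your code.

```c
#include <stdio.h>

/* Walks text with the two-state machine from the thread and returns how many
   double quotes it reported. Hypothetical wrapper for illustration. */
int report_quotes(const char *text)
{
    const char *a = text;
    int state = 0;
    int found = 0;

    while (*a != '\0') {
        switch (state) {
        case 0:
            if (*a == '"')
                state = 1;   /* stay on the quote so state 1 can see it */
            else
                a++;
            break;
        case 1:
            printf("%c comilla\n", *a);  /* *a is the quote itself here */
            found++;
            a++;             /* consume the quote so the loop can move on */
            state = 0;
            break;
        }
    }
    return found;
}
```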

Hey again. I have finished my scanner except for one small mistake. So far it identifies everything correctly, but I need to print each token followed by its name,
for example: 345 ----- number
So far it correctly recognizes integers and numbers with a decimal point, but when I want to show the 345 itself, it only prints that the token is a number and not the number.

#include<stdlib.h>
#include<stdio.h>
#include<ctype.h>
#include<conio.h>
#include<string.h>

void main()
{
int estado=0,x=0;
char entrada[1000];
 char iden[20];
char *p =entrada;
int i=0;
int state=0;
char *boid = "VOID";
char *prin= "PRINTF";
char *scan= "SCANF";

FILE *fp=fopen("ard.txt", "r");
while(!feof(fp)){
     entrada[i++] = fgetc(fp);}
entrada[i]='\0';

while(*p!='\0')
{
    switch(estado)
    {

case 0:
        if(*p==' '||*p=='\t'||*p=='\n')
            {
            estado=0;
            }
        else if(isalpha(*p)){
iden[x]=toupper(*p);
x++;
estado=1;}
else if(*p=='{') estado=3;
else if (*p=='}') estado=4;
else if (*p=='"') estado=5;
else if(*p=='=') estado=6;
else if(*p=='(') estado=7;
else if (*p==')') estado=8;
else if (*p=='*') estado=9;
else if (*p=='/') estado=10;
else if (*p=='+' || *p=='-') estado=11;
else if (isdigit(*p)) estado=12;
else if (*p==' ') estado=21;
else estado=22;
p++;
break;
case 1: if(isalpha(*p)) {
iden[x]=toupper(*p);
x++;
estado=1; }

else {estado=2; iden[x]='\0'; }
p++;
break;
case 2: if(strcmp(iden,boid)==0) printf(" reserved ");
else if(strcmp(iden,prin)==0) printf(" reserved ");
else if(strcmp(iden,scan)==0) printf(" reserved ");
else printf(" IDEN\n",*p);
x=0;
estado=0; //p--;
break;
case 3: printf(" LI\n");
estado=0; //p--;
break;
case 4: printf(" LLD\n");
estado=0; //p--;
break;
case 5: while (*p!='"')
{p++;}
p++;
printf(" CADENA\n");
estado=0; //p--;
break;
case 6: printf(" IGU\n");
estado=0; //p--;
break;
case 7: printf(" PIZQ\n");
estado=0; //p--;
break;
case 8: printf(" PDER\n");
estado=0; //p--;
break;
case 9: printf(" MULT\n");
estado=0; //p--;
break;
case 10: printf(" DIV\n");
estado=0; //p--;
break;
case 11: if(isdigit(*p)) estado=12;
else
{printf(" FALTA DIGITO\n");
exit(1);}
p++;
break;
case 12: if(isdigit(*p)) estado=12;
else if(*p=='.') estado = 13;
else if(*p=='E' || *p=='e') estado= 15;
else estado= 19;
p++;
break;
case 13: if(isdigit(*p)) estado=14;
else
{printf(" FALTA DIGITO");
//p++;
exit(1);}
p++;
break;
case 14: if (isdigit(*p)) estado=14;
else if(*p=='E' || *p=='e') estado= 15;
else estado=20;
p++;
break;
case 15: if (*p=='+' || *p=='-') estado=16;
else if(isdigit(*p)) estado=17;
else {printf(" FALTA DIGITO");
exit(1);}
p++;
break;
case 16: if(isdigit(*p)) estado=17;
else {printf(" FALTA DIGITO");
exit(1);}
p++;
break;
case 17: if(isdigit(*p)) estado=17;
else estado=18;
p++;
break;
case 18: printf(" NUMEXP\n");
estado=0; p--;
break;
case 19: printf(" ENT\n");
estado=0; p--;
break;
case 20: printf(" DECI\n");
estado=0; p--;
break;
case 21: printf(" ");
estado=0; //p--;
break;
case 22: printf("ERROR LEXICO\n");
estado=0; p--;
break;
}//switch

}//while

switch(estado)
{
case 1:
case 2: printf(" IDEN"); break;
case 3: printf(" LLI"); break;
case 4: printf(" LLD"); break;
case 5:printf(" CADENA"); break;
case 6: printf(" IGU"); break;
case 7: printf(" PIZ"); break;
case 8: printf(" PDE"); break;
case 9: printf(" MUL"); break;
case 10: printf(" DIV"); break;
case 11:
case 12: printf(" ENT"); break;
case 13:
case 14: printf(" DECI"); break;
case 15:
case 16:
case 17:
case 18: printf(" NUMEXP"); break;
case 19: printf(" ENT"); break;
case 20: printf(" DECI"); break;
}

getch();
}

My test file at the moment is this:

using namespace std;

int main() {

    cout  "Hola Mundo"  endl

    return 0

}

My output is:

ENT
Deci

That is my output: it detects every token, but I need each detected token to be printed followed by its identifier.

I just want to print the token read from the file paired with its identifier. I tried passing the pointer *p to printf together with the identifier, but it does not show the token.

I need this:

     444 ----------number

I'm seeing a bad printf statement in the else branch of case 2: printf(" IDEN\n",*p). I see a *p argument passed to printf, but I don't see any conversion specifier inside the quotes that would take that argument. I'm guessing you want printf("%s ----- ", iden) as the first statement in case 2. That will display the token text for the tokens which are all letters. That might solve some of your problems, but not all. You have a state machine and go character by character, which is good, but I think you need to revisit your overall algorithm. I don't see a quick fix for this code to make it do what you want.
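
Separately from the printf issue, the loop that fills entrada tests feof before reading, so the EOF value returned by the last fgetc is stored in the buffer as if it were a character, and nothing stops i from running past 1000. A sketch of a safer read loop, assuming a hypothetical helper called read_all:

```c
#include <stdio.h>

/* Reads at most size - 1 characters from fp into buf and terminates it.
   Returns the number of characters stored. Hypothetical helper. */
size_t read_all(FILE *fp, char *buf, size_t size)
{
    size_t i = 0;
    int c;   /* int, not char, so EOF can be told apart from real data */

    if (size == 0)
        return 0;

    /* Check the result of fgetc itself instead of calling feof first,
       and stop while there is still room for the terminating '\0'. */
    while (i + 1 < size && (c = fgetc(fp)) != EOF)
        buf[i++] = (char)c;

    buf[i] = '\0';
    return i;
}
```

With this, the scanner would call read_all(fp, entrada, sizeof entrada) after checking that fopen did not return NULL.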
