Hey guys,

I have the following code, but for some reason it does not work with bigger txt files. Is this because of the size of the array?

And if so, is there a simple way to solve the problem?

Thanks in advance for any help.

char ch[1000];
int v_word = 0;
int c_word = 0;
ifstream file("c:/test.txt");

if (file.is_open())
{
    while (file.getline(ch, 1000))
        if (strcmp(ch, "int") == 0)
            v_word++;
}
	
cout<<v_word<<endl;
cout<<c_word<<endl;
file.close();

It should work with text files of any size. Are any lines longer than 1,000 bytes? What is the purpose of checking whether one of the lines equals "int"? Is there anything else following "int" on that line?
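One possible explanation for the "bigger files" symptom, assuming the larger files contain a line that does not fit in the buffer: istream::getline with a char array sets failbit when it runs out of room before reaching the newline, so the while loop stops at that line and nothing after it is read. A minimal sketch of that behaviour follows. The helper names are made up for illustration, and an istringstream with a 16-byte buffer stands in for the file and the 1000-byte array.

```cpp
#include <cstddef>
#include <sstream>
#include <string>

// Hypothetical helper: counts the lines a fixed-size buffer loop reads from `text`.
std::size_t count_lines_fixed(const std::string& text)
{
    std::istringstream in(text);   // stands in for the ifstream in the thread
    char buf[16];                  // deliberately tiny; the thread uses 1000
    std::size_t lines = 0;
    // istream::getline sets failbit when a line does not fit in buf,
    // so the loop ends at the first over-long line.
    while (in.getline(buf, sizeof buf))
        ++lines;
    return lines;
}

// Hypothetical helper: the same loop with std::string, which grows to fit each line.
std::size_t count_lines_string(const std::string& text)
{
    std::istringstream in(text);
    std::string line;
    std::size_t lines = 0;
    while (std::getline(in, line))
        ++lines;
    return lines;
}
```

Given three lines where the second is longer than the buffer, count_lines_fixed stops after the first line, while count_lines_string reads all three.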

This is a C++ program, so why don't you use std::string instead of C-style arrays?

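#include <fstream>
#include <string>
...
std::string line;
int v_word = 0;
int c_word = 0;
ifstream file("c:/test.txt");

if (file.is_open())
{
    // std::getline grows the string to fit each line, so line length is not limited
    while (getline(file, line))
        // find() returns std::string::npos when "int" is not in the line
        if (line.find("int") != std::string::npos)
            v_word++;
}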

Hi, thanks for your reply. The purpose of the "int" check was just so I could count how many times it appears in the file; my full version also searches for other words. Could this be why it doesn't work on larger files?
I will look into strings. :)
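For anyone looking into the std::string route: find() accepts a starting position, so it can count every occurrence in a line rather than just whether the line contains the word. This is only a sketch; count_occurrences is a made-up name, and it counts plain substrings, so "print" still counts as containing "int".

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: counts non-overlapping occurrences of `word` in `line`.
std::size_t count_occurrences(const std::string& line, const std::string& word)
{
    if (word.empty())
        return 0;                       // avoid looping forever on an empty search term

    std::size_t count = 0;
    std::size_t pos = 0;
    // find() returns std::string::npos once there are no more matches
    while ((pos = line.find(word, pos)) != std::string::npos)
    {
        ++count;
        pos += word.size();             // resume the search just past this match
    }
    return count;
}
```

Calling it once per line inside the getline loop and adding the result to v_word would give a per-file total.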

What do you mean by "it doesn't work"? What doesn't work? strcmp() compares the entire line with the string literal you pass, which in your example is "int". So if the line is "int some more stuff here", strcmp() will not report a match. If you only want to compare the first three characters, then use strncmp(), or if "int" can appear anywhere in the line, use strstr(). You also have to be careful with the comparison, because strstr() will find "int" in "intabc" and "abcintabc" just as it does in " int ". So if strstr() finds the text "int", the program should check for a whitespace character (or the start or end of the line) immediately before and after it. If there is no whitespace before and after, then strstr() did not find a word.
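The whitespace check described above could look roughly like this. It is a sketch under a few assumptions: count_whole_word is a made-up helper, a "word" is taken to be bounded by whitespace or by the start or end of the line, and punctuation is not treated as a boundary (so "int;" would not count).

```cpp
#include <cctype>
#include <cstddef>
#include <cstring>

// Hypothetical helper: counts occurrences of `word` in `line` that are
// bounded by whitespace or by the start/end of the line.
int count_whole_word(const char* line, const char* word)
{
    const std::size_t len = std::strlen(word);
    if (len == 0)
        return 0;                       // an empty word would match everywhere

    int count = 0;
    const char* match = line;
    while ((match = std::strstr(match, word)) != nullptr)
    {
        // The character before the match must be whitespace, unless the
        // match is at the very start of the line.
        const bool start_ok = (match == line) ||
            std::isspace(static_cast<unsigned char>(match[-1]));
        // The character after the match must be whitespace or the terminator.
        const bool end_ok = (match[len] == '\0') ||
            std::isspace(static_cast<unsigned char>(match[len]));
        if (start_ok && end_ok)
            ++count;
        ++match;                        // keep searching one character further on
    }
    return count;
}
```

With this, "intabc" and "abcintabc" give 0 for "int", while " int " and "int some more stuff here" give 1.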

Hi, thanks for your reply. I have tried to use strstr() but I keep getting the wrong word count.

For example, when all the txt file contains is "hello james", it gives a count of 0 for both words.

ifstream file("c:\\test.txt");

	if (file.is_open()) {
		while (file.getline(ch, 10000))
		{
			if(strstr(ch, "hello") == 0)
				v_word++;

			if(strstr(ch, "james") == 0)
				c_word++;
		}
	}
if(strstr(ch, "hello") == 0)

The return value of strstr is a pointer to the match or a NULL pointer when no match was found. Here you are checking for a NULL pointer, meaning when no match was found, so you would be incrementing your counter at the wrong time.

Try this to find at least one occurrence per line.

if(strstr(ch, "hello") != 0)

Hi Dave, Thank you for your reply.

Is what you said also true for the strcmp() function?

Thanks.

That's what I thought, but when I was testing it, if I used the code:

if(strcmp(ch, "hello") == 0)

I got the wrong number of times the word occurred.

But if I used the following code for strcmp():

if(strcmp(ch, "hello") != 0)

it works. But with strcmp(), shouldn't it have worked with the first piece of code?

If the whole line is just "hello", then doing a strcmp against "hello" will be 0. Otherwise it won't be ("hello james" is not the same as "hello").
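To put the two functions side by side: strcmp() returns 0 only when both strings are identical, while strstr() returns a non-null pointer when the second string occurs anywhere in the first. A small sketch, with line_equals and line_contains as made-up wrapper names:

```cpp
#include <cstring>

// Hypothetical wrapper: true only when the whole line is exactly `word`.
bool line_equals(const char* line, const char* word)
{
    return std::strcmp(line, word) == 0;        // 0 means "identical"
}

// Hypothetical wrapper: true when `word` appears anywhere in the line.
bool line_contains(const char* line, const char* word)
{
    return std::strstr(line, word) != nullptr;  // null means "not found"
}
```

For the line "hello james", line_equals(line, "hello") is false but line_contains(line, "hello") is true. That is also why the != 0 test on strcmp() seemed to "work": it is true for every line that is not exactly "hello", whether or not the word is in it.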

Even with strstr, you may want to move along the string after a match, so that more than one occurrence per line gets counted. Consider an input file like this:

hello james hello james hello hello hello james james hello james

#include <iostream>
#include <fstream>
#include <cstring>
using namespace std;

int main()
{
   char ch[1000];
   int v_word = 0;
   int c_word = 0;
   ifstream file("test.txt");

   if ( file.is_open() )
   {
      while ( file.getline(ch, sizeof ch) ) /* never read more than the buffer holds */
      {
         char *match = ch;
         cout << "ch : " << ch << '\n';
         while ( (match = strstr(match, "hello")) != 0 )
         {
            cout << "<<hello>> " << match << '\n';
            v_word++;
            match += 5; /* the length of "hello" */
         }

         match = ch;
         while ( (match = strstr(match, "james")) != 0 )
         {
            cout << "<<james>> " << match << '\n';
            c_word++;
            match += 5; /* the length of "james" */
         }
      }
   }

   cout<<v_word<<endl;
   cout<<c_word<<endl;
   file.close();
   return 0;
}

/* my output
ch : hello james hello james hello hello hello james james hello james
<<hello>> hello james hello james hello hello hello james james hello james
<<hello>> hello james hello hello hello james james hello james
<<hello>> hello hello hello james james hello james
<<hello>> hello hello james james hello james
<<hello>> hello james james hello james
<<hello>> hello james
<<james>> james hello james hello hello hello james james hello james
<<james>> james hello hello hello james james hello james
<<james>> james james hello james
<<james>> james hello james
<<james>> james
6
5
*/

I am getting a better understanding of this now, thanks again for your help Dave. :)

Hey guys, I am currently using the following code:

if(strstr(ch,"hello") != 0)

and the file contains "hellojames", but it gives the hello word count as 1.

I was just wondering why this happens. Shouldn't there have to be whitespace around the word for it to be counted?

Thanks.

Well, "hellojames" definitely contains "hello", so the function faithfully tells you it found it. If you want to look for "hello ", then you may do so.

Read my post #4 verrrrrry sloooooooooooooowly and careeeeeeeefullllllllllly. I already covered that topic. Pay close attention to what I said about strstr() and the necessity to check for whitespace both before and after the string you want to find.
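As an alternative to checking the characters around each strstr() match by hand, the stream can split on whitespace itself: operator>> into a std::string reads one whitespace-delimited token at a time, so comparing whole tokens never matches "hellojames". This is a different technique from the one discussed above, sketched with made-up helper names; punctuation stays attached to tokens, so "hello," would not equal "hello".

```cpp
#include <cstddef>
#include <istream>
#include <sstream>
#include <string>

// Hypothetical helper: counts whitespace-delimited tokens equal to `word`.
std::size_t count_word_tokens(std::istream& in, const std::string& word)
{
    std::string token;
    std::size_t count = 0;
    while (in >> token)                 // >> skips whitespace and reads one token
        if (token == word)
            ++count;
    return count;
}

// Convenience overload for counting within a string instead of a file.
std::size_t count_word_tokens(const std::string& text, const std::string& word)
{
    std::istringstream in(text);
    return count_word_tokens(in, word);
}
```

Passing an open ifstream to the first overload counts across the whole file, regardless of line length.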
