
File array question

Hey guys,

I have the following code, but for some reason it does not work with bigger txt files. Is this because of the size of the array?

And if so is there a simple way to solve the problem?

Thanks in advance for any help.

char ch[1000];
int v_word = 0;
int c_word = 0;
ifstream file("c:/test.txt");

   if (file.is_open()) 
     {
	while (file.getline(ch, 1000))
		if(strcmp(ch, "int") == 0)
		  v_word++;
		
     }
	
cout<<v_word<<endl;
cout<<c_word<<endl;
file.close();
Jon182
Junior Poster in Training
91 posts since Jul 2005
Reputation Points: 10
Solved Threads: 0
 

It should work with text files of any size. Are any lines longer than 1,000 bytes? What is the purpose of checking whether one of the lines equals "int"? Is there anything else following "int" on that line?

This is a C++ program -- why don't you use std::string instead of C-style arrays?

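#include <fstream>
#include <string>
...
std::string line;
int v_word = 0;
int c_word = 0;
std::ifstream file("c:/test.txt");

   if (file.is_open()) 
     {
	// std::getline grows the string as needed, so line length is not limited
	while (std::getline(file, line))
		// find() returns std::string::npos when "int" is not in the line
		if (line.find("int") != std::string::npos)
		  v_word++;
     }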
Ancient Dragon
Retired & Loving It
Team Colleague
30,049 posts since Aug 2005
Reputation Points: 5,662
Solved Threads: 2,343
 

Hi, thanks for your reply. The purpose of the "int" check was just so I could count how many times it appears in the file; my full version also searches for other words. Could this be why it doesn't work on larger files?
I will look into strings. :)

Jon182
 

What do you mean by "it doesn't work"? What doesn't work? strcmp() compares the entire line with the text you put in the literal, which in your example is "int". So if the line is "int some more stuff here", strcmp() will return non-zero because the strings are not equal. If you only want to compare the first three characters, then use strncmp(), or if "int" can appear anywhere in the line, use strstr(). You also have to be careful with the comparison because strstr() will find "int" inside "intabc" and "abcintabc" just as it does in " int ". So if strstr() finds the text "int", the program should check the characters immediately before and after the match. Unless there is white space (or the start or end of the line) on both sides, strstr() did not find a whole word.

Ancient Dragon
 

Hi thanks for your reply. I have tried to use strstr() but I keep getting the wrong word count.

For example all the txt file contains is "hello james" and it gives the count 0 for both words.

ifstream file("c:\\test.txt");

	if (file.is_open()) {
		while (file.getline(ch, 10000))
		{
			if(strstr(ch, "hello") == 0)
				v_word++;

			if(strstr(ch, "james") == 0)
				c_word++;
		}
	}
Jon182
 
if(strstr(ch, "hello") == 0)


The return value of strstr is a pointer to the match or a NULL pointer when no match was found. Here you are checking for a NULL pointer, meaning when no match was found, so you would be incrementing your counter at the wrong time.

Try this to find at least one occurrence per line.

if(strstr(ch, "hello") != 0)
Dave Sinkula
long time no c
Team Colleague
5,058 posts since Apr 2004
Reputation Points: 2,780
Solved Threads: 314
 

Hi Dave, Thank you for your reply.

Is what you said also true for the strcmp() function?

Thanks.

Jon182
 

strcmp() returns 0 if the two strings are equal, or non-zero if they differ.

Ancient Dragon
 

That's what I thought, but when I was testing it, if I used the code:

if(strcmp(ch, "hello") == 0)


I got the wrong count for the number of times the word occurred.

but if I used the following code for strcmp()

if(strcmp(ch, "hello") != 0)


it works. But with strcmp() shouldn't it have worked with the first piece of code?

Jon182
 

If the whole line is just "hello", then doing a strcmp against "hello" will be 0. Otherwise it won't be ("hello james" is not the same as "hello").

Even with strstr , you may want to move down the string after a match. Consider an input file like this:

hello james hello james hello hello hello james james hello james
#include <iostream>
#include <fstream>
#include <cstring>
using namespace std;

int main()
{
   char ch[1000];
   int v_word = 0;
   int c_word = 0;
   ifstream file("test.txt");

   if ( file.is_open() )
   {
      /* the limit must match the buffer size, or getline can overrun ch */
      while ( file.getline(ch, sizeof ch) )
      {
         char *match = ch;
         cout << "ch : " << ch << '\n';
         while ( (match = strstr(match, "hello")) != 0 )
         {
            cout << "<<hello>> " << match << '\n';
            v_word++;
            match += 5; /* the length of "hello" */
         }

         match = ch;
         while ( (match = strstr(match, "james")) != 0 )
         {
            cout << "<<james>> " << match << '\n';
            c_word++;
            match += 5; /* the length of "james" */
         }
      }
   }

   cout<<v_word<<endl;
   cout<<c_word<<endl;
   file.close();
   return 0;
}

/* my output
ch : hello james hello james hello hello hello james james hello james
<<hello>> hello james hello james hello hello hello james james hello james
<<hello>> hello james hello hello hello james james hello james
<<hello>> hello hello hello james james hello james
<<hello>> hello hello james james hello james
<<hello>> hello james james hello james
<<hello>> hello james
<<james>> james hello james hello hello hello james james hello james
<<james>> james hello hello hello james james hello james
<<james>> james james hello james
<<james>> james hello james
<<james>> james
6
5
*/
Dave Sinkula
 

I am getting a better understanding of this now, thanks again for your help Dave. :)

Jon182
 

Hey guys, I have currently used the following code.

if(strstr(ch,"hello") != 0)


and the file contains "hellojames" but it gives the hello word count as 1.

I was just wondering why this happens. Shouldn't there have to be white space around the word for it to be counted?

Thanks.

Jon182
 

Well, "hellojames" definitely contains "hello", so the function faithfully tells you it found it. If you want to look for "hello ", then you may do so.

Dave Sinkula
 

read my post #4 verrrrrry sloooooooooooooowly and careeeeeeeefullllllllllly. I already covered that topic. pay close attention to what I said about strstr() and the necessity to check for whitespace both before and after the string you want to find.

Ancient Dragon
 

Thanks.

Jon182
 
