Since you have no qualms about reading the file multiple times, just go through it once more character-by-character to grab the whitespace and non-whitespace counts. Then you'd have three loops:
1. Read line-by-line and increment the line count
2. Read word-by-word and increment the word count
3. Read character-by-character and increment the space/non-space count
The typical approach, though, is to read the file character by character and use a state machine to handle the trickier parts (i.e. words). This way you only read through the file once.
>while (!file.eof())
This is a bug waiting to happen. eof() only returns true after you've tried and failed to read from the stream, which usually means the last record will be processed twice. I strongly recommend using your input method as the condition:
while (getline(file, s1)) {
while (file >> s1) {
Both getline and operator>> return a reference to the stream object, which has a conversion path to bool for checking if the state is good or not.
>ch_spaces += s1.size();
This doesn't do what you think. The length of s1 includes both whitespace and non-whitespace. You need to further break it down.
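One way to break it down, as a rough sketch: walk the string and count the whitespace characters separately. The helper name `count_whitespace` is hypothetical, not something from your program.

```cpp
#include <cctype>
#include <cstddef>
#include <string>

// Hypothetical helper: number of whitespace characters in one string.
std::size_t count_whitespace(const std::string& s)
{
    std::size_t n = 0;
    // Iterating as unsigned char keeps the argument to isspace() in range.
    for (unsigned char ch : s)
        if (std::isspace(ch))
            ++n;
    return n;
}
```

With that, s1.size() - count_whitespace(s1) gives the non-whitespace count for the line. Keep in mind that getline discards the '\n' it stops on, so newlines have to be counted separately if you want them included in the whitespace total.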
Narue
Bad Cop
15,460 posts since Sep 2004
Reputation Points: 6,464
Solved Threads: 1,401
There are several ways. Since the space char is a character, notated like this: ' ', you can look for it directly. Otherwise you can use a function related to isalpha() called isspace(). However, isspace() looks for all whitespace characters, which includes the space character as well as the tab character, the newline char, etc.
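A small sketch of the two options side by side; the function names are just placeholders for illustration:

```cpp
#include <cctype>

// Matches only the space character itself.
bool is_plain_space(char ch)
{
    return ch == ' ';
}

// Matches any whitespace: space, tab, newline, carriage return, etc.
bool is_any_whitespace(char ch)
{
    // isspace() expects a value representable as unsigned char (or EOF).
    return std::isspace(static_cast<unsigned char>(ch)) != 0;
}
```

Which one you want depends on whether tabs and newlines should count as "spaces" in your totals.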
Lerner
Nearly a Posting Maven
2,382 posts since Jul 2005
Reputation Points: 739
Solved Threads: 396
Once you have a working version of the code, I'll also post an alternative using my suggestion.
Narue
>I have used your suggested method of controlling the while loop in this version of my program, but I still don't understand the difference between the two methods.
Your original method has a bug and my method corrects it. The bug is that on the very last iteration of the loop, eof() will return false even though there's nothing left to read. Then your input method will fail, which leaves the input variable with the same contents as the previous iteration. So you process the same input two times.
>I think you misunderstood my use of this statement
Yes, I misunderstood your use of ch_nspaces. I read it as the number of whitespace characters rather than the number of total characters minus whitespace.
As promised, here is my version:
#include <cctype>
#include <fstream>
#include <iostream>

int main()
{
    std::ifstream in("test.txt");

    if (in) {
        int total = 0, spaces = 0, words = 0, lines = 0;
        bool inword = false;
        char ch = '\0';

        while (in.get(ch)) {
            ++total; // All characters including whitespace

            if (ch == '\n')
                ++lines;

            // Cast so that negative char values are safe to pass to isspace()
            if (std::isspace(static_cast<unsigned char>(ch))) {
                ++spaces;
                inword = false;
            }
            else if (!inword) {
                ++words; // First non-whitespace character of a new word
                inword = true;
            }
        }

        // Add the last line if it doesn't end with '\n'
        if (total > 0 && ch != '\n')
            ++lines;

        std::cout<<"Total characters ("<< total <<")\n"
            <<"Non-spaces ("<< total - spaces <<")\n"
            <<"Words ("<< words <<")\n"
            <<"Lines ("<< lines <<")\n";
    }
}
Narue