Hi,

I need to make a program that counts the frequency of characters in a file, then outputs how many times each character appears in it. I am trying to make a class for it, but I am having difficulties figuring out how to do the definitions for each method. I need some suggestions on how to get started.

Thank you.

#include <iostream>
#include <fstream>
#include <cctype>

using namespace std;

class charFrequency
{
public:
	char getCharacter();
	void setCharacter(char character);
	long getCount();
	void setCount(long count);
	void increment();

private:
	char character;
	long count;
};


int main()
{
	charFrequency myCount;						
	ifstream infile;			//input file stream variable
	ofstream outfile;			//output file stream variable

		// check for file
	infile.open ("textIn.txt");

	if (!infile)
	{
		cout << "Cannot open the input file." << endl;
		
		return 1;
	}

	outfile.open("textOut.txt");

	myCount.setCount(127);
	myCount.setCharacter('A');
	myCount.getCount();
	myCount.getCharacter();
	myCount.increment();





	system ("pause");
	return 0;
}


char charFrequency::getCharacter()
{
	return character;
}


long charFrequency::getCount()
{
	
}

void charFrequency::setCharacter(char character)
{
	
}

void charFrequency::setCount(long count)
{
	 
}

void charFrequency::increment()
{
	character++;
}

Not sure what your approach is, but you need to either change the data members in your class or have more class objects. Right now you have one object, and it stores one character and one long. You need one character and one long for each ASCII character (128 of them), so you can either have a class that stores two arrays of size 128, or keep your class the same and have an array of 128 class objects. Regardless, you need to end up with 128 character variables and 128 long variables.

Let's say you have 128 class objects. Stick them in an array:

charFrequency frequencies[128];

Initialize them to the proper letter and the proper count (0).

Now suppose you read in an 'a'. Increment the count:

char inputChar;
inputChar = infile.get();
frequencies[inputChar].increment();

Edited 6 Years Ago by VernonDozier: n/a

I will try that. Thank you

Yeah, your approach is definitely a little off so far. Do you want your program to count how many times a letter (that you specify) appears in a certain string? For example: how many times does the character 'p' appear in the string "apples"? And you want your class to return 2? If so, you should either make the constructor accept the search character and the string to be searched, or make a method that accepts those as parameters and returns the count (or, for example, a negative number if the character is not found).

If you can, you should really use something like std::map for this. For example, doing something like this is a little better:

#include <map>
#include <iostream>
#include <cctype>
#include <fstream>
#include <string>

using namespace std;

class AsciiCharacterCount{
public:
	typedef std::map<char,long>::const_iterator constIterator;
private:
	std::map<char,long> asciiFreq;	// maps each letter seen to its count
public:
	// Count a single character; non-letters are ignored.
	void update(char ch){
		// Cast first: passing a negative char to isalpha is undefined behavior.
		if(isalpha(static_cast<unsigned char>(ch))) asciiFreq[ch]++;
	}
	// Count every character in a string.
	void update(const std::string& str){
		for(std::string::size_type i = 0; i != str.size(); ++i) update(str[i]);
	}
	// Return how often ch was seen, or 0 if it never appeared.
	long frequency(char ch)const{
		constIterator itr = asciiFreq.find(ch);
		if(itr != asciiFreq.end()) return itr->second;
		else return 0;
	}
	constIterator begin()const{ return asciiFreq.begin(); }
	constIterator end()const{ return asciiFreq.end(); }
};

int main(){
	AsciiCharacterCount freqCounter;
	freqCounter.update("Analyze each of these letters");

	// Print each letter and its count, in key order.
	AsciiCharacterCount::constIterator itr = freqCounter.begin();
	while(itr != freqCounter.end()){
		cout << "'" << itr->first << "'" << " frequency = " << itr->second << endl;
		++itr;
	}
	return 0;
}

Why should he use a std::map for this? Why is it a little better?

Ask yourself what his professor is getting the kid to achieve from this?

Sure a std::map would work but the beauty comes from achieving the same thing with much more rudimentary tools... like classes and arrays.

Huh? A map is meant to map one piece of data to another. Thus mapping a character to its frequency is perfectly valid and better than using arrays. It's more flexible, safe, and robust, and depending on the situation it would use less memory. I said the code is a little better because it can be improved, but it is still better than using raw arrays.

>>Sure a std::map would work but the beauty comes from achieving the same thing with much more rudimentary tools... like classes and arrays

I beg to differ. I would rather use the right tool for the job than try to hack up things using the basic stuff.

LOL firstPerson you're at it again.

Stop and engage your grey matter before your EGO flares up and takes over.

I never said using a map was invalid...

>>I beg to differ. I would rather use the right tool for the job than try to hack up things using the basic stuff.

Yes YOU would, and YOU would probably be right. But this isn't about you.

Look at the OP's code. Look at his number of posts. Do you think he is advanced enough to understand your code? This is something a whizz kid would come up with after using C++ for a long time. All your code does is present itself as a cerebral, mind-numbing exercise for the OP. What do you think his professor is getting the student to achieve here? This task can be achieved simply using just arrays and procedural logic. A student must learn to walk before he can run. Comprende?

Edited 6 Years Ago by iamthwee: n/a

Hahaha, I don't speak Spanish. Anyway, he did not mention anything about this being homework assigned by his professor. For all I know he might be doing this for practice, hence my suggesting something better. And I don't think my code was hard to understand; in fact, there shouldn't be any trouble understanding it unless one has never seen std::map before.

>>I never said using a map was invalid...
Well...."Why should he use a std::map for this?". From that I got, "he shouldn't use map", which is a little variation of "using map here is bad". Maybe just misinterpretation.


Anyway, my code was just a suggestion. The OP does not have to follow it. That's why I asked the OP if he could use std::map. I wasn't being forceful with him, just suggestive.

If the code does not help OP now, it will surely help others when they stumble upon this thread. Or at least make them think a little about their approach.

P.S.: In general, the number of posts one has does not directly relate to his/her intellect.

Edited 6 Years Ago by firstPerson: n/a

>> Well...."Why should he use a std::map for this?". From that I got, "he shouldn't use map", which is a little variation of "using map here is bad". Maybe just misinterpretation.


IMO using a map here IS bad, even if one is proficient with maps. Why would I want to use a map to get from a character to a long when I can use a plain old array index?

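#include <iostream>
#include <string>
using namespace std;

int main()
{
    long counts[128] = { 0 };   // one counter per ASCII code, all zero

    // Assumes plain ASCII text, so every character is a valid index (0 to 127).
    string sentence = "I love pizza.";
    for (string::size_type i = 0; i < sentence.length(); i++)
    {
        counts[sentence[i]]++;
    }

    char aChar;   // also assumed to be plain ASCII
    cout << "Enter a character : ";
    cin >> aChar;
    cout << aChar << " occurs " << counts[aChar] << " times.\n";
    return 0;
}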

No maps, no iterators, and in fact, no classes. Why complicate things? I assume the OP is learning classes, which is why he has one. I don't think one is necessary. Anyway, a map would work just fine, but again, why not just find the count directly from the array index?

Edited 6 Years Ago by VernonDozier: typo

>>why not just find the count directly from the array index

I guess I was thinking in terms of the "design for change" principle. But one reason I can give you is to save memory. For example, the set {a...z} has only 26 characters, so your array would need a length of at least 26. There is some space wasted there if the input doesn't use all of the characters. But by using map, its length will be only the number of different characters analyzed.

So with that in mind, I realize that an array of length 26 isn't much these days. But as I said, I was thinking of design for change, thus if one wanted to count the frequency of say numbers instead of characters, then it's a different ball game, because depending on the max number allowed, the array can waste a lot more memory than a map. The same goes for counting frequency for Unicode letters. And what happens if the standard decides to change the "int" value of a character? There may or may not be a bug if that happens.

Another reason I can think of is that using arrays is more error prone, as one can easily index out of bounds. What happens if the program reads in, say, a weird value? It could lead to a bug or not. But for its simplicity, and guessing that the OP is still learning, it's probably an OK design to use arrays, as he probably doesn't have to worry about future changes.

>> But by using map, its length will be only the number of different characters analyzed.

I don't know much about the internal storage of maps, but there's gotta be some overhead (number of elements, pointers, etc., plus you have to store both the key and the value as opposed to just the value). So if you have a big text file, or even a fairly small one that uses the majority of the letters/characters, but not all, I imagine you're using more memory in a map than an array. If you can't spare space for 26 or 128 longs, you're probably on an embedded system and not able to use C++ at all anyway.

>> thus if one wanted to count the frequency of say numbers instead of characters, then its a different ball game.

That indeed would be a very different ballgame.

>> The same goes for counting frequency for Unicode letters.

Also a different ballgame. I'm assuming ASCII or at least Unicode with only the ASCII characters. Valid assumption? Only the OP knows. I know very little about Unicode. I thought they used wide characters, not characters anyway. Could be wrong. Guess I need to get less ASCII-centric and stop making these assumptions. :)


>> And what happens if the standard decides to change the "int" value of a character?

Such a different ballgame that it won't be C++ anymore. They aren't going to change this. Characters are going to be a one byte integral type. I can't imagine how they could possibly ever change this. The number of programs that would have to be rewritten (just about every program ever written) would be staggering.


>> Another reason I can think of is that using arrays, it is more error prone, as one can easily reach out of bounds exception. What happens if the user reads in say a weird value?

If it's an ASCII text file, there are no weird values. Values will be from 0 to 127, inclusive. If you want to handle weird values, you can either do a little error checking or make it an array of 256 to handle any weird values. But I guess a byte need not be 8 bits, so a "weird" value could still overflow, so error checking it is.

Anyway, assuming it's a fairly short input file, using a map wouldn't have too bad of a performance disadvantage. If you had a BIG file, I imagine that using a map would cause the program to be far slower.


To the OP, hopefully you don't feel this is a thread hijack. I try not to do that, but sometimes threads take on a life of their own. Feel free to chime in again and get it back on track. ;)

Edited 6 Years Ago by VernonDozier: n/a

Thanks for the interesting discussions. This is for a Data Structure class, and we basically have to do the same program with different data structures. So now I have to use this class with a linked list as a structure. And I took my C++ class months ago, and my mistake was that I haven't touched C++ since then. But I appreciate the posts, I learned something new.

Thanks

>>So if you have a big text file, or even a fairly small one that uses the majority of the letters/characters, but not all, I imagine you're using more memory in a map than an array.

The thing is, it would not. Imagine you wanted to count the frequency of some large input where the input has a range of values from, say, 0 to 2^32. Using an array, its size would have to be at least 2^32, because you have to allow for the possibility of every value being read. A map, by contrast, only uses as much memory as it needs: you only pay for the values you actually read, not for the values you might read.

Edited 6 Years Ago by firstPerson: n/a

>> Imagine you wanted to count the frequency of some large input where the input has a range of values from say 0-2^32.

Right. But since we're dealing with characters, that's not the input range. ASCII's range is 0 to 127. If you're using something else, as mentioned, it's a whole different ballgame.

Cool cool, then we're on the same page. To justify myself again, I was thinking in terms of design for change, hence the usage of map.
