This is a conceptual question to help with a program I am developing.
It takes two lists of words from a file and compares them in parallel to see if they are anagrams.
My question is: how do I store these strings in a one-dimensional array mapped to ASCII codes? I don't grasp the mapping. If each individual string is in the array, how are the individual letters compared?
Would the array look like this, and only be filled with each word at the time of comparison? I have ten pairs to compare.

o l
w o
l w
How does it know which ASCII number to map to in the 26-element array? I hope this is clear; I can answer any questions that will help. I was going to have an array indexed 0-25, but I don't know how to relate each word to ASCII.
Thanks for any ideas.

Here's how I would go about it...

string word1 = "orchestra", word2 = "carthorse";

	set<char> s;

	for(int i = 0; i < word1.size(); i++){
		s.insert(word1[i]);
	}

	bool isAnagram = true;

	for(int i = 0; i < word2.size() && isAnagram; i++){
		if(s.find(word2[i]) == s.end())
			isAnagram = false;
	}

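The set container in this example will not allow duplicate entries. This is not strictly necessary, but it's a little more efficient. It is also typically implemented as a balanced binary tree, which makes searching faster; that helps if you have a lot of word pairs to check. Note that this test only confirms that every letter of word2 occurs somewhere in word1. It does not compare how many times each letter appears, so a full anagram check still needs per-letter counts.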

On the other hand I just had another idea which might be even faster...

string word1 = "orchestra", word2 = "carthorse";

	vector<bool> v(26, false);	// 26 flags, one per letter, all initially false

	for(int i = 0; i < word1.size(); i++){
		v[word1[i] - 'a'] = true;
	}

	bool isAnagram = true;

	for(int i = 0; i < word2.size() && isAnagram; i++){
		isAnagram = v[word2[i] - 'a'];
	}

The code above assumes that each word is restricted to lowercase letters. Like the set version, it only records whether a letter is present, not how many times it occurs.

And I just realized that I hadn't really answered your question ( I blame it on the fact that I have not yet had my full daily dose of caffeine).

It seems like the second example is closer to the scenario you were trying to describe in your post. If you look, each element in the vector is accessed using the index operator and an expression like: word1[i] - 'a'.

The reason why this works is that each character (char data type) is really a unique one-byte integer value which your computer uses to look up the appropriate symbol in its ASCII table.

Thankfully, in ASCII the lowercase letters occur consecutively and in order (the uppercase letters do too, but the two ranges are not adjacent to each other), so the difference between 'b' and 'a' is only 1. Hence, any lowercase character minus 'a' equals its distance from 'a' within the alphabet, and this makes a great indexing strategy for a 26-element array where each element represents a different letter.
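To make the subtraction concrete, here is a minimal sketch of that mapping. The function name letterIndex is made up for illustration, and the code assumes an ASCII-compatible character set and lowercase input.

```cpp
#include <cassert>

// Hypothetical helper (not from the original posts): maps a lowercase
// letter to its position in the alphabet, so 'a' gives 0 and 'z' gives 25.
// Assumes an ASCII-compatible character set, where 'a'..'z' are consecutive.
int letterIndex(char c) {
    assert(c >= 'a' && c <= 'z');  // this sketch handles lowercase letters only
    return c - 'a';                // e.g. 'c' is 99 and 'a' is 97, so 99 - 97 = 2
}
```

With this, letterIndex('a') is 0 and letterIndex('z') is 25, which are exactly the valid indexes of a 26-element array.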

Sumyungi, your help is greatly appreciated. I can't use vectors, so I am going to post the assignment's requirements. Your ideas shed a lot of light.

Two one-dimensional arrays of integers must be used to store the frequency counts of each character in each word. Read and store the words in a pair using C++ string class objects. You are required to use these standard one-dimensional arrays and C++ string class objects in this project. You may not use any two-dimensional arrays, structs or user-defined classes on this project. You also may not use the C++ Standard Template Library classes or algorithms with the exception of using the C++ string class to store and work with the word string data in this program

You need to make use of the ASCII codes for characters. The lower case alphabet is represented by the integers 97 through 122, coding 'a' through 'z'. If you store your frequency counts in an array with 26 elements, the indexes will be 0 through 25. Thus you must set up a mapping from characters to your array of counters, making use of the following table:

ALPHA    ASCII CODE    FREQUENCY COUNTER AT ARRAY INDEX
========================================================
'a'      97            0
'b'      98            1
'c'      99            2
...
'z'      122           25

I don't quite understand how to use the frequency array to process words. If I have word one and word two, how do their letters get paired with the correct index in the array?

Could anyone explain this in more detail? I see that lowercase 'a' has an ASCII code, but how does that get connected to my word? Is the array made up of twenty-six letters that get compared for equality with the word one letter at a time, and then that letter is counted?
Is this at all right? Thanks so much for your patience.

Characters are essentially nothing more than 1-byte integers. So if the letter you were counting were 'a', what operation could you do to yield the index 0? How about for 'b' to yield 1? You'll kick yourself once you get it.

So the array will hold integers and be indexed by the above method. March along your string; each time you encounter a letter, do the transformation to get the proper index and increment that particular bin. So if you encounter 'c', you'll increment bin 2 (zero-based). Then on to the next letter. Do a quick sketch on a piece of paper with a 4- or 5-letter word. Draw where each of the letters ends up.
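One possible way to turn that sketch into code, staying within the stated rules (two one-dimensional int arrays, std::string, nothing else from the library), is shown below. The names ALPHABET_SIZE, countLetters and isAnagram are made up for illustration, and skipping anything that is not a lowercase letter is an assumption rather than part of the assignment.

```cpp
#include <string>

const int ALPHABET_SIZE = 26;  // hypothetical name for the 26 counters

// Fills counts[0..25] with how many times each lowercase letter occurs in word.
// Assumption: characters outside 'a'..'z' are simply skipped.
void countLetters(const std::string& word, int counts[]) {
    for (int i = 0; i < ALPHABET_SIZE; i++) {
        counts[i] = 0;                       // start every bin at zero
    }
    for (std::string::size_type i = 0; i < word.size(); i++) {
        char c = word[i];
        if (c >= 'a' && c <= 'z') {
            counts[c - 'a']++;               // 'a' -> bin 0, 'c' -> bin 2, ...
        }
    }
}

// Two words are anagrams when all 26 letter counts match.
bool isAnagram(const std::string& first, const std::string& second) {
    int firstCounts[ALPHABET_SIZE];
    int secondCounts[ALPHABET_SIZE];
    countLetters(first, firstCounts);
    countLetters(second, secondCounts);
    for (int i = 0; i < ALPHABET_SIZE; i++) {
        if (firstCounts[i] != secondCounts[i]) {
            return false;                    // some letter occurs a different number of times
        }
    }
    return true;
}
```

For example, isAnagram("orchestra", "carthorse") returns true, while isAnagram("aab", "abb") returns false because the counts for 'a' and 'b' differ. The words themselves never go into the arrays; only the counts do, and the arrays are refilled for each pair.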
