954,504 Members — Technology Publication meets Social Media
Username:
Password:
Lost login information?
Have something to say? Contribute New Article Reply to this Article

Translator using Hashtable

Hi,
I am trying to use a hash table to sort the two string words from an input file ,display words and again put them back to an output file.

so if this is my input in input.txt:

thank Merci
yes oui
hello bonjour

the sorted output in output.txt will be:

hello bonjour
thank Merci
yes oui

I was wondering if you can help me start it.Thanks!

milan2011
Light Poster
37 posts since May 2011
Reputation Points: 10
Solved Threads: 0
 

>>"to use a hash table to sort"

I'm not an expert on hashing, but this statement seems to make little sense to me. Hash tables are rarely used to sort anything (which requires a hash function that respects the ordering of the keys, which typically makes it very difficult to also make the hash function respect the distribution of keys (especially word-keys)). Overall, there will be so many collisions (or too many buckets) that it won't be worth it. Use a std::map or std::set instead. Since you want to map English words to French words, I suggest you use std::map. Like so, for example:

#include <map>
#include <fstream>
#include <string>

int main() {
  std::map< std::string, std::string > dictionary;
  std::ifstream inFile("input.txt");
  std::string temp;
  while((inFile >> temp) && 
        (inFile >> dictionary[temp]))
    ;
  //.. then output as you wish, 'dictionary' now holds a sorted list of word-pairs.
  return 0;
};
mike_2000_17
Posting Virtuoso
Moderator
2,139 posts since Jul 2010
Reputation Points: 1,634
Solved Threads: 457
 

what you said makes sense but I dont want to use and want the time efficiency be O(1). So i wrote a code but the program breaks down when i add the french word.I would appreciate it if you help me to to fix the addDef and maybe some suggestions for search Def.thanks:)

#include <cstddef>
#include <cassert>
#include <iostream>
#include <string>
using namespace std;

using std::cout;
using std::cin;
struct Translator {
       string english;
       string french;
       };
       Translator *p;
       Translator *q[99];

void mainMenu();
int hashing (string s);
void  addDef();
void searchDef();
void createNode(string a, string b);
int quit ();
     
int main ()
{
    mainMenu ();
}

void mainMenu ()
{
    int choice;
	
	cout<<"Welcome to English-French translator"<<endl;
	cout<<"What do You want to do ?"<<endl;
	cout<<"1. add Definition to Dictionary"<<endl;
	cout<<"2. Search for Definition"<<endl;
	cout<<"3. Quit"<<endl;
	cout<<"Enter the number you want :  ";
	cin>>choice;
	
	if( choice == 1)
	{   

		addDef();

      
	}
	else if (choice == 2)
	{
		searchDef();
	}
	else if ( choice == 3 )
	{
		quit ();
	}
	else
	{
		cout<<"Input Invalid !"<<endl<<endl;
		mainMenu();
	}
}

void createNode (string a, string b)
{
     p = (Translator*)malloc(sizeof(Translator));
     if (p!=NULL)
     {
        p->english = a;
        p->french  = b;
     }
     else
     {
         cout<<"Memory FULL";
     }
}
     
void addDef ()
{
     string a;
     string b;
     int hashVal;
     
            cout<<"Enter Word in English :"<<endl;
            cin>>a;
            cout<<"Enter the transelation in French :"<<endl;
            cin>>b;
            createNode(a,b);
            hashVal = hashing (p->english);
            q[hashVal] = p;
}

int hashing (string s)
{
    int total = 0;
    int z = 0;
    char cname [99];
    string temp = s;
    
    for ( int i = 0; i <= 99; i++)
    {
    cname[i] =  temp.at(i);
    }
    
    for (int j = 0; j <= 99; j++)
    {
        total = total + (int)cname[j];
    }
    
    z = total % 99;
    return z;
}

int quit ()
{
    return 0;
}

void searchDef(){}
milan2011
Light Poster
37 posts since May 2011
Reputation Points: 10
Solved Threads: 0
 

There are a couple of things that need to be fixed.

(1) This is a problem -> p = (Translator*)malloc(sizeof(Translator));

because  malloc  doesn't call  constructors.  This means  that  the  underlying
string  objects  are not  initialized  properly  (perhaps some  internal  pointers
are not set?). Therefore, the program crashes when you try to use them below.

The solution to this is to simply use new instead of malloc -> p = new Translator;


(2) You also have an out-of-bounds problem in your hash function. How do you
know that your  string's  size will  always  be 99? It  should  look  more  like this:

int hashing (string s)
{
    int total = 0;

    for (int j = 0; j < s.size(); j++)
        total = total + (int)s[j];

    return total % 99;
}

Since you've made it this far by yourself, I'll give you a simple implementation for
searchDef, because there is another important problem that I want to point out.

Here's your code slightly modified and with searchDef added:

#include <iostream>
#include <string>
#include <cstdlib>

using namespace std;

struct Translator
{
    string english;
    string french;
};

Translator * dictionary[99] = { 0 };

void mainMenu();
int hash(string s);
void addDef();
void searchDef();
Translator * createNode(string a, string b);
void quit ();

int main()
{
    while (true) mainMenu();
}

void mainMenu()
{
    int choice;

    cout
        << "Welcome to English-French translator\n"
        << "What do You want to do ?\n"
        << "1. add Definition to Dictionary\n"
        << "2. Search for Definition\n"
        << "3. Quit\n"
        << "Enter the number you want :  ";

    cin >> choice;

    switch(choice)
    {
         case 1: addDef();        break;
         case 2: searchDef();     break;
         case 3: quit();          break;
        default: cout << "wrong choice";
    }
}

Translator * createNode (string a, string b)
{
    Translator * new_node = new Translator;

    if (new_node != NULL)
    {
        new_node->english = a;
        new_node->french  = b;
    }
    else
    {
        cout << "Memory FULL";
    }

    return new_node;
}

void addDef ()
{
    string a;
    string b;

    int hashVal;

    cout << "Enter Word in English : ";
    cin  >> a;

    cout << "Enter the transelation in French : ";
    cin  >> b;

    Translator * new_node = createNode(a,b);

    hashVal = hash(a);

    dictionary[hashVal] = new_node;
}

void searchDef()
{
    string a;
    string b;

    int hashVal;

    cout << "Enter Word in English : ";
    cin  >> a;

    hashVal = hash(a);

    if (dictionary[hashVal] && dictionary[hashVal]->english == a)
        cout << dictionary[hashVal]->french << endl;
    else
        cout << "couldn't find word..." << endl;
}

int hash(string s)
{
    int total = 0;

    for (int j = 0; j < s.size(); j++)
        total = total + (int)s[j];

    return total % 99;
}

void quit() { exit(0); }

Try entering hi -> salut and blue -> bleue . Then, try to find salut and bleue using hi and blue . It works fine, right?

Now, try  entering  dog -> chien and god -> dieu  (in that order).  Then, try to find chien using dog .  It doesn't work, right?

(3) The problem here is that you do nothing to take care of hash collisions. dog and god have that
same hash value. As a result,  entering  god -> dieu
overwrites dog -> chien and  vice  versa.

A simple solution to this is to implement your hashmap as an array of std::maps. Here's an
example of how you can do that -> http://www.cplusplus.com/forum/beginner/31394/#msg170354

[EDIT] (Note that I forgot to #include <cstdlib> in the above example...) [/EDIT]

P.S. Using the above approach for implementing you hashmap will also take care of your memory leaks :P

m4ster_r0shi
Posting Whiz in Training
230 posts since Mar 2010
Reputation Points: 154
Solved Threads: 31
 

Thanks alot! the code helped alot but it still gives me an error.
error C2872: 'hash' : ambiguous symbol
could you please help me find whats the problem

thanks again :)

milan2011
Light Poster
37 posts since May 2011
Reputation Points: 10
Solved Threads: 0
 

Oh, are you using visual studio? Try replacing hash with hashing in the code above.

Or... What did you mean? Which code doesn't work? The modified version of your code? Or the example in the link?

m4ster_r0shi
Posting Whiz in Training
230 posts since Mar 2010
Reputation Points: 154
Solved Threads: 31
 

yes,now it says:

error LNK2019: unresolved external symbol "int __cdecl hashing(class std::basic_string,class std::allocator >)" (?hashing@@YAHV?$basic_string@DU?$char_traits@D@std@@V?$allocator@D@2@@std@@@Z) referenced in function "void __cdecl addDef(void)" (?addDef@@YAXXZ)

:(

milan2011
Light Poster
37 posts since May 2011
Reputation Points: 10
Solved Threads: 0
 

oh it worked!! awesome! thanks! now i am going to try to read words from an input file.

milan2011
Light Poster
37 posts since May 2011
Reputation Points: 10
Solved Threads: 0
 

Replace all occurrences of hash with hashing . Not just the declaration, but the calls and the definition too! :P

EDIT: Oh, ok, you figured it out.

m4ster_r0shi
Posting Whiz in Training
230 posts since Mar 2010
Reputation Points: 154
Solved Threads: 31
 

Also i want to display all the words.what would be the best way to do this? maybe using a loop?

milan2011
Light Poster
37 posts since May 2011
Reputation Points: 10
Solved Threads: 0
 

Sure, a loop would be fine. You can use a for loop to scan the dictionary array. In each iteration, if the
current pointer is NULL, do nothing. Else, print what you want (the english word, the french word, or both).

m4ster_r0shi
Posting Whiz in Training
230 posts since Mar 2010
Reputation Points: 154
Solved Threads: 31
 

is there any example that i can read how to display because i can't find any example in my book.sorry i asked alot of questions!

milan2011
Light Poster
37 posts since May 2011
Reputation Points: 10
Solved Threads: 0
 

What do you have trouble with in particular? Using for loops? This
will help then -> http://www.cplusplus.com/doc/tutorial/control/#for

Here's some pseudocode:
for ( each element in dictionary )
{
if ( current_element != 0 )
{
print "english word = "
print current_element->english

print ", "

print "french word = "
print current_element->french

print newline
}
}

m4ster_r0shi
Posting Whiz in Training
230 posts since Mar 2010
Reputation Points: 154
Solved Threads: 31
 

my problem is that i am not sure if the words will be sorted cause i feel like there will be too many collusion.

milan2011
Light Poster
37 posts since May 2011
Reputation Points: 10
Solved Threads: 0
 

also not sure which elements should be in

for ( each element in dictionary )


.

milan2011
Light Poster
37 posts since May 2011
Reputation Points: 10
Solved Threads: 0
 

Of course the words won't be sorted. If you want them to be, use std::map instead,
as mike suggested. std::map's lookup time  complexity  isn't that bad (it's O(logn)).

Alternatively, you can stick to your hashmap, but then you'd have to copy your words to another container and
sort them there  whenever  you want to print them, which would not be very wise if you want to print them often.

Another  interesting  option is to make your  hashmap  an array of 26  std::maps  and your hash function work like
this -> If the word's first letter is 'a' put it in dictionary[0], else, if the word's first letter is 'b', put it in dictionary[1], etc...

This way,

(1) dictionary[i] will be a sorted container for every i,

(2) for each i < j, any  word  in  dictionary[i]  will  be
less (lexicographically) than any word in dictionary[j].

m4ster_r0shi
Posting Whiz in Training
230 posts since Mar 2010
Reputation Points: 154
Solved Threads: 31
 

ok so could i still use my hashmap to search for words but display them using std::map?

milan2011
Light Poster
37 posts since May 2011
Reputation Points: 10
Solved Threads: 0
 

I am a little confused now,so i first sort them and then put them in hashtable? i am not sure how i can use std::maps :( never used it before! could you please provide an example ? thanks! :)

milan2011
Light Poster
37 posts since May 2011
Reputation Points: 10
Solved Threads: 0
 
I am a little confused now,so i first sort them and then put them in hashtable?


Mmm... Not really. When you put them in the hashtable the order is lost.i am not sure how i can use std::maps never used it before! could you please provide an example ?
There are plenty on the net. Try googling "std::map example".

If you want to stick to your hashmap being an array of pointers, it's probably better that you use a set to sort your data
when you want to print it. Though, note that the hash collision problem is still there. You have to address it somehow.

#include <iostream>
#include <string>
#include <set>

using namespace std;

struct is_less
{
    bool operator()(const string * pstr1, const string * pstr2)
    {
        if (pstr1 == 0) return false; // nulls should
        if (pstr2 == 0) return true;  // go at the end

        return *pstr1 < *pstr2;
    }
};

int main()
{
    string * pstr_arr[10] =
    {
        0,
        new string("world"),
        0,
        0,
        new string("hello,"),
        new string("violet"),
        0,
        new string("red,"),
        0,
        0
    };

    typedef set<const string *, is_less> SetType;

    SetType sorted(pstr_arr, pstr_arr + 10);

    SetType::const_iterator cur_it = sorted.begin();
    SetType::const_iterator end_it = sorted.end();

    for (; cur_it != end_it && *cur_it != 0; ++cur_it)
        cout << **cur_it << " ";

    return 0;
}
m4ster_r0shi
Posting Whiz in Training
230 posts since Mar 2010
Reputation Points: 154
Solved Threads: 31
 

The recent C++ standard library provides both a std::map class and a std::unordered_map class. The former implements a binary search tree which gives you a list that is sorted by key values, with O(logN) lookup time. The latter implements a hashtable which does not give you a sorted list, but O(1) lookup time (or almost O(1), because of collisions). Both are used almost the same way.

With std::map:

#include <map>
#include <iostream>
#include <string>

using namespace std;

int main() {
  string english_word, french_word;
  map<string,string> dictionary;
  cout << "Enter English word (or 'q' to quit): ";
  cin >> english_word;
  while(english_word != "q") {
    cout << "Enter French word: ";
    cin >> french_word;
    dictionary[english_word] = french_word;
    cout << "Enter English word (or 'q' to quit): ";
    cin >> english_word;
  };
  cout << "Dictionary is: " << endl;
  for(map<string,string>::const_iterator it = dictionary.begin();
      it != dictionary.end(); ++it)
    cout << it->first << " " << it->second << endl;
  cout << "Enter a query English word (or 'q' to quit): ";
  cin >> english_word;
  while(english_word != "q") {
    cout << "The French translation of '" << english_word
         << "' is '" << dictionary[english_word] << "'." << endl;
    cout << "Enter a query English word (or 'q' to quit): ";
    cin >> english_word;
  };
  return 0;
};


With std::unordered_map:

#include <unordered_map>
#include <iostream>
#include <string>

using namespace std;

int main() {
  string english_word, french_word;
  unordered_map<string,string> dictionary;
  cout << "Enter English word (or 'q' to quit): ";
  cin >> english_word;
  while(english_word != "q") {
    cout << "Enter French word: ";
    cin >> french_word;
    dictionary[english_word] = french_word;
    cout << "Enter English word (or 'q' to quit): ";
    cin >> english_word;
  };
  cout << "Dictionary is: " << endl;
  for(unordered_map<string,string>::const_iterator it = dictionary.begin();
      it != dictionary.end(); ++it)
    cout << it->first << " " << it->second << endl;
  cout << "Enter a query English word (or 'q' to quit): ";
  cin >> english_word;
  while(english_word != "q") {
    cout << "The French translation of '" << english_word
         << "' is '" << dictionary[english_word] << "'." << endl;
    cout << "Enter a query English word (or 'q' to quit): ";
    cin >> english_word;
  };
  return 0;
};


In other words, map and unordered_map are the same as far as usage is concerned.

If you need both at the same time, you can use the fact that the elements of std::map are invariant (the tree nodes remain valid as long as they are not destroyed). This means you can build an unordered_map of map elements. But, frankly, it is wasteful, you should decide whether you need a sorted list or if you need constant-time lookup, because you usually don't need both, and having both usually entails a compromise (e.g. bad hash function with lots of collisions (so, losing O(1) lookup time), or keeping two maps (ordered and not) which wastes memory).

To do both:

#include <map>
#include <unordered_map>
#include <iostream>
#include <string>

using namespace std;

int main() {
  string english_word, french_word;
  map<string,string> sorted_dictionary;
  unordered_map<string,string> lookup_dictionary;
  cout << "Enter English word (or 'q' to quit): ";
  cin >> english_word;
  while(english_word != "q") {
    cout << "Enter French word: ";
    cin >> french_word;
    sorted_dictionary[english_word] = french_word;
    lookup_dictionary[english_word] = french_word;
    cout << "Enter English word (or 'q' to quit): ";
    cin >> english_word;
  };
  cout << "Dictionary is: " << endl;
  for(map<string,string>::const_iterator it = sorted_dictionary.begin();
      it != sorted_dictionary.end(); ++it)
    cout << it->first << " " << it->second << endl;
  cout << "Enter a query English word (or 'q' to quit): ";
  cin >> english_word;
  while(english_word != "q") {
    cout << "The French translation of '" << english_word
         << "' is '" << lookup_dictionary[english_word] << "'." << endl;
    cout << "Enter a query English word (or 'q' to quit): ";
    cin >> english_word;
  };
  return 0;
};
mike_2000_17
Posting Virtuoso
Moderator
2,139 posts since Jul 2010
Reputation Points: 1,634
Solved Threads: 457
 

This article has been dead for over three months

Post: Markdown Syntax: Formatting Help
You
 
View similar articles that have also been tagged: