I have an assignment I'm working on in C but I didn't want to ask for any code specifically. I just wanted to ask about the methods to use. The program reads a text file where it tells you the number of values to hash. The hash function is completely universal so the text file doesn't tell you how large the table really needs to be. My idea was to go through the whole text file before inserting any values in order to find the largest hash value, so I know how large the hash table will be and dynamically allocate the memory for that size. How is this method?

Also, in order to prevent collisions I'm going to set up separate hash tables per linked list node, but I'm not sure how many nodes I should set up which equates to the number of possible collisions.
*edit, am I just creating a new node in the case of a collision? That would make sense.

If I'm way off track, please tell me as this is the first time I'm dealing with these topics. Thank you.

In practice you never know what's a number of a future collisions. The first pass through the data file is not the best idea too. You can't estimate a number of different words without some kind of a dictionary so you have the classical chicken and eggs problem. In fact you want to build hash table based dictionary so it's an useless pass.

That's one of possible solutions: place all words with the same hash codes in the list started in the correspondent hash table cell. Your hash function must calculate hash codes for the previously selected (arbitrary) hash table size. Resolve hash code collisions as usually (there are lots of method to do it). For example, see
http://en.wikipedia.org/wiki/Hash_table

Alright. After reading through the wikipedia page, and comparing with my class notes I realized I was confusing what a hash table was. Let me know if I'm right...A hash table is an array of strings indexed by their hash value. Chaining is using an array of structs to store the strings idexed by their hash value. Whenever there is a collision, add another node to the struct of that index and add the new value. Right? *crosses fingers*

This article has been dead for over six months. Start a new discussion instead.