Can someone point me towards a dead basic, complete example of a hash table utilising chaining? I've got a working hash function, I am just getting really frustrated with not understanding what I don't understand with the setting up of the hash table. All the examples on the web I've looked at I can't get working, or they seem so complicated; pages and pages of code seeing how well it works, and on various inputs, and I just want something dead simple and transparent, that will not crash when I give it a word.

My understanding is that for the hash table with chaining, you'll need a linked list data structure (which I have working fine) and another data structure, that holds each chain and also an index. So I have something like

typedef struct l {
  char* data;
  int numfound;
  struct l *next;
} listelem;

for my linked list, which seems to work well, but I am stuck on what to do for the other data structure. Should it be

typedef struct hashtable {
  struct listelem chain;
  int index;
} hashtable; 

or is that way off base?

The problem I'm having is that I'm developing in Windows at the moment, and when it doesn't work it just crashes, with no inkling of what went wrong, whereas on linux it seems to be easier to catch errors. Or it gives me thousands of errors/warnings on compile, and I can't resolve them.

I've written about three or four insert functions now, using different ideas and bases, and none of them have worked even a little bit. My current m.o. for the insert function is planned thusly:

// 1. hash word, get address
// 2. compare to word in that address, if there is one
// 3. if there is one, check it until next is null
// 4. check by strcmp-ing the data segment and the input string
// 5. if there is no match, add it to the beginning of the chain
// 6. if there is a match, increment the numfound and break

Does that sound reasonable?

Ach. My project is due on Friday and I'm so far behind.. Please help me, good people! I am talking to my tutor tomorrow, hopefully I can get some of this resolved, but I am in a sorry state -- programming six hours a night after an exhausting day just isn't going well. No sleep = no energy = sloppy programming = staying up to fix errors = no sleep, etc.

sad face.

Recommended Answers

All 7 Replies

These are questions for Narue. You could have a look at this Tutorial she wrote : link

[edit] Looking futher below in the threadlist, I saw that you allready asked this question in this thread and got the same answer... Sorry 'bout that

Niek

Even that tutorial has too much going on, I need something far simpler -- if such a thing exists.

Here's as simple as it gets with separate chaining, with some rudimentary testing. I wrote it in a few minutes, so don't blame me if there are bugs. ;)

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

typedef struct node {
  char *data;
  struct node *next;
} node;

unsigned djb_hash ( const void *key, int len )
{
  unsigned const char *p = key;
  unsigned h = 0;
  int i;

  for ( i = 0; i < len; i++ )
    h = 33 * h ^ p[i];

  return h;
}

static int find ( const node *table[], int size, const char *key )
{
  unsigned index = djb_hash ( key, strlen ( key ) ) % size;
  const node *it = table[index];

  /* Find a matching key in the list, if any */
  while ( it != NULL && strcmp ( it->data, key ) != 0 )
    it = it->next;

  return it != NULL;
}

static int insert ( node *table[], int size, const char *key )
{
  if ( !find ( table, size, key ) ) {
    /* Find the linked list we want to search */
    unsigned index = djb_hash ( key, strlen ( key ) ) % size;

    node *new_node = malloc ( sizeof *new_node );

    if ( new_node == NULL )
      return 0;

    new_node->data = malloc ( strlen ( key ) + 1 );

    if ( new_node->data == NULL )
      return 0;

    /* Fill in the data and link to the front of the list */
    strcpy ( new_node->data, key );
    new_node->next = table[index];
    table[index] = new_node;

    return 1;
  }

  return 0;
}

#include <assert.h>

int main ( void )
{
  node *table[101] = {0};

  assert ( insert ( table, 101, "test" ) == 1 );
  assert ( find ( table, 101, "test" ) == 1 );
  assert ( insert ( table, 101, "test" ) == 0 );
  assert ( find ( table, 101, "foo" ) == 0 );

  return 0;
}

Very nice! Thanks so much, that is a lot easier to understand. You rule Narue!

Hi i have a simple problem related to hashing/collisions/chaining but i am stuck with it:

suppose:


    1. i have 33 keys

    1. that i want to map on 11 indexes of a hash table

    1. each index in hash table contains a linear array of size 3

if i do MODULO 11 and generate my hash, i can generate 11 indexes. but considering that my keys are not evenly spaced out, how will i ensure that that each index contains max 3 keys (because of collisions).

First off, in the future, when asking a question, please post a new thread of your own rather than resurrecting one from five years ago.

Second off, the real question here is, what sort of hash function should you use? The fairly simple approach you mention (taking the modulo of the key by the length of the table) has obvious flaws if the keys aren't evenly spaced. Since you need to map the keys to exactly 33 slots, with exactly three slots per index, finding the appropriate hash function is not simple. However, if you know all the keys ahead of time, it would simplify the matter tremendously, because then you could apply the technique known as perfect hashing to generate the hashing function itself.

Unfortunately, most perfect hashing approaches are predicated on their being no collisions, that is, that there are exactly n indexes for n keys; it is mainly useful when the number keys is relatively small compared to the potential key space. Since you haven't told us what the keys are, we can only guess as to the nature of the keys or the size of the key space, though given your original hashing function it sounds as if they are numeric values of some sort.

It may be possible to treat the 33 slots as 33 indices, if you treat the hash as a two-part value mapping to the index of the table and the indices of the arrays, but that would restrict the approaches available to generate the hash.

thanks for your reply. no i do not have information of the keys ahead of time.
yes all the keys are numerics. i have used 33 keys and 11 indeces only for the purpose of example.
My real problem involves 10,000 keys that need to be mapped to a hash table of size 1,000 and each index containing a linear array of size 10. so i need to ensure that not more than 10 keys get mapped to 1 index. Thanks

Be a part of the DaniWeb community

We're a friendly, industry-focused community of developers, IT pros, digital marketers, and technology enthusiasts meeting, networking, learning, and sharing knowledge.