I am trying to optimize my hash function, for a separate chaining hash table that I have created from scratch. My hash function is as follows:

int hash( const string &key, int tableSize)
        {
            int hashVal = 0;
			for(int i = 0; i<key.length();  i++)
			hashVal = 37*hashVal+key[i];
			hashVal %= tableSize;
			if(hashVal<0)
			hashVal += tableSize;
			return hashVal;
        }

I was wondering if one could point me to either a better hash function or how I should go about modifying this function.

Recommended Answers

All 8 Replies

SAX was designed for hashing strings and is supposed to be really good for it:

unsigned hash(string const& key, unsigned tableSize)
{
    unsigned hashVal = 0;

    for (int x = 0; x < key.length(); ++x)
    {
        hashVal ^= (hashVal << 5) +
                   (hashVal >> 2) +
                   key[x];
    }

    return hashVal % tableSize;
}

Your algorithm is a common variation of a good hash that uses 33 instead of 37. It should work well in the general case.

Thanks, but when I implemented your hash function it took nearly twice as long. Any other functions I should give a whirl?

Thanks, but when I implemented your hash function it took nearly twice as long.

OK, by optimize you mean speed and not collisions. Your algorithm is about as fast as it gets without having excessive collisions or doing micro optimizations. Are you trying to increase the speed because you have proven it to be too slow with a profiler? If not, do not bother because programmers are very bad at guessing where performance bottlenecks are.

Thanks, but when I implemented your hash function it took nearly twice as long. Any other functions I should give a whirl?

Is your call to .length() getting called every time or are you assuming that will get optimized away?

Yes, sorry I meant speed by optimize, I guess I did not clarify. I'm kinda new to hash functions.

I was trying to play around with the FNV function but my program keeps seg faulting

static const unsigned InitialFNV = 2166136261;
static const unsigned FNVMultiple = 16777619;

// /* Fowler / Noll / Vo (FNV) Hash */
unsigned HashTable::hashFunc(const string &s)const
{
    unsigned hash = InitialFNV;
    for(unsigned i = 0; i < s.length(); i++)
    {
        hash = hash ^ (s[i]);       /* xor  the low 8 bits */
        hash = hash * FNVMultiple;  /* multiply by the magic number */
    }
    return hash;
}

This was actually the original function I found online, but I was trying to modify it in order to get it to work with my code. However I am unfamiliar with size_t and the U at the end of the InitialFNV initialization.
So can someone explain the U and size_t please.

static const size_t InitialFNV = 2166136261U;
static const size_t FNVMultiple = 16777619;

/* Fowler / Noll / Vo (FNV) Hash */
size_t myhash(const string &s)
{
    size_t hash = InitialFNV;
    for(size_t i = 0; i < s.length(); i++)
    {
        hash = hash ^ (s[i]);       /* xor  the low 8 bits */
        hash = hash * FNVMultiple;  /* multiply by the magic number */
    }
    return hash;
}

The U means that it's an unsigned integer literal. size_t is an unsigned integer type that's guaranteed to be able to hold the size of an array. But not a std::string. If you're paranoid you should use std::string::size_type. Don't be.

Your program is not segfaulting because of the hash function. The error is elsewhere.

commented: Telling it how it is. Props to you young man. +11
Be a part of the DaniWeb community

We're a friendly, industry-focused community of developers, IT pros, digital marketers, and technology enthusiasts meeting, networking, learning, and sharing knowledge.