943,735 Members | Top Members by Rank

Ad:
  • C Discussion Thread
  • Unsolved
  • Views: 14588
  • C RSS
You are currently viewing page 1 of this multi-page discussion thread
Mar 9th, 2007
-1

Simple Hash Function for Integer Arrays

Expand Post »
My research group is trying to implement a hash table to reduce the effects of a very costly step in our calculations.

Basically, it can take more than an hour to go from A to B, and B is something ridiculously simple, like a number between 1 and 100.

A in this case is an integer array with the following attributes:
  • any entry can be either positive or negative
  • average of 20 entries per array
  • all entries are even
  • with k elements in an array, values range from (+/-)2...2k, with no repeats
Examples:
  • {4, -2, 6, 10, -8}
  • {-2, -4, -6}
  • {6, 4, -2}
I don't have an exact figure, but I think we have around 5 million distinct arrays to hash here.

What is a good hashing function that would work and still be reversible?

Thanks,
natnit
Reputation Points: 10
Solved Threads: 0
Newbie Poster
natnit is offline Offline
3 posts
since Mar 2007
Mar 9th, 2007
0

Re: Simple Hash Function for Integer Arrays

You will need 5 million different hash tables.
Which meanns that if you are lokking for a given value, say -4, across all of your arrays you will have to consult each hash to see if -4 is in the array and which element it is.

This will result in some performance gain.
I'm assuming that the position of the element and the value in the in the array both have meaning.

It sounds like the data design needs to be rethought -- more than that a hash table will fix your problems. For example, a database can do
highly optimized lookups, like count the number of element #3 that equals 20.

I dunno, don't know enough to about your problem to help, really.

Narue has a site
www.eternallyconfuzzled.com
that has a good explanation of hashing algorithms.

UNIX systems usually come with hcreate, hsearch as an implementation of using Knuth's
original hashing work.

I personally think you've got a data design problem. Possibly you should have put in associative arrays upfront as part of the design.
Hashing added after the fact may only help a little. Depending on how far down the call tree your lookups are embedded in your code.
Reputation Points: 62
Solved Threads: 10
Junior Poster
jim mcnamara is offline Offline
179 posts
since May 2004
Mar 9th, 2007
0

Re: Simple Hash Function for Integer Arrays

Are these arraya your keys to the map/list? Can there be any repetition?
If there is repetition, is it needed to hold all instances (e.g. like in a std::multimap) or only one key should be there (like in std::map)?

What's the max number of entries per array?

How abt picking up the first value bit (not the sign) from first 24 entries in the array? You'll have a 3 byte key that can represent values more than 5 mil. (of course for lesser number of entries you pickup more bits).
Reputation Points: 254
Solved Threads: 74
Practically a Posting Shark
thekashyap is offline Offline
804 posts
since Feb 2007
Mar 9th, 2007
0

Re: Simple Hash Function for Integer Arrays

You will need 5 million different hash tables.........I personally think you've got a data design problem. Possibly you should have put in associative arrays upfront as part of the design.
Hashing added after the fact may only help a little. Depending on how far down the call tree your lookups are embedded in your code.
Completely agree with Jim on this. Add to this if you're hoping to make a lookup faster in an array of 20 elements, I don't think you're looking at major performance gains.

So I just hope tht these arrays represent your "keys" and not "values".

If this is NOT the case and you still wanna introduce hashing then read on.
If your key is ranging from -2k..+2k (lets discount -2..+2 for now) where avg k is 0 and max (guessing) is say 50. AND you don't have memory problems. Then you can simply strip the sign bit, divide by 2 and use the number itself as key, probability of collision is less than 5% if I'm not wrong, but that you gotta test anyway.

E.g. here are the hash values for your examples:
{4, -2, 6, 10, -8}
hash= {2, 1, 3, 5, 4}
{-2, -4, -6}
hash= { 1, 2, 3}
{6, 4, -2}
hash= {3, 2, 1}
Reputation Points: 254
Solved Threads: 74
Practically a Posting Shark
thekashyap is offline Offline
804 posts
since Feb 2007
Mar 9th, 2007
0

Re: Simple Hash Function for Integer Arrays

Thanks for the reply. The picture you paint of the current situation is quite grim.

I was thinking more along the lines of a string conversion from the array. I'm assuming there are string to int hash functions, and a string is nothing more than an array of characters. So all I'd need is something to map -2, 8, 4 etc to chars.

I am still confused about your reasoning on why I need 5 million hash tables, though.
Reputation Points: 10
Solved Threads: 0
Newbie Poster
natnit is offline Offline
3 posts
since Mar 2007
Mar 9th, 2007
0

Re: Simple Hash Function for Integer Arrays

Click to Expand / Collapse  Quote originally posted by natnit ...
Thanks for the reply. The picture you paint of the current situation is quite grim.

I was thinking more along the lines of a string conversion from the array. I'm assuming there are string to int hash functions, and a string is nothing more than an array of characters. So all I'd need is something to map -2, 8, 4 etc to chars.

I am still confused about your reasoning on why I need 5 million hash tables, though.

Nothing easier than to map integers onto characters; a look up table would do it, but how would that help?
Reputation Points: 14
Solved Threads: 4
Junior Poster
mathematician is offline Offline
149 posts
since Nov 2006
Mar 9th, 2007
0

Re: Simple Hash Function for Integer Arrays

You did not explain your full data design. So the answer as to why 5 million hash tables =

one hash for one array.

Explain how you need to lookup values in each array. It would help. It might be possible to hash across all arrays.

BUT.

Your collision rate will be out of this world because you must have zillions of duplicates.
Either duplicate element # in the array or duplicate values in the array. This assumes you do not need to lookup based on the entire array's contents. In that second case, hashing might actually help somewhat.
Last edited by jim mcnamara; Mar 9th, 2007 at 4:15 pm.
Reputation Points: 62
Solved Threads: 10
Junior Poster
jim mcnamara is offline Offline
179 posts
since May 2004
Mar 9th, 2007
0

Re: Simple Hash Function for Integer Arrays

Ok, let me refine my question, since I think there is a communication problem here somewhere.

Assume I have lots of array values that I want to be able to look up easily.

Each array is at most 60 integers long, and for the sake of argument, I have 1 million arrays.

All I'm looking for is a hash function to produce keys. For example:

{2, 4, 6} -> 3821
{2, 6, 4} -> 91313
{4, -2} -> 1243

I'm sure that there should be something, because going from int[] to int shouldn't be that different from going from char[] to int...

I hope that made it clearer.
Reputation Points: 10
Solved Threads: 0
Newbie Poster
natnit is offline Offline
3 posts
since Mar 2007
Mar 9th, 2007
0

Re: Simple Hash Function for Integer Arrays

Click to Expand / Collapse  Quote originally posted by natnit ...
Ok, let me refine my question, since I think there is a communication problem here somewhere.

Assume I have lots of array values that I want to be able to look up easily.

Each array is at most 60 integers long, and for the sake of argument, I have 1 million arrays.

All I'm looking for is a hash function to produce keys. For example:

{2, 4, 6} -> 3821
{2, 6, 4} -> 91313
{4, -2} -> 1243

I'm sure that there should be something, because going from int[] to int shouldn't be that different from going from char[] to int...

I hope that made it clearer.


I will suggest a possible approach rather than give an answer. Think of the values contained in the array as being the digits in a number with an appropriately large base. The hash value, or at least a basis for it, could then be the value of that number.
Reputation Points: 14
Solved Threads: 4
Junior Poster
mathematician is offline Offline
149 posts
since Nov 2006
Mar 10th, 2007
0

Re: Simple Hash Function for Integer Arrays

Click to Expand / Collapse  Quote originally posted by thekashyap ...
Are these arrays your keys to the map/list? Can there be any repetition?
If there is repetition, is it needed to hold all instances (e.g. like in a std::multimap) or only one key should be there (like in std::map)?

What's the max number of entries per array?
If you answer a few questions may be you'll get more/faster help..
Reputation Points: 254
Solved Threads: 74
Practically a Posting Shark
thekashyap is offline Offline
804 posts
since Feb 2007

This thread is more than three months old

No one has posted to this discussion for at least three months. Please let old threads die and do not reply to them unless you feel you have something new and valuable to contribute that absolutely must be added to make the discussion complete. Otherwise, please start a new thread in this forum instead.
This thread is currently closed and is not accepting any new replies.
Previous Thread in C Forum Timeline: how to search pattern in file using C
Next Thread in C Forum Timeline: Pointer array Question





About Us | Contact Us | Advertise | Acceptable Use Policy
Forum Index | Build Custom RSS Feed


Follow us on Twitter


© 2011 DaniWeb® LLC