How I view hash tables
I wrote this for Hannah hopefully this helps!
[0]
[1]
[2]
[3]
[4]->[Abkhazia| Sukhumi] -> [Albania| Tirana]
[5]
[6]
[7]
[8]
Abkhazia
Afghanistan
Albania
Abkhazia
A + b + k + h + a + z + i + a
1 + 2 + 11 + 8 + 1 + 26 + 9 + 1
= 59
59(mod 9) = 5
These represent my “buckets” or linked lists.
The reason I use mod 9 is that it limits the amount of buckets that I have so I will only have 9 possible areas a key will go to.
[0]
[1]
[2]
[3]
[4]
[5]
[6]
[7]
[8]
If I were to have another value for instance if I had another value Albania(lets pretend that it equals 59) then it would be appended to the 4 bucket.
Then you have to change the next node of abkhazia to point to albania
[4]->[Abkhazia| Sukhumi] -> [Albania| Tirana]
Let me go over what c does.
Note:
C is basically an iterator. It lets you walk through the linked list
Okay so you have a String key correct?
What you should do is hash the key.
Let’s use the same example I used of Abkhazia.
Okay so I hash Albania and I get 5.
The reason why you’re going to use the 1b. part String find method is because
Abkhazia also has also the same hash key of 5. You have to search through the entire 5 bucket.
Now that we now it’s hash key 5. Now we’re going to search the
[4]->[Abkhazia| Sukhumi] -> [Albania| Tirana]
Okay let’s start at the head. Is Abkhazia == Albania? Nope okay let’s go to the next value.
Is Albania == Albania? True!
So my hash table does contain the value
so the pseudo code
i = hash(key)
if(HashElement[i].find(key))
return True // It was found.
else
return False // It was not found #
Note:
The point of a hash table is reduce the search space. For example something like Papua New Guinea may be placed into the 9 bucket so you don’t even have to worry about.
A good HashTable though contains a large amount of buckets.
The goal of a hash table is to have a lot of key value pairs and not lot of collisions.
A collision is when you have a bunch of elements in a bucket
For instance what if that
[0] -> Boulder City -> Detroit.
[1]
[2]
[3]
[4] -> Los Angeles -> San Diego -> Miami -> New York -> D.C.
This is a bad hash table.
Why? It has a lot of collisions. 4 has a lot of values. That’s inefficient.
The optimum linked list would be something like this.
[0] Boulder City
[1] Detroit
[2] Los Angeles
[3] San Diego
[4] Miami
I excluded the other cities for brevity but this should give you the gist of how it should be.
Why is this optimum?
A good HashTable though contains a large amount of buckets.
The goal of a hash table is to have a lot of key value pairs and not lot of collisions.
A collision is when you have a bunch of elements in a bucket
For instance what if that
[0] -> Boulder City -> Detroit.
[1]
[2]
[3]
[4] -> Los Angeles -> San Diego -> Miami -> New York -> D.C.
Speed Speed Speed!
Because accessing elements in an array is quick!
When you hash Boulder City. You just need to do a quick [0] array access.
The problem happens when you have a situation like this
[4] -> Los Angeles -> San Diego -> Miami -> New York -> D.C.
This reduces the speed of a hash table drastically because going through a linked list is very slow.
If this occurs the way you improve the hash table is to create a new hashing function and to increase the amount of buckets of “hash elements” that you have