Tech-letter #17 | April 3, 2023
Hashmap
-
A hashtable, also known as a
hash map
, is a data structure used to implementassociative arrays or mappings of key-value pairs
.What are associative arrays?
Associative arrays, also known as maps, dictionaries, or hash tables, are data structures used in programming to store collections of key-value pairs. In an associative array, each element is a key-value pair, where the key is used to access the associated value.
-
In a hashtable, data is stored in an array-like structure, where each element of the array is called a bucket or slot. Each bucket can hold one or more key-value pairs. The
keys are hashed using a hash function
to determine the index of the bucket where the corresponding value will be stored. -
The hash function
maps the key to an integer
, which isused as an index
to access the appropriate bucket. -
The hash function is designed to evenly distribute keys across the buckets, minimizing collisions (when two keys map to the same index). In the case of collisions, the values can be chained together in a linked list or other data structure.
-
The hashtables are often used to provide constant time (O(1)) access to values based on their keys. This makes them ideal for tasks like caching, indexing, and searching.
Now, let’s implement a hashmap
class Hashmap() {
constructor() {
this.hashmap = {};
}
set(key, value) {
this.hashmap[key] = value;
}
get(key) {
return this.hashmap[key];
}
contains(key) {
return this.hashmap.hasOwnProperty(key);
}
clear() {
this.hashmap = {};
}
}
The above code is a simple implementation of a hashmap in javascript. It is simple because the keys are not hashed using a hash function. You can also create a hashmap using inbuilt Map()
function. The following code shows how to implement a hashmap using Map()
const myMap = new Map();
myMap.set('key1', 'value1');
const value1 = myMap.get('key1');
const hasKey1 = myMap.has('key1'); // true
const hasKey2 = myMap.has('key3'); // false
console.log(myMap);
What are collisions? How to avoide it?
-
collisions occur when two or more keys are hashed to the same index in the hash table.
-
To avoid collisions, hash maps use a technique called “chaining”. Chaining involves storing multiple key-value pairs in the same index of the hash table. Each index of the hash table contains a linked list of key-value pairs that hash to that index. When a collision occurs, the new key-value pair is added to the end of the linked list.
Now, we will implement a hashmap which is basically collision free. A collision-free hash map is also known as a perfect hash map.
class PerfectHashMap {
constructor(keys, values) {
this.size = keys.length;
this.array = new Array(this.size).fill(null);
const hashValues = keys.map(key => this.hash(key));
const collisions = this.findCollisions(hashvalues);
if (collisions === 0) {
hashValues.forEach((hashValue, index) => {
this.array[hashValue] = values[index];
});
} else {
collisions.forEach((collision, index) => {
const subkeys = keys.filter((key, i) => collision.includes(hashValues[i]));
const subValues = values.filter((value, i) => collision.includes(hashValues[i]));
this.array[collision[0]] = new PerfectHashMap(subKeys, subValues);
});
}
}
}
Explanation
:
-
Compute the size of the array based on the number of keys.
-
Create the array and initialize all values to null.
-
Compute the hash values of each key.
-
Check for collisions.
-
If there are no collisions, store the values in the array
Hope this much is clear. There is no complexity till now. The complexity occurs when you get a collision. So what to do when you have a collision. The code in the else block does the work when you encounter with collisions.
collisions.forEach((collision, index) => {
const subkeys = keys.filter((key, i) => collision.includes(hashValues[i]));
const subValues = values.filter((value, i) => collision.includes(hashValues[i]));
this.array[collision[0]] = new PerfectHashMap(subKeys, subValues);
});
-
Here, we are iterating over the collisions array using the forEach method. The collisions array contains arrays of indices where the keys in the hash map have collided.For each collision, we first use the filter method to extract the
subkeys
andsubvalues
from the original keys and values arrays, respectively. We extract only those keys and values that correspond to the indices in the current collision. -
Next, we create a new instance of the PerfectHashMap class using the subkeys and subvalues, and set it as the value for the first index in the current collision. We use the collision[0] index because it is the one that the current keys have collided on.
-
we are creating a new instance of a perfect hash map for each collision that has occurred. This is because the keys in each collision cannot be stored in a single array without collisions, so we need to create a separate perfect hash map for each sub-set of keys that have collided.
Let’s continue implementing a perfect hashmap.
class PerfectHashMap {
constructor(keys, values) {
// all the code mentioned above
}
hash (key) {
let hash = 0;
for (let i = 0; i < key.length; i++) {
hash += key.charCodeAt(i);
}
return hash % this.size;
}
}
-
The above method
accepts string
as key and returns a hash code as an integer. -
The hash code is calculated by iterating over each character in the string, summing up the Unicode values of those characters and taking the remainder of the sum with the size of the hash table.
Steps:
1. The hash code hash is initialized to 0.
2. The loop iterates over each character in the key string, one at a time.
3. For each character, the Unicode value of that character is added to the hash code.
4. After all characters have been processed, the hash code is returned modulo the size of the hash table. This ensures that the hash code falls within the range of valid indices for the hash table.
For example, let’s say we have a hash table with a size of 10 and we want to hash the key “apple”. Here’s how the hash function works with each iteration of the loop:
// size of hash table is 10
hash("apple")
// 1
i = 0
hash = 0 + 97 = 97
// 2
i = 1
hash = 97 + 112 = 209
// 3
i = 2
hash = 209 + 112 = 321
// 4
i = 3
hash = 321 + 108 = 429
// 5
i = 4
hash = 429 + 101 = 530
// Return the hash code modulo the size of the hash table
hash % 10 = 0
// Output: 1
Understanding the findCollisions method
class PerfectHashMap {
constructor(keys, values) {
// contructor
}
hash(key) {
// hash method
}
findCollisions(hashValues) {
const collisions = [];
hashValues.forEach((hashValue, index) => {
const otherIndices = hashValues.filter((value, i) => i !== index);
const collision = otherIndices.findIndex(value => value === hashValue);
if (collision >= 0) {
collisions.push([index, collision + 1]);
}
});
return collisions;
}
}
-
The collisions array is initialized to an empty array.
-
The hashValues array is iterated over using the forEach method.
-
For each hash value, a new array called otherIndices is created by filtering out the current hash value using the filter method. This array contains all the other hash values in the input array.
-
The findIndex method is used to find the first index in the otherIndices array that matches the current hash value. If no match is found, collision is set to -1.
-
If a collision is found (i.e. collision is not -1), the pair of indices [index, collision + 1] is added to the collisions array. Note that we add 1 to collision because we want the index in the otherIndices array to be relative to 1, not 0.
-
After all hash values have been processed, the collisions array is returned.
Now let’s write other methods that is set
and get
methods to complete the perfect hash map:
class PerfectHashMap {
constructor(keys, values) {
// constructor
}
hash(key) {
// hash
}
findCollisions(hashValues) {
// findCollisions
}
get(key) {
const hashValue = this.hash(key);
const value = this.array[hashValue];
if (value instanceof PerfectHashMap) {
return value.get(key);
}
return value;
}
set(key, value) {
const hashValue = this.hash(key);
const currentValue = this.array[hashValue];
if (currentValue instanceof PerfectHashMap) {
currentValue.set(key, value);
} else {
this.array[hashValue] = value;
}
}
}
Building an application using Hashmap
Let’s build a simple function which counts the vowels and return the highest vowel used based on it’s count.
function countVowels(sentence) {
const vowels = 'aeiou';
const vMap = new Map();
let highestCount = 0;
let highestVowel = '';
for (let char of sentence) {
char = char.toLowerCase()
if (vowels.includes(char)) {
const count = vMap.get(char) || 0;
const newCount = count + 1;
vMap.set(char, newCount);
if (newCount > highestCount) {
highestCount = newCount;
highestVowel = char;
}
}
}
return { highestVowel, highestCount};
}
const sentence = 'This is a sentence used to test the above function';
const result = countVowels(sentence);
console.log(result); // { highestVowel: 'e', highestCount: 7 }