2. Collision Handling

A Collision occurs when two or more different keys produce the same hash value (index). For example, "Budi" and "Dina" are both hashed to row #5 . We can't overwrite Budi's data. We must have a strategy to handle this. 
 2.1. Separate Chaining 
 
 This is the most common strategy to avoid collisions. 
 
 Concept: Instead of each table row storing a single value, each row stores a pointer to another data structure, usually a Linked List . 
 How it Works: 
 
 "Budi" hashes to row 5. We create a linked list at row 5 and add "Budi". 
 "Dina" hashes to row 5. We add "Dina" to the linked list that already exists at row 5. 
 Row 5 now contains: [Budi] -> [Dina] -> NULL 
 
 
 Search: To find "Dina," we compute its hash (5), go to row 5, and then traverse the linked list at that row until we find "Dina." The worst-case search time becomes O(k) where k is the number of elements in the chain. 
 
 2.1.1. Manual Implementation: Insertion 
 The insert function handles adding a new key-value pair. Its logic is: 
 
 Hash the key to find the correct bucket index . 
 Get the linked list (the chain) at that index . 
 Traverse the list:
 
 If the key is already in the list, update its value and stop. 
 If the end of the list is reached and the key was not found, add the new key-value pair to the end of the list. 
 
 
 
 void insert(std::string key, int value) {
 // Hash the key to get the bucket index
 int index = hashFunction(key);

 // Get the chain (linked list) at that index
 std::list<KeyValuePair>& chain = table[index];

 // Traverse the chain
 for (auto& pair : chain) {
 if (pair.key == key) {
 // Key found: update the value and return
 pair.value = value;
 return;
 }
 }

 // Key not found: add new pair to the end of the list
 chain.push_back(KeyValuePair(key, value));
}
 
 2.1.2. Manual Implementation: Searching 
 The search function finds the value associated with a key. Its logic is: 
 
 Hash the key to find the bucket index . 
 Get the linked list at that index . 
 Traverse the list:
 
 If the key is found, return its value . 
 If the end of the list is reached and the key was not found, return a sentinel value (like -1) or throw an exception. 
 
 
 
 int search(std::string key) {
 // Hash the key to get the bucket index
 int index = hashFunction(key);

 // Get the chain at that index
 std::list<KeyValuePair>& chain = table[index];

 // Traverse the chain
 for (auto& pair : chain) {
 if (pair.key == key) {
 // Key found: return the value
 return pair.value;
 }
 }

 // Key not found
 return -1; 
}
 
 2.1.3. Manual Implementation: Deletion 
 The remove function deletes a key-value pair. Its logic is: 
 
 Hash the key to find the bucket index . 
 Get the linked list at that index . 
 Traverse the list using an iterator . We must use an iterator because we need to know the position of the element to delete it. 
 If the key is found:
 
 Call the list's erase() method, passing the iterator. 
 Stop and return. 
 
 
 
 void remove(std::string key) {
 // Hash the key to get the bucket index
 int index = hashFunction(key);

 // Get a reference to the chain
 auto& chain = table[index];
 
 // Traverse the chain using an iterator
 for (auto it = chain.begin(); it != chain.end(); ++it) {
 // 'it' is an iterator (position pointer)
 // 'it->key' accesses the key of the element at that position
 if (it->key == key) {
 // Key found: erase the element at this position
 chain.erase(it); 
 return;
 }
 }
 // If loop finishes, key was not found. Do nothing.
}
 
 2.2. Open Addressing (Probing) 
 This is an alternative strategy where all data is stored within the table itself. No linked lists are used. 
 
 Concept: If a collision occurs (the slot is full), we "probe" for the next empty slot according to a specific rule and place the item there. 
 Types of Probing: 
 
 Linear Probing: This is the simplest rule. If slot h(key) is full, try h(key) + 1 , then h(key) + 2 , h(key) + 3 , and so on, wrapping around the table if necessary. The problem of this type is it tends to create clusters of occupied slots , which degrades performance as the table fills up. 
 Quadratic Probing: If slot h(key) is full, try h(key) + 1^2 , then h(key) + 2^2 , h(key) + 3^2 , and so on. This helps spread out elements better than linear probing. 
 Double Hashing: Uses a second hash function to determine the "step size" for probing, which is the most effective at avoiding clusters.