# 2. Collision Handling

A **Collision** occurs when two or more different keys produce **the same hash value** (index). For example, "Budi" and "Dina" are both hashed to **row #5**. We can't overwrite Budi's data. We must have a strategy to handle this.

## 2.1. Separate Chaining

![](https://learn.digilabdte.com/uploads/images/gallery/2025-11/image-1762272337304.png)

This is the most common strategy to avoid collisions.

* **Concept:** Instead of each table row storing a single value, each row stores a pointer to another data structure, usually a **Linked List**.
* **How it Works:**
    1.  "Budi" hashes to row 5. We create a linked list at row 5 and add "Budi".
    2.  "Dina" hashes to row 5. We add "Dina" to the linked list that already exists at row 5.
    3.  Row 5 now contains: `[Budi] -> [Dina] -> NULL`
* **Search:** To find "Dina," we compute its hash (5), go to row 5, and then traverse the linked list at that row until we find "Dina." The worst-case search time becomes `O(k)` where `k` is the number of elements in the chain.

### 2.1.1. Manual Implementation: Insertion

The `insert` function handles adding a new key-value pair. Its logic is:
1.  Hash the `key` to find the correct bucket `index`.
2.  Get the linked list (the chain) at that `index`.
3.  Traverse the list:
    * If the `key` is already in the list, **update** its `value` and stop.
    * If the end of the list is reached and the `key` was not found, **add** the new key-value pair to the end of the list.

```cpp
void insert(std::string key, int value) {
    // Hash the key to get the bucket index
    int index = hashFunction(key);

    // Get the chain (linked list) at that index
    std::list<KeyValuePair>& chain = table[index];

    // Traverse the chain
    for (auto& pair : chain) {
        if (pair.key == key) {
            // Key found: update the value and return
            pair.value = value;
            return;
        }
    }

    // Key not found: add new pair to the end of the list
    chain.push_back(KeyValuePair(key, value));
}
```

### 2.1.2. Manual Implementation: Searching

The `search` function finds the value associated with a key. Its logic is:
1. Hash the `key` to find the bucket `index`.
2. Get the linked list at that `index`.
3. Traverse the list:
    * If the `key` is found, return its `value`.
    * If the end of the list is reached and the `key` was not found, return a sentinel value (like -1) or throw an exception.

```cpp
int search(std::string key) {
    // Hash the key to get the bucket index
    int index = hashFunction(key);

    // Get the chain at that index
    std::list<KeyValuePair>& chain = table[index];

    // Traverse the chain
    for (auto& pair : chain) {
        if (pair.key == key) {
            // Key found: return the value
            return pair.value;
        }
    }

    // Key not found
    return -1; 
}
```

### 2.1.3. Manual Implementation: Deletion

The remove function deletes a key-value pair. Its logic is:
1. Hash the `key` to find the bucket `index`.
2. Get the linked list at that `index`.
3. Traverse the list using an **iterator**. We must use an iterator because we need to know the position of the element to delete it.
4. If the `key` is found:
    * Call the list's `erase()` method, passing the iterator.
    * Stop and return.

```cpp
void remove(std::string key) {
    // Hash the key to get the bucket index
    int index = hashFunction(key);

    // Get a reference to the chain
    auto& chain = table[index];
    
    // Traverse the chain using an iterator
    for (auto it = chain.begin(); it != chain.end(); ++it) {
        // 'it' is an iterator (position pointer)
        // 'it->key' accesses the key of the element at that position
        if (it->key == key) {
            // Key found: erase the element at this position
            chain.erase(it); 
            return;
        }
    }
    // If loop finishes, key was not found. Do nothing.
}
```

## 2.2. Open Addressing (Probing)

This is an alternative strategy where all data is stored within the table itself. No linked lists are used.
- **Concept:** If a collision occurs (the slot is full), we "probe" for the next empty slot according to a specific rule and place the item there.
- **Types of Probing:**
    1. **Linear Probing:** This is the simplest rule. If slot `h(key)` is full, try `h(key) + 1`, then `h(key) + 2`, `h(key) + 3`, and so on, wrapping around the table if necessary. The **problem** of this type is it tends to **create clusters of occupied slots**, which degrades performance as the table fills up.
    2. **Quadratic Probing:** If slot `h(key)` is full, try `h(key) + 1^2`, then `h(key) + 2^2`, `h(key) + 3^2`, and so on. This helps spread out elements better than linear probing.
    3. **Double Hashing:** Uses a second hash function to determine the "step size" for probing, which is the most effective at avoiding clusters.