# Algorithm Programming - CE

This book contains the theoretical foundations for the Algorithm Programming practicum of the Computer Engineering department at UI.

# Module 2 - Linked List

Theoretical basics of Linked Lists, their types, benefits, and example code.

# Part 1 - Understanding Linked List

#### Definition of Linked List

A Linked List is a linear data structure consisting of elements called nodes. Each node has two main parts:

- **Data** – stores the value of the element.
- **Pointer/Reference** – points to the next node (or the previous node in a Doubly Linked List).

#### Differences Between Linked List and Array

<table id="bkmrk-aspect-array-linked-" style="width: 100%;"><thead><tr><th style="width: 17.1565%;">Aspect</th><th style="width: 37.9988%;">Array</th><th style="width: 44.9242%;">Linked List</th></tr></thead><tbody><tr><td style="width: 17.1565%;">Storage</td><td style="width: 37.9988%;">Elements are stored contiguously in memory.</td><td style="width: 44.9242%;">Nodes are stored in non-contiguous memory and linked by pointers.</td></tr><tr><td style="width: 17.1565%;">Element Access</td><td style="width: 37.9988%;">Direct access using an index, complexity O(1).</td><td style="width: 44.9242%;">Access requires traversal from the start, complexity O(n).</td></tr><tr><td style="width: 17.1565%;">Size</td><td style="width: 37.9988%;">Static, size determined at declaration.</td><td style="width: 44.9242%;">Dynamic, can grow or shrink as needed.</td></tr><tr><td style="width: 17.1565%;">Insert/Delete</td><td style="width: 37.9988%;">Expensive due to element shifting, complexity O(n).</td><td style="width: 44.9242%;">More efficient, only pointer adjustments needed, complexity O(1) if position is known.</td></tr></tbody></table>

#### Benefits of Linked List

**Dynamic Size**

- No need to define size in advance like an array.
- Can grow or shrink according to program needs.

**Efficiency in Insert/Delete Operations**

- Adding or removing elements does not require shifting data.
- Complexity can be O(1) if the pointer is known.

**More Flexible Memory Usage**

- Nodes can be stored in scattered memory locations.
- Useful in systems with fragmented memory.

**Support for Complex Data Structures**

- Forms the basis for other data structures such as Stack, Queue, Deque, and Graph.
- Enables creation of variants like Circular Linked List and Doubly Linked List.

**Two-Way Traversal (in Doubly Linked List)**

- Makes navigation forward and backward easier.

#### Applications of Linked Lists

**Dynamic Memory Allocation**

- Used to keep track of free and allocated memory blocks.

**Undo/Redo Operations in Text Editors**

- Stores changes and navigates through editing history.

**Graph Representation**

- Efficiently implements adjacency lists in graph structures.

**Implementation of Other Data Structures**

- Serves as a base for stacks, queues, and deques.

#### Advantages of Linked List

- Efficient insertion and deletion since shifting of elements is not required as in arrays.
- Dynamic size allows growth or shrinkage at runtime.
- Memory utilization is more efficient; no pre-allocation reduces wasted space.
- Suitable for applications with large or frequently changing datasets.
- Uses non-contiguous memory blocks, useful in memory-limited systems.

#### Disadvantages of Linked List

- Direct access to an element is not possible; traversal from the start is required.
- Each node requires extra memory to store a pointer.
- More complex to implement and manage compared to arrays.
- Pointer mismanagement can cause bugs, memory leaks, or segmentation faults.

#### Short Code Example

```cpp
#include <iostream>
using namespace std;

struct Node {
    int data;
    Node* next;
};

void pushFront(Node*& head, int value) {
    Node* newNode = new Node(); // Create a new node
    newNode->data = value;      // Fill in the data
    newNode->next = head;       // Link it to the old head
    head = newNode;             // Make the new node the head
}

void printList(Node* head) {
    Node* current = head;
    while (current != nullptr) {
        cout << current->data << " -> ";
        current = current->next;
    }
    cout << "NULL\n";
}

int main() {
    Node* head = nullptr;  // Empty list

    pushFront(head, 10);
    pushFront(head, 20);
    pushFront(head, 30);

    printList(head); // Output: 30 -> 20 -> 10 -> NULL

    return 0;
}

```

**Code Explanation:**

- `Node` is the basic structure of a Linked List.
- `pushFront` adds a new node at the beginning of the list.
- `printList` traverses the list and displays all elements.

# Part 2 - Types of Linked List

#### Types of Linked Lists

Based on their structure, linked lists can be classified into several types:

- **Singly Linked List**
- **Doubly Linked List**
- **Circular Linked List**

#### 1. Singly Linked List

The singly linked list is the simplest form of linked list, where each node contains two members: **data** and a **next pointer** that stores the address of the next node. Each node in a singly linked list is connected through the next pointer, and the next pointer of the last node points to `NULL`, denoting the end of the list.

#### Singly Linked List Representation

A singly linked list can be represented as a pointer to the first node, where each node contains:

- **Data:** Actual information stored in the node.
- **Next:** Pointer to the next node.

```cpp
// Structure to represent the singly linked list
struct Node {
    int data;       // Data field
    struct Node* next; // Pointer to the next node
};

```

#### 2. Doubly Linked List

The doubly linked list is a modified version of the singly linked list. Each node contains three members: **data**, **next**, and **prev**. The `prev` pointer stores the address of the previous node. Nodes are connected via both `next` and `prev` pointers. The `prev` pointer of the first node and the `next` pointer of the last node point to `NULL`.

#### Doubly Linked List Representation

Each node contains:

- **Data:** Actual information stored.
- **Next:** Pointer to the next node.
- **Prev:** Pointer to the previous node.

```cpp
// Structure to represent the doubly linked list
struct Node {
    int data;       // Data field
    struct Node* next; // Pointer to the next node
    struct Node* prev; // Pointer to the previous node
};

```

#### 3. Circular Linked List

A circular linked list is similar to a singly linked list, except the next pointer of the last node points to the **first node** instead of `NULL`. This circular nature is useful in applications like media players.

#### Circular Linked List Representation

Each node contains:

- **Data:** Actual information stored.
- **Next:** Pointer to the next node; the last node points to the first node.

```cpp
// Structure to represent the circular linked list
struct Node {
    int data;       // Data field
    struct Node* next; // Pointer to the next node
};

```

# Part 3 - Searching

#### 1. Searching in a Custom Singly Linked List

```cpp
#include <iostream>
using namespace std;

// Linked list node
class Node 
{ 
public:
    int key; 
    Node* next; 
}; 

// Add a new node at the front
void push(Node** head_ref, int new_key) 
{ 
    Node* new_node = new Node();
    new_node->key = new_key; 
    new_node->next = (*head_ref); 
    (*head_ref) = new_node; 
} 

// Search for a value in the linked list
bool search(Node* head, int x) 
{ 
    Node* current = head; 
    while (current != NULL) 
    { 
        if (current->key == x) 
            return true; 
        current = current->next; 
    } 
    return false; 
} 

int main() 
{ 
    Node* head = NULL; 
    int x = 21; 

    // Construct list: 14->21->11->30->10
    push(&head, 10); 
    push(&head, 30); 
    push(&head, 11); 
    push(&head, 21); 
    push(&head, 14); 

    search(head, x) ? cout << "Yes" : cout << "No"; 
    return 0; 
}

```

**Explanation:**

- `Node` class represents each node in the linked list.
- `push` adds a new node at the beginning.
- `search` traverses the list iteratively to find the key.

#### 2. Searching in STL `std::list`

The C++ Standard Template Library (STL) provides the `list` container, which can be used as a linked list. Searching can be done with the `std::find` algorithm.

```cpp
#include <iostream>
#include <list>
#include <algorithm>
using namespace std;

int main() {
    list<int> l = {14, 21, 11, 30, 10};
    int x = 21;

    auto it = find(l.begin(), l.end(), x);

    if (it != l.end())
        cout << "Yes";
    else
        cout << "No";

    return 0;
}

```

**Explanation:**

- `std::list` is a doubly linked list implementation.
- `std::find` iterates through the list to locate the element.
- Returns an iterator to the element if found, or `l.end()` if not found.

# Part 4 - Manual vs. STL List

A **doubly linked list** is a data structure where each node contains a pointer to the **next** and **previous** nodes, allowing traversal in both directions. In C++, you can implement it manually or use the built-in STL `std::list`.

**Differences between Manual Doubly Linked List and STL `std::list`:**

<table id="bkmrk-feature-manual-doubl"><thead><tr><th>Feature</th><th>Manual Doubly Linked List</th><th>STL `std::list`</th></tr></thead><tbody><tr><td>Implementation</td><td>You define the `Node` structure and pointers manually.</td><td>Already implemented as a doubly linked list in the STL.</td></tr><tr><td>Memory Management</td><td>Manual allocation and deallocation using `new` and `delete`.</td><td>Automatic memory management.</td></tr><tr><td>Operations</td><td>Insert, delete, traversal, and search must be implemented manually.</td><td>Provides built-in functions like `push_back`, `push_front`, `erase`, and `find`.</td></tr><tr><td>Complexity</td><td>More control but more prone to errors like memory leaks.</td><td>Safer and easier to use, less prone to bugs.</td></tr></tbody></table>

#### 1. Manual Doubly Linked List Example

```cpp
#include <iostream>
using namespace std;

struct Node {
    int data;
    Node* next;
    Node* prev;
};

// Add a node at the end
void pushBack(Node*& head, int value) {
    Node* newNode = new Node{value, nullptr, nullptr};
    if (!head) {
        head = newNode;
        return;
    }
    Node* temp = head;
    while (temp->next) temp = temp->next;
    temp->next = newNode;
    newNode->prev = temp;
}

// Print the list
void printList(Node* head) {
    Node* temp = head;
    while (temp) {
        cout << temp->data << " ";
        temp = temp->next;
    }
    cout << endl;
}

int main() {
    Node* head = nullptr;
    pushBack(head, 10);
    pushBack(head, 20);
    pushBack(head, 30);

    printList(head); // Output: 10 20 30
}

```

#### 2. STL `std::list` Example

```cpp
#include <iostream>
#include <list>
using namespace std;

int main() {
    list<int> l;

    l.push_back(10);
    l.push_back(20);
    l.push_back(30);

    for (int x : l)
        cout << x << " "; // Output: 10 20 30
    cout << endl;
}

```

**Explanation:**

- Manual doubly linked list requires defining nodes and managing pointers yourself.
- `std::list` simplifies everything: memory management and operations like insertion, deletion, and traversal are built-in.

# Module 4 - Merge Sort and Quick Sort

# 1. Understanding Merge Sort

### What is Merge Sort?

**Merge Sort** is an efficient, comparison-based sorting algorithm that uses a **"divide and conquer"** strategy. In simple terms, it repeatedly breaks down a list into several sub-lists until each sub-list contains only one item (which is considered sorted). Then, it merges those sub-lists back together in a sorted manner.

---

### How Does it Work?

The process can be broken down into two main phases:

1.  **Divide Phase:** The main, unsorted list is recursively divided in half. This continues until all the resulting sub-lists have only one element.
2.  **Conquer (Merge) Phase:** Starting with the single-element lists, the algorithm repeatedly merges pairs of adjacent sub-lists. During each merge, it compares the elements from the two lists and combines them into a new, larger, sorted list. This continues until all the sub-lists have been merged back into a single, completely sorted list.

---

### Visualization Explanation

[![Merge-sort-example-300px.gif](https://learn.digilabdte.com/uploads/images/gallery/2025-09/merge-sort-example-300px.gif)](https://learn.digilabdte.com/uploads/images/gallery/2025-09/merge-sort-example-300px.gif)

1.  **The Start:** The animation begins with an unsorted list of numbers: `[6, 5, 3, 1, 8, 7, 2, 4]`.

2.  **The "Divide" Phase:**
    * **First Split:** The entire list is split into two halves: `[6, 5, 3, 1]` and `[8, 7, 2, 4]`.
    * **Second Split:** Each of those halves is split again: `[6, 5]`, `[3, 1]` and `[8, 7]`, `[2, 4]`.
    * **Final Split:** The splitting continues until we have eight separate lists, each containing just one number: `[6]`, `[5]`, `[3]`, `[1]`, `[8]`, `[7]`, `[2]`, `[4]`. At this point, the "divide" phase is complete. A list with one item is inherently sorted.

3.  **The "Conquer (Merge)" Phase:** Now, the algorithm starts merging these small lists back together in sorted order.
    * **First Merge:** It merges adjacent pairs. `[5]` and `[6]` are compared and merged to form `[5, 6]`. `[1]` and `[3]` are merged to form `[1, 3]`. Similarly, `[7, 8]` and `[2, 4]` are created. We now have four sorted sub-lists: `[5, 6]`, `[1, 3]`, `[7, 8]`, and `[2, 4]`.
    * **Second Merge:** The process repeats with these new, larger sorted lists. `[5, 6]` and `[1, 3]` are merged: the algorithm compares the first element of each list (`1` vs `5`) and takes the `1`, then compares `3` and `5` and takes the `3`; with the left list exhausted, `5` and `6` are appended, giving `[1, 3, 5, 6]`. Merging `[7, 8]` and `[2, 4]` likewise gives `[2, 4, 7, 8]`.
    * **Final Merge:** The two remaining lists, `[1, 3, 5, 6]` and `[2, 4, 7, 8]`, are merged the same way into the fully sorted list `[1, 2, 3, 4, 5, 6, 7, 8]`.

---
### Benefits of Merge Sort 

* **Predictable Time Complexity**: Its greatest strength is its consistent **O(n log n)** time complexity, regardless of the initial order of elements (worst, average, and best cases are the same). This makes it very reliable for large datasets.
* **Stable Sort**: Merge Sort is **stable**, meaning that if two elements have equal values, their relative order in the original array will be preserved in the sorted array.
* **Excellent for External Sorting**: It's highly effective for sorting data that is too large to fit into memory (e.g., files on a hard drive) because it reads and processes data in sequential chunks.
* **Parallelizable**: The "divide" step is easy to parallelize, meaning different parts of the array can be sorted simultaneously on different processor cores, speeding up the process significantly.

---
### Drawbacks of Merge Sort 

* **Space Complexity**: Its main disadvantage is that it requires extra space. It needs an auxiliary array of the same size as the input array, giving it a space complexity of **O(n)**. This can be a limiting factor in memory-constrained environments like embedded systems.
* **Slower for Small Datasets**: For small lists, the overhead of recursion makes it slower than simpler algorithms like Insertion Sort. This is why many optimized sorting libraries use a hybrid approach.
* **Recursive Overhead**: The recursive function calls add some overhead to the execution stack, which, while usually minor, can be a consideration.

---
### What is Merge Sort Used For? 
Merge Sort's stability and efficiency with large datasets make it very useful in practice.

* **Sorting Linked Lists**: It is one of the most efficient ways to sort linked lists. Unlike arrays, linked lists are not easily accessible by index, but merging them is a very efficient operation.
* **External Sorting**: As mentioned, it's a go-to algorithm for sorting massive files that don't fit in RAM.
* **Inversion Count Problem**: The algorithm can be modified to solve other problems, such as counting the number of inversions in an array efficiently.
* **Standard Library Implementations**: It forms the basis of many highly optimized, built-in sorting functions in various programming languages, often as part of a hybrid algorithm like **Timsort** (used in Python and Java), which combines Merge Sort with Insertion Sort.

# 2. Merge Sort in C++

This section explains how the "divide and conquer" strategy of Merge Sort is implemented in C++. The logic is split into two primary functions: `mergeSort()` which handles the recursive division, and `merge()` which handles the conquering and sorting.

### The C++ Code Implementation

Here is the complete code for a Merge Sort implementation using `std::vector`.

```cpp
#include <iostream>
#include <vector>

// Utility function to print a vector
void printVector(const std::vector<int>& arr) {
    for (int num : arr) {
        std::cout << num << " ";
    }
    std::cout << std::endl;
}

// Merges two sorted subarrays into a single sorted subarray.
// First subarray is arr[left..mid]
// Second subarray is arr[mid+1..right]
void merge(std::vector<int>& arr, int left, int mid, int right) {
    // Calculate sizes of the two temporary subarrays
    int n1 = mid - left + 1;
    int n2 = right - mid;

    // Create temporary vectors to hold the data
    std::vector<int> L(n1), R(n2);

    // Copy data from the main array to the temporary vectors
    for (int i = 0; i < n1; i++)
        L[i] = arr[left + i];
    for (int j = 0; j < n2; j++)
        R[j] = arr[mid + 1 + j];

    // Merge the temporary vectors back into the original array arr[left..right]
    int i = 0; // Initial index for left subarray
    int j = 0; // Initial index for right subarray
    int k = left; // Initial index for the merged subarray

    while (i < n1 && j < n2) {
        if (L[i] <= R[j]) {
            arr[k] = L[i];
            i++;
        } else {
            arr[k] = R[j];
            j++;
        }
        k++;
    }

    // Copy any remaining elements from the left subarray
    while (i < n1) {
        arr[k] = L[i];
        i++;
        k++;
    }

    // Copy any remaining elements from the right subarray
    while (j < n2) {
        arr[k] = R[j];
        j++;
        k++;
    }
}

// The main recursive function that implements the "divide" phase
void mergeSort(std::vector<int>& arr, int left, int right) {
    // Base case: if the array has 1 or 0 elements, it's already sorted
    if (left >= right) {
        return;
    }

    // Find the middle point to divide the array
    int mid = left + (right - left) / 2;

    // Recursively call mergeSort for the two halves
    mergeSort(arr, left, mid);
    mergeSort(arr, mid + 1, right);

    // Merge the two now-sorted halves
    merge(arr, left, mid, right);
}

// Main driver function
int main() {
    std::vector<int> data = {6, 5, 3, 1, 8, 7, 2, 4};
    std::cout << "Original array: ";
    printVector(data);

    mergeSort(data, 0, data.size() - 1);

    std::cout << "Sorted array:   ";
    printVector(data);

    return 0;
}
```
### Output
```
Original array: 6 5 3 1 8 7 2 4 
Sorted array:   1 2 3 4 5 6 7 8 
```
### Code Breakdown

#### The `mergeSort()` Function: The Divider
This function is the engine that drives the **"Divide"** phase.
1.  **Base Case**: The recursion stops when `left >= right`, which means the sub-array has one or zero elements and is considered sorted.
2.  **Divide**: It calculates the `mid` point of the current array segment. The formula `left + (right - left) / 2` is used to prevent potential overflow on very large arrays.
3.  **Recurse**: It calls itself for the left half (`left` to `mid`) and then for the right half (`mid + 1` to `right`).
4.  **Conquer**: After the recursive calls return (guaranteeing the two halves are sorted), it calls `merge()` to combine them into a single, sorted segment.

#### The `merge()` Function: The Conqueror
This function is the core of the algorithm where the actual sorting happens. It takes two adjacent, sorted sub-arrays and merges them.
1.  **Setup**: It creates two temporary vectors, `L` and `R`, and copies the data from the two halves of the main array into them.
2.  **Merge Loop**: It uses three index pointers: `i` for vector `L`, `j` for vector `R`, and `k` for the original array `arr`. It compares the elements at `L[i]` and `R[j]` and places the smaller of the two into `arr[k]`. It then increments the pointer of the vector from which the element was taken, as well as the pointer `k`.
3.  **Copy Leftovers**: After the main loop, one of the temporary vectors might still contain elements. The final two `while` loops copy these remaining elements back into the main array, ensuring all data is merged correctly.

# 3. Understanding Quick Sort

### What is Quick Sort?

**Quick Sort** is a highly efficient, comparison-based sorting algorithm that also uses a **"divide and conquer"** strategy. It is one of the most widely used sorting algorithms due to its fast average-case performance. The core idea is to select a 'pivot' element from the array and partition the other elements into two sub-arrays, according to whether they are less than or greater than the pivot. The sub-arrays are then sorted recursively.

---

### How Does it Work?

The process can be broken down into three main steps:

1.  **Pivot Selection:** Choose an element from the array to be the pivot. Common strategies include picking the first element, the last element, the middle element, or a random element.

2.  **Partitioning:** Reorder the array so that all elements with values less than the pivot are on one side, and all elements with values greater than the pivot are on the other. This crucial step can be performed in several ways, with the two most common being:

    * **Lomuto Partition Scheme:** This is a very intuitive method. It typically uses the **last element** as the pivot. It then scans the array with a single pointer, maintaining a "wall" that separates smaller elements from the rest. After the scan, the pivot is swapped into its **final, sorted position**. This scheme is easy to implement but can be inefficient with already-sorted data.
        

    * **Hoare Partition Scheme:** This is the original method and is generally faster. It often uses the **first element** as the pivot. It uses **two pointers**—one at each end of the array—that move toward each other, swapping elements as they go. This method correctly separates the values, but the pivot itself does **not necessarily** end up in its final sorted position.

3.  **Recursion:** Recursively apply the above steps to the sub-arrays on both sides of the partition. This continues until the base case of an array with zero or one element is reached, at which point the entire array is sorted.

---

### Visualization Explanation

#### Lomuto Partition Scheme
[![Lomuto_animated (1).gif](https://learn.digilabdte.com/uploads/images/gallery/2025-09/lomuto-animated-1.gif)](https://learn.digilabdte.com/uploads/images/gallery/2025-09/lomuto-animated-1.gif)

The algorithm uses three key components, which are represented by colors in the animation:
* The **Pivot** (blue bar): The value everything else is compared against. In this scheme, it's the last element.
* The **Scanner** `j` (yellow highlight): This pointer moves through the array to inspect each element.
* The **Wall** `i` (green highlight): This pointer marks the boundary for elements that are smaller than the pivot.
##### The Process

Here's a step-by-step breakdown of the process shown in the GIF.
##### 1. The Setup
The process begins by selecting the **last element** of the array as the **Pivot** (making it blue). The **Wall** (`i`) is placed just before the first element, and the **Scanner** (`j`) starts at the first element.
##### 2. The Partitioning Loop
The **Scanner** (`j`) moves from left to right, one element at a time. For each element it inspects, one of two things happens:

* **If the element is LARGER than the Pivot:** The element is already on the "correct" side of where the pivot will eventually be. The algorithm does nothing but advance the **Scanner** `j` to the next element.

* **If the element is SMALLER than the Pivot:** This is the key action. The algorithm needs to move this smaller element to the left section of the array. It does this in two steps:
    1.  The **Wall** (`i`) is moved one position to the right to make room in the "smaller than pivot" section.
    2.  The smaller element at the **Scanner's** position (`j`) is **swapped** with the element now at the **Wall's** new position (`i`).

This loop continues until the **Scanner** has inspected every element except for the pivot itself.
##### 3. Final Pivot Placement
After the loop finishes, all elements smaller than the pivot are now grouped together on the left side of the **Wall**. The final step is to place the pivot in its correct sorted position.

This is done by **swapping** the **Pivot** (the blue bar) with the element that is immediately to the right of the **Wall** (`i+1`).

The result is a perfectly partitioned array. The pivot is now in its final place, with every element to its left being smaller and every element to its right being larger. This process is then repeated recursively on the sub-arrays to the left and right of the pivot.

#### Hoare Partition Scheme
[![quick_sort_hoare.gif](https://learn.digilabdte.com/uploads/images/gallery/2025-09/quick-sort-hoare.gif)](https://learn.digilabdte.com/uploads/images/gallery/2025-09/quick-sort-hoare.gif)

##### The Process

Here’s a breakdown of the process for any given array segment.

##### 1. The Setup
* **Pivot Selection**: An element is chosen as the **pivot**. In this animation, it appears to be the first element of the segment being sorted.
* **Two Pointers**: Two pointers are established: a **left pointer** `i` and a **right pointer** `j`, starting at opposite ends of the segment.

##### 2. The Partitioning Loop
This is the main action you see in the animation. The two pointers begin moving towards the center of the array to find elements that are on the wrong side of the pivot.

1.  The **left pointer `i`** moves to the right, skipping over all elements that are smaller than the pivot. It stops when it finds an element **larger than or equal to** the pivot.
2.  The **right pointer `j`** moves to the left, skipping over all elements that are larger than the pivot. It stops when it finds an element **smaller than or equal to** the pivot.
3.  **The Swap**: Once both pointers have stopped, if they haven't crossed each other, the two elements they are pointing at are **swapped**. This places both elements on their correct side of the partition.

This search-and-swap process repeats until the pointers cross.

##### 3. The Result
When the pointers cross, the loop terminates. The array segment is now partitioned into two groups:
* A left partition where all values are less than or equal to the pivot's value.
* A right partition where all values are greater than or equal to the pivot's value.
  
Crucially, unlike other methods, the pivot itself does **not** necessarily end up in its final sorted position. The algorithm then recursively repeats this entire process on the two new sub-arrays.

---
### Benefits of Quick Sort 

* **Fast on Average**: It has an average-case time complexity of **O(n log n)**, which is extremely fast in practice, often outperforming other sorting algorithms.
* **In-place Sorting**: It has a space complexity of **O(log n)** on average because it sorts the array "in-place," meaning it doesn't require a separate auxiliary array like Merge Sort.
* **Low Overhead**: The inner loop of the algorithm is very simple and can be implemented efficiently on most machine architectures.

---
### Drawbacks of Quick Sort 

* **Worst-Case Performance**: Its main weakness is a worst-case time complexity of **O(n^2)**. This occurs when the chosen pivots are consistently the smallest or largest elements, which can happen with already-sorted or reverse-sorted data.
* **Unstable Sort**: Quick Sort is **unstable**. It does not guarantee that the relative order of equal elements will be preserved after sorting.
* **Sensitive to Pivot Choice**: The efficiency of the algorithm is highly dependent on the pivot selection strategy. A poor choice, such as always taking the first or last element of already-sorted data, degrades it to the O(n^2) worst case, which is why randomized or median-of-three pivot selection is commonly used.

# 4. Lomuto Quick Sort in C++

This section explains how the "divide and conquer" strategy of Quick Sort is implemented in C++. The logic is split into two primary functions: `quickSort()` which handles the recursive division, and `partition()` which rearranges the array using the popular **Lomuto partition scheme**.

### The C++ Code Implementation

Here is the complete code for a Quick Sort implementation using `std::vector`.

```cpp
#include <iostream>
#include <vector>
#include <utility> // For std::swap

// Utility function to print a vector
void printVector(const std::vector<int>& arr) {
    for (int num : arr) {
        std::cout << num << " ";
    }
    std::cout << std::endl;
}

// This function implements the Lomuto partition scheme. It takes the last 
// element as pivot and places it at its correct position in the sorted array.
int partition(std::vector<int>& arr, int low, int high) {
    // Choose the last element as the pivot (key to Lomuto)
    int pivot = arr[high]; 
    
    // Index of the smaller element, acting as a "wall"
    int i = (low - 1); 

    for (int j = low; j <= high - 1; j++) {
        // If the current element is smaller than the pivot
        if (arr[j] < pivot) {
            i++; // Move the wall to the right
            std::swap(arr[i], arr[j]);
        }
    }
    // Place the pivot in its correct position after the wall
    std::swap(arr[i + 1], arr[high]);
    return (i + 1);
}

// The main recursive function that implements Quick Sort
void quickSort(std::vector<int>& arr, int low, int high) {
    // Base case: if the array has 1 or 0 elements, it's already sorted
    if (low < high) {
        // pi is the partitioning index from the Lomuto partition
        int pi = partition(arr, low, high);

        // Separately sort elements before and after the partition
        quickSort(arr, low, pi - 1);
        quickSort(arr, pi + 1, high);
    }
}

// Main driver function
int main() {
    std::vector<int> data = {6, 5, 3, 1, 8, 7, 2, 4};
    std::cout << "Original array: ";
    printVector(data);

    quickSort(data, 0, data.size() - 1);

    std::cout << "Sorted array:   ";
    printVector(data);

    return 0;
}
```
### Output
```
Original array: 6 5 3 1 8 7 2 4 
Sorted array:   1 2 3 4 5 6 7 8 
```
### Code Breakdown

#### The `quickSort()` Function: The Recursive Driver
This function orchestrates the **"Divide"** phase of the algorithm.
1.  **Base Case**: The recursion continues as long as `low < high`. If `low >= high`, it means the sub-array has one or zero elements, which is inherently sorted.
2.  **Partition**: It calls the `partition()` function. This function rearranges the sub-array and returns the index `pi` where the pivot element is now correctly placed.
3.  **Recurse**: It calls itself twice for the two sub-arrays created by the partition: one for the elements to the left of the pivot (`low` to `pi - 1`) and one for the elements to the right (`pi + 1` to `high`). Because this uses the Lomuto scheme, the pivot at `pi` is guaranteed to be in its final place and can be excluded from recursion.

#### The `partition()` Function (Lomuto Scheme)
This function implements the well-known **Lomuto partition scheme**. It is where the key logic of rearranging elements occurs and can be seen as the **"Conquer"** step, as it correctly places one element (the pivot) at a time.
1.  **Pivot Selection**: It selects the **last element** of the sub-array (`arr[high]`) as the **pivot**. This is the defining starting point for this version of Lomuto.
2.  **Rearranging Loop**: It maintains an index `i` (the "wall") which represents the boundary for elements smaller than the pivot. It iterates through the sub-array with another index `j` (the "scanner"). If it finds an element `arr[j]` that is smaller than the pivot, it moves the wall (`i++`) and swaps `arr[j]` with the element at the new wall position, effectively expanding the "smaller than pivot" section.
3.  **Final Pivot Placement**: After the loop finishes, all elements smaller than the pivot are at the beginning of the sub-array. The pivot (still at `arr[high]`) is then swapped with the element at `arr[i + 1]`. This places the pivot exactly where it belongs in the sorted array.
4.  **Return Index**: The function returns the new index of the pivot (`i + 1`), so the `quickSort` function knows where to split the array for its next recursive calls.

# 5. Hoare Quick Sort in C++

This section explains how the "divide and conquer" strategy of Quick Sort is implemented in C++. The logic is split into two primary functions: `quickSort()` which handles the recursive division, and `partition()` which rearranges the array using the classic **Hoare partition scheme**.

### The C++ Code Implementation

Here is the complete code for a Quick Sort implementation using the Hoare partition scheme.

```cpp
#include <iostream>
#include <vector>
#include <utility> // For std::swap

// Utility function to print a vector
void printVector(const std::vector<int>& arr) {
    for (int num : arr) {
        std::cout << num << " ";
    }
    std::cout << std::endl;
}

// This function implements the Hoare partition scheme. It partitions the
// array and returns the split point.
int partition(std::vector<int>& arr, int low, int high) {
    // Choose the first element as the pivot (key to this Hoare implementation)
    int pivot = arr[low];
    int i = low - 1;
    int j = high + 1;

    while (true) {
        // Find an element on the left side that should be on the right
        do {
            i++;
        } while (arr[i] < pivot);

        // Find an element on the right side that should be on the left
        do {
            j--;
        } while (arr[j] > pivot);

        // If the pointers have crossed, the partition is done
        if (i >= j) {
            return j;
        }

        // Swap the elements to place them on the correct side
        std::swap(arr[i], arr[j]);
    }
}

// The main recursive function that implements Quick Sort
void quickSort(std::vector<int>& arr, int low, int high) {
    // Base case: if the array has 1 or 0 elements, it's already sorted
    if (low < high) {
        // pi is the split point from the Hoare partition
        int pi = partition(arr, low, high);

        // Recursively sort the two partitions
        quickSort(arr, low, pi);
        quickSort(arr, pi + 1, high);
    }
}

// Main driver function
int main() {
    std::vector<int> data = {6, 5, 3, 1, 8, 7, 2, 4};
    std::cout << "Original array: ";
    printVector(data);

    quickSort(data, 0, data.size() - 1);

    std::cout << "Sorted array:   ";
    printVector(data);

    return 0;
}
```

### Output
```
Original array: 6 5 3 1 8 7 2 4 
Sorted array:   1 2 3 4 5 6 7 8 
```

### Code Breakdown

#### The `quickSort()` Function: The Recursive Driver
This function orchestrates the **"Divide"** phase of the algorithm.
1.  **Base Case**: The recursion continues as long as `low < high`. If `low >= high`, it means the sub-array has one or zero elements.
2.  **Partition**: It calls the `partition()` function, which rearranges the sub-array and returns a **split point index** `pi`.
3.  **Recurse**: It calls itself on the two partitions. Because the Hoare scheme's split point `pi` is not guaranteed to be the pivot's final position, the recursive calls are `quickSort(arr, low, pi)` and `quickSort(arr, pi + 1, high)` to ensure all elements are included in the sorting process.

#### The `partition()` Function (Hoare Scheme)
This function implements the classic **Hoare partition scheme**. It is generally more efficient than Lomuto because it performs fewer swaps on average.
1.  **Pivot Selection**: It selects the **first element** (`arr[low]`) as the pivot.
2.  **Two Pointers**: It uses two pointers, `i` and `j`, which start at opposite ends of the array and move toward each other.
3.  **Rearranging Loop**: The `i` pointer moves right to find the first element greater than or equal to the pivot, while the `j` pointer moves left to find the first element less than or equal to the pivot. If the pointers haven't crossed, the two elements are swapped.
4.  **Return Index**: The loop terminates when the pointers cross. The function returns the final position of the `j` pointer, which serves as the split point for the two partitions. Unlike Lomuto, the pivot is **not** guaranteed to be at this returned index.

# Module 5 - Advanced Sorting Algorithms & The Heap Data Structure

# 1. Motivation: The Quest for the Ideal Sorting Algorithm

In our study of advanced sorting algorithms, we have explored powerful techniques that offer significant performance improvements over basic O(n²) methods. However, these advanced algorithms often come with their own set of trade-offs. Let's recap two of the most prominent examples: Merge Sort and Quick Sort.

#### **A Recap of Trade-offs**

**Merge Sort**

- **Strength:** Its greatest advantage is its **guaranteed Θ(n log n) performance**. This efficiency holds in the worst, average, and best cases alike, making it exceptionally reliable and predictable.
- **Weakness:** It is not an in-place algorithm. Merge Sort requires auxiliary memory proportional to the size of the input array, giving it a space complexity of **O(n)**. This can be a significant drawback when memory is limited.

**Quick Sort**

- **Strength:** It is an **in-place** algorithm, requiring only a small amount of extra space on the recursion stack (**O(log n)** on average). In the average case, it is often faster in practice than other O(n log n) algorithms due to its low constant factors.
- **Weakness:** Its primary disadvantage is its worst-case performance. If the pivot selection is poor (e.g., on an already sorted array), performance degrades to a slow **Θ(n²)**, which is no better than basic sorting methods like Bubble Sort.

#### **The Key Question**

This analysis of trade-offs leads to a crucial question: **Is there an algorithm that offers the "best of both worlds"?** Can we find a sorting method that combines:

1. The guaranteed **Θ(n log n)** worst-case performance of Merge Sort.
2. The **O(1)** space efficiency (in-place nature) of Quick Sort.

#### **The Answer: Heap Sort**

The answer is **yes**, and one of the most classic algorithms that achieves this powerful combination is **Heap Sort**. It stands as a testament to how the right choice of an underlying data structure can lead to an algorithm with an excellent performance profile.

To fully understand how Heap Sort achieves this, we must first dive into the data structure that powers it: the **Heap**. The following sub-modules will build our understanding from the ground up, starting with the basic concepts of trees and leading to the full implementation and analysis of Heap Sort.

# 2. Prerequisite: An Introduction to the Tree Data Structure

Before we can understand the Heap, we must first be familiar with its underlying structure: the Tree. In computer science, a **tree** is a widely used data structure that simulates a hierarchical structure with a set of connected nodes.

#### **Core Components**

Every tree is composed of a few fundamental components:

- **Node:** The primary entity in a tree that contains data. It is also known as a "vertex".
- **Edge:** A connection or link between two nodes.
- **Root:** The topmost node in a tree. It is the only node that does not have a parent.

Consider the tree structure below:

- Nodes: {A, B, C, D, E}
- Edges: {(A-B), (A-C), (B-D), (B-E)}
- Root: Node A

#### **Hierarchical Terminology**

The relationships between nodes in a tree are described using terminology borrowed from family trees:

- **Parent:** A node that is directly above another node and connected to it. In our example, A is the parent of B and C.
- **Child:** A node that is directly below another node and connected to it. In our example, D and E are children of B.
- **Leaf:** A node that has no children. In our example, C, D, and E are leaf nodes.

#### **Structure and Depth**

We can measure the structure of a tree in several ways:

- **Level:** The level of a node is defined by its distance from the root. The root is at **Level 0**. Its children are at Level 1, their children are at Level 2, and so on.
- **Depth/Height:** The depth of a node is its distance from the root, while the height of a tree is the number of edges on the longest path from the root to a leaf. Our example tree has a height of 2.

#### **Focus on the Binary Tree**

While a node in a tree can have any number of children, our focus for understanding heaps will be on a specific type: the **Binary Tree**.

A **Binary Tree** is a tree data structure in which each node has **at most two children**, which are referred to as the left child and the right child.

The reason we focus on Binary Trees is that the **Heap** data structure, which is the engine of Heap Sort, is a specialized type of Binary Tree. Mastering this concept is a fundamental step toward understanding heaps.

# 3. The Heap Data Structure

Now that we understand the concept of a binary tree, we can define a Heap. A **Heap** is a specialized tree-based data structure that satisfies two specific properties.

#### **The Two Defining Properties of a Heap**

For a binary tree to be considered a Heap, it must adhere to the following rules:

1. **Structure Property: It must be an Essentially Complete Binary Tree.**
    This means that the tree is completely filled on all levels, with the possible exception of the last level, which must be filled from **left to right** without any gaps. This "no gaps" property is crucial, as it allows a heap to be stored efficiently in an array.
2. **Heap Property (Order Property):** The nodes must be ordered in a specific way relative to their children. This ordering defines the type of heap.

#### **Types of Heaps**

There are two main types of heaps, distinguished by the Heap Property they enforce:

- **Max-Heap:** The value of every node is **greater than or equal to** the value of each of its children.
    - **Implication:** In a Max-Heap, the **largest element** in the entire collection is always located at the **root** of the tree. This is the type of heap used for Heap Sort.
    - Example: A tree with root 10 and children 8 and 9 is a valid Max-Heap at the top level.
- **Min-Heap:** The value of every node is **less than or equal to** the value of each of its children.
    - **Implication:** In a Min-Heap, the **smallest element** is always located at the **root**. This structure is commonly used to implement Priority Queues.
    - Example: A tree with root 2 and children 5 and 3 is a valid Min-Heap at the top level.

#### **The Array Representation**

The complete binary tree property is precisely what makes an array a perfect and highly efficient way to store a heap. The hierarchical relationships are not stored with pointers, but are instead calculated mathematically from an element's index.

For an element at a **zero-based index `i`** in an array representing a heap:

- The index of its **parent** is calculated as: `floor((i - 1) / 2)`
- The index of its **left child** is calculated as: `2 * i + 1`
- The index of its **right child** is calculated as: `2 * i + 2`

For example, for the node at index `i = 3`:

- Its parent is at index `floor((3 - 1) / 2) = 1`.
- Its left child would be at index `2 * 3 + 1 = 7`.

# 4. Core Operations and the Heap Sort Algorithm

The Heap Sort algorithm is a two-phase process that masterfully uses the properties of the Max-Heap. Both phases rely on a core "helper" operation that maintains the heap property.

#### **The Engine of the Heap: The `siftdown` Operation**

To build and maintain a heap, we need a procedure to fix any violations of the heap property. The primary operation used in Heap Sort is `siftdown` (also known as heapify-down).

- **`siftdown` (or Heapify-Down):** This operation is used when a node's value is **smaller than** one of its children, violating the Max-Heap property. The `siftdown` process corrects this by repeatedly swapping the parent with its **largest child**, effectively "sinking" the smaller element down the tree until the heap property is restored for that subtree. The time complexity of a single `siftdown` operation is proportional to the height of the tree, making it **O(log n)**.
- **`siftup` (or Heapify-Up):** While less critical for our Heap Sort implementation, this complementary operation is used when a newly inserted node is **larger than** its parent. It "bubbles" the element up the tree by swapping it with its parent until the heap property is restored. This is the key operation used when adding elements to a `std::priority_queue`.

With the `siftdown` operation as our main tool, we can now construct the two phases of Heap Sort.

#### **Phase 1: `makeheap` (The Heapify Process)**

The first step is to convert the unsorted input array into a valid Max-Heap.

- **Goal:** Rearrange the elements of the array so that they satisfy the Max-Heap property.
- **Method (Bottom-Up):** The most efficient way to do this is with a bottom-up approach. We treat the entire array as a complete binary tree and then iteratively fix it. The process starts from the **last non-leaf node** and moves upward toward the root. For each node, we call `siftdown` to ensure its subtree is a valid Max-Heap. By the time we reach the root, the entire array is guaranteed to be a Max-Heap.
- **Analysis:** While it involves multiple calls to `siftdown` (an O(log n) operation), a tight analysis shows that the `makeheap` phase can be completed in **O(n) linear time**, which is remarkably efficient.

#### **Phase 2: The Sorting Process**

Once the array is a Max-Heap, the largest element is at the root (`array[0]`). The sorting phase systematically extracts this largest element and places it in its correct final position.

This is done through a repeated process:

1. **Swap:** Swap the root element (`array[0]`, the current maximum) with the **last element** in the heap portion of the array. The largest element is now in its final, sorted position at the end of the array.
2. **Shrink:** The effective size of the heap is reduced by one, "locking in" the sorted element at the end so it is no longer considered part of the heap.
3. **Repair:** The new root element (which was previously the last element) likely violates the Max-Heap property. Call `siftdown` on the root (`array[0]`) to repair the heap, ensuring the next largest element rises to the top.

This cycle is repeated `n-1` times, until the entire array is sorted.

#### **Complexity Analysis of Heap Sort**

- **Time Complexity:**
    - Phase 1 (`makeheap`): **O(n)**
    - Phase 2 (Sorting): Consists of `n-1` calls to `siftdown`, each taking **O(log n)** time. The total time for this phase is **O(n log n)**.
    - The overall complexity is dominated by the sorting phase, giving Heap Sort a guaranteed **Θ(n log n)** time complexity in the worst, average, and best cases.
- **Space Complexity:**
    - The algorithm operates directly on the input array, swapping elements within it. It requires no significant extra storage. Therefore, Heap Sort is an **in-place** algorithm with a space complexity of **O(1)**.

# 5. The Heap in Practice: The C++ Standard Template Library (STL)

While understanding how to build Heap Sort from scratch is crucial for algorithmic knowledge, in modern C++ we often leverage the powerful abstractions provided by the Standard Template Library (STL). The concepts of a heap are primarily exposed through `std::priority_queue`.

#### **`std::priority_queue`: A Ready-to-Use Heap**

The C++ STL provides a container adapter called `std::priority_queue` that is an implementation of a Heap.

- **Default Behavior:** By default, `std::priority_queue<int>` is a **Max-Heap**. This means that when you call `top()`, it returns the largest element, and when you call `pop()`, it removes the largest element while maintaining the heap property.
- **Core Operations:**
    - `push()`: Adds an element to the queue (heap). This internally performs a sift-up operation. Complexity: **O(log n)**.
    - `pop()`: Removes the top element. Complexity: **O(log n)**.
    - `top()`: Returns a reference to the top element without removing it. Complexity: **O(1)**.

#### **Working with Custom Data Types**

What happens if we want to create a priority queue of custom objects, like a `struct` for patients in a hospital?

```c++
struct Patient {
    std::string name;
    int triage_level; // Level 1 is highest priority
};

// This will cause a compile error!
std::priority_queue<Patient> er_queue; 
```

The compiler doesn't know how to compare two `Patient` objects. To solve this, we must provide our own custom comparison logic.

#### **Custom Comparators: Functors and Lambda Expressions**

We can tell `std::priority_queue` how to order our custom types using a **custom comparator**. There are two common ways to do this in modern C++:

1. **Functor (Function Object):** A `struct` or `class` that overloads the function call operator `operator()`. The `priority_queue` will create an object of this type and use its `operator()` to compare elements. This is a powerful, stateful way to define comparison logic.
    
    ```c++
    struct ComparePatients {
        bool operator()(const Patient& a, const Patient& b) {
            // Because priority_queue is a Max-Heap, it puts the "larger"
            // element on top. We want level 1 to be the highest priority,
            // so we tell the queue that 'a' is "less than" 'b' if its
            // triage level is numerically greater.
            return a.triage_level > b.triage_level;
        }
    };
    
    // Usage:
    std::priority_queue<Patient, std::vector<Patient>, ComparePatients> er_queue;
    ```
2. **Lambda Expression:** An inline, anonymous function that can be defined on the spot. Lambdas are often more concise and readable for simple, stateless comparison logic. They are commonly used with algorithms like `std::sort`.

    ```c++
    auto compare_lambda = [](const Patient& a, const Patient& b) {
        return a.triage_level > b.triage_level;
    };
    
    // Usage with priority_queue is slightly more verbose
    std::priority_queue<Patient, std::vector<Patient>, decltype(compare_lambda)> er_queue(compare_lambda);
    ```

By providing these comparators, we can leverage the highly optimized and safe implementation of a heap provided by the STL for any data type we need.

# Module 6 - Tree

# 1. Introduction

#### <span style="color:yellow;">**Author:** YP </span>

## **Learning Objectives**

After completing this module, students are expected to be able to:
1. Explain the basic concept of a **Tree** as a non-linear data structure.
2. Identify the differences between **root, parent, child, leaf, and subtree**.
3. Implement **Traversal** methods: Inorder, Preorder, Postorder, and Level Order.
4. Understand the role of **Stack** and **Queue** in traversal algorithms.
5. Apply the concept of Trees to **real-world cases in daily life**.

# 2. Tree Concept

## **2.1 Basic Theory**

### 2.1.1 Definition of a Tree

![image](https://hackmd.io/_uploads/B1FBSkOsxl.png)

A **Tree** is a **non-linear hierarchical** data structure composed of nodes and edges.

- Every Tree has a **root** (root node).
- The root can have **children**.
- A node with no children is called a **leaf**.
- Each node and its descendants can form a **subtree**.

### 2.1.2 Important Terms in a Tree
![image](https://hackmd.io/_uploads/S1UsSy_iel.png)
- **Root** → the topmost node.
- **Parent** → a node that has children.
- **Child** → a node derived from a parent.
- **Leaf** → a node without children.
- **Subtree** → a small tree formed from a node and its descendants.

### 2.1.3 Binary Tree

![image](https://hackmd.io/_uploads/Bkc9h-7Fge.png)

A **Binary Tree** is a tree data structure in which each node has at most two children (left and right).
- **There are no special rules** for placing node values.
- **Used** for hierarchical representation, data structures like heaps, expression trees, and implementing search or traversal algorithms.
- Nodes in a Binary Tree can have 0, 1, or 2 children.

**Characteristics of a Binary Tree:**
- Height of the tree: the number of levels from the root to the farthest leaf.
- Leaf node: a node that has no children.

---

## 2.2 Introduction to Stack and Queue

### **2.2.1 Stack**
![image](https://hackmd.io/_uploads/HJWpi1_sll.png)

 A LIFO (Last In First Out) data structure → the last data in is the first one out. Example: a stack of plates in a cafeteria. The last plate placed on top will be the first one taken.

 Main operations:
 - push(x) → adds data to the top of the stack.
 - pop() → removes data from the top of the stack.
 
 **Application in Trees**

 Used in DFS (Depth First Search) and in the Preorder, Inorder, and Postorder traversals. When we descend into a child node, the current node is stored on the stack; after the child is finished, we pop from the stack to continue.
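This idea can be sketched with an iterative Preorder traversal that uses an explicit `std::stack` in place of the recursion's call stack. The `Node` struct and the function name below are illustrative assumptions matching the traversal examples later in this module:

```cpp
#include <stack>
#include <vector>

struct Node {
    int data;
    Node *left, *right;
    Node(int d) : data(d), left(nullptr), right(nullptr) {}
};

// Iterative Preorder (Root - Left - Right) using an explicit stack.
std::vector<int> preorderIterative(Node* root) {
    std::vector<int> out;
    std::stack<Node*> s;
    if (root) s.push(root);
    while (!s.empty()) {
        Node* cur = s.top();
        s.pop();
        out.push_back(cur->data);
        // Push the right child first so the left child is popped
        // (and therefore visited) first: the stack is LIFO.
        if (cur->right) s.push(cur->right);
        if (cur->left) s.push(cur->left);
    }
    return out;
}
```

Each `pop` returns us to the most recently postponed node, exactly the role the call stack plays in the recursive version.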

### **2.2.2 Queue**
![image](https://hackmd.io/_uploads/B1vhoydsxe.png)

 A FIFO (First In First Out) data structure → the first data in is the first one out. Example: a minimarket cashier line, the person who arrives first will be served first.

 Main operations:
 - enqueue(x) → inserts data at the back of the queue.
 - dequeue() → removes data from the front of the queue.

 **Application in Trees**

 Used in BFS (Breadth First Search), i.e. Level Order traversal. When visiting a node, its children are added to the queue and then processed one by one in order.

---

## 2.3 Traversal in Trees

Traversal is the process of visiting each node in a tree.

### **2.3.1 Inorder (Left, Root, Right)**
- Visit the left subtree, then the root, then the right subtree. For a Binary Search Tree, this yields the values in ascending order.

```cpp
// Inorder (Left - Root - Right)
void inorder(Node* root) {
   if (root == nullptr) return;
   inorder(root->left);
   cout << root->data << " ";
   inorder(root->right);
}
```

### **2.3.2 Preorder (Root, Left, Right)**

- Visit the root first, then the left subtree, then the right subtree. Useful for copying a tree or representing its structure.

```cpp
// Preorder (Root - Left - Right)
void preorder(Node* root) {
   if (root == nullptr) return;
   cout << root->data << " ";
   preorder(root->left);
   preorder(root->right);
}
```

### **2.3.3 Postorder (Left, Right, Root)**

- Visit the left subtree first, then the right subtree, and finally the root. Suitable for deleting an entire tree, since children are processed before their parent.

```cpp
// Postorder (Left - Right - Root)
void postorder(Node* root) {
   if (root == nullptr) return;
   postorder(root->left);
   postorder(root->right);
   cout << root->data << " ";
}
```

### **2.3.4 Level Order**

- Visit nodes level by level from top to bottom, left to right.
- Usually uses a queue.

```cpp
void levelOrder(Node* root) {
   if (root == nullptr) return;
   queue<Node*> q;
   q.push(root);
   while (!q.empty()) {
      Node* temp = q.front();
      q.pop();
      cout << temp->data << " ";
      if (temp->left) q.push(temp->left);
      if (temp->right) q.push(temp->right);
   }
}
```

### **2.3.5 Traversal Illustration Example**

```
     50
    /  \
   /    \
  30     70
 / \    / \
20 40  60 80
```

- **Inorder:** 20, 30, 40, 50, 60, 70, 80
- **Preorder:** 50, 30, 20, 40, 70, 60, 80
- **Postorder:** 20, 40, 30, 60, 80, 70, 50
- **Level Order:** 50, 30, 70, 20, 40, 60, 80
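These sequences can be checked with a short sketch that builds the illustrated tree and collects each traversal into a vector. The helpers `buildExample`, `inorderVec`, and `levelOrderVec` are illustrative names, not part of the module's official code:

```cpp
#include <queue>
#include <vector>

struct Node {
    int data;
    Node *left, *right;
    Node(int d) : data(d), left(nullptr), right(nullptr) {}
};

// Inorder (Left - Root - Right), collected into a vector.
void inorderVec(Node* r, std::vector<int>& out) {
    if (!r) return;
    inorderVec(r->left, out);
    out.push_back(r->data);
    inorderVec(r->right, out);
}

// Level Order using a queue, collected into a vector.
std::vector<int> levelOrderVec(Node* root) {
    std::vector<int> out;
    std::queue<Node*> q;
    if (root) q.push(root);
    while (!q.empty()) {
        Node* t = q.front();
        q.pop();
        out.push_back(t->data);
        if (t->left) q.push(t->left);
        if (t->right) q.push(t->right);
    }
    return out;
}

// Build the tree from the illustration: root 50, children 30 and 70,
// grandchildren 20, 40, 60, 80.
Node* buildExample() {
    Node* root = new Node(50);
    root->left = new Node(30);
    root->right = new Node(70);
    root->left->left = new Node(20);
    root->left->right = new Node(40);
    root->right->left = new Node(60);
    root->right->right = new Node(80);
    return root;
}
```

Running `inorderVec` on this tree produces `20 30 40 50 60 70 80`, matching the Inorder sequence listed above.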

---

## **References**
- GeeksforGeeks, “Binary Search Tree (BST),” [Online]. Available: https://www.geeksforgeeks.org/binary-search-tree-data-structure/. [Accessed: 20-August-2025]
- Programiz, “Tree Data Structure in C++,” [Online]. Available: https://www.programiz.com/dsa/binary-search-tree. [Accessed: 20-August-2025]

# Module 7 - Hash Map

# 1. Main Concept of Hash Map

**Hashing** is the process of converting data of any size (like a string, number, or object) into a fixed-length integer value. This integer value is called a **"hash value,"** "hash code," or simply "hash."

The primary data structure that uses this concept is called a **Hash Table**.

## 1.1. Components of a Hash System

A hash-based search system consists of three main components:

- **Hash Table:** This is fundamentally an array (or a `std::vector`). Each "row" or "slot" in this array is called a **bucket**. The size of this table (`m`) is a key factor in its performance.
- **Key:** This is the data you want to store or look up. The key is fed into the hash function to find its proper location. For example, in a phone book, the ***name*** is the key.
- **Hash Function:** This is the "engine" or mathematical function that takes your **key** and computes an **index** (a row number from `0` to `m-1`) within the array where your data should be stored.

The main goal of hashing is to achieve extremely fast data access. In the ideal case, the hash function provides a unique index for every key, allowing insertion, search, and deletion operations to be performed in constant time, or **`O(1)`**.

#### Hash Map Analogy:
Imagine a large file cabinet (**Hash Table**) with 100 drawers numbered 0-99. To store a document for "Budi," you don't search one by one. You use a rule (**Hash Function**), for example, "Take the first letter ('B' -> 2), store it in drawer #2." When you need to find "Budi," you compute the same rule and immediately jump to drawer #2.


## 1.2. Good Hash Function Properties

The **Hash Function** is the most critical part of the Hash Table. A good function must have the following properties:
- **Deterministic:** The same input (key) must *always* produce the same output (hash value).
- **Efficient:** The hash function must be fast to compute (ideally `O(1)`).
- **Uniform Distribution:** This is the most important. The hash function must spread keys as evenly as possible across all slots (indices) in the table. This minimizes collisions.

If the hash function doesn't distribute data well (e.g., it hashes all names starting with 'A', 'B', and 'C' to the same slot), it will cause **collisions**, and the performance will degrade significantly.

## 1.3. Common Hash Function Methods

There are several methods to design a hash function, each with different properties. The goal is always to create a **"well-distributed" hash** that avoids clustering keys in the same buckets.

### 1.3.1. Division Method

This is the most common and simplest method. It takes the key (which must be a numeric value or convertible to one) and finds the remainder after dividing by the table size.

- **Formula:** `h(key) = key % tableSize`
- **Pros:** It is extremely fast, involving only a single modular arithmetic operation.
- **Cons:** The performance is highly dependent on the choice of `tableSize`. If `tableSize` is poorly chosen (e.g., a power of 10 or 2), and the input keys have a pattern (e.g., all keys are even), this can lead to massive collisions. This is why it is strongly recommended to make `tableSize` a **prime number** that is not close to a power of 2.
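A minimal sketch of the division method (the function name is illustrative):

```cpp
// Division method: h(key) = key % tableSize.
// tableSize is ideally a prime not close to a power of 2.
int divisionHash(int key, int tableSize) {
    return key % tableSize;
}
```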

### 1.3.2. Multiplication Method

This method is a popular choice because it is far less sensitive to the `tableSize` (which can be any value, often a power of 2 for efficiency).

- **Formula:** `h(key) = floor( tableSize * ( (key * A) % 1 ) )`
- **Explanation:**
    - **`key * A`:** The key is multiplied by a constant A (where 0 < A < 1). A common and effective value is A = (√5 − 1)/2 ≈ 0.618, the fractional part of the golden ratio.
    - **`( ... ) % 1:`** This operation extracts only the fractional part of the result. This fractional part is highly mixed and influenced by all digits of the key.
    - **`tableSize * ( ... )`:** This fractional part is then scaled up to the range of the table.
    - **`floor( ... )`:** The integer part is taken as the final index.
- **Pros:** This method effectively scrambles the input bits, meaning that similar keys (like 100 and 101) are likely to be mapped to very different indices. The choice of `tableSize` is not critical.

### 1.3.3. Mid-Square Method

This method is particularly effective at breaking up patterns in keys that are sequential or have similar prefixes/suffixes.

- **Concept:** The key is first squared. The resulting number is often very large. A set of digits is then extracted from the middle of this squared result to be used as the hash index.
- **Why it works:** The middle digits of a squared number are influenced by all the digits of the original key. Changes in either the beginning or end of the key will cause significant changes in the middle digits, effectively "folding" the key in on itself and scrambling any existing patterns.
- **Pros:** Good at breaking up non-random sequences in input keys.
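A minimal sketch, assuming we square the key and keep four "middle" digits (the exact digit positions extracted vary between implementations):

```cpp
// Mid-square method: square the key, extract middle digits,
// then reduce to the table range.
int midSquareHash(int key, int tableSize) {
    long long squared = (long long)key * key;
    // Drop the last 2 digits, then keep the next 4 digits as the "middle".
    int middle = (int)((squared / 100) % 10000);
    return middle % tableSize;
}
```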

### 1.3.4. Folding Method

This method is excellent for hashing keys that are very long, such as account numbers, ISBNs, or long strings, where other methods might lead to overflow or only use a portion of the key.

- **Concept:** The key is divided into several equally-sized parts. These parts are then combined using a simple operation.
- **Operations:**
    - **Addition:** All parts are added together. The final result may need an extra modulo (`% tableSize`) if it exceeds the table size.
    - **XOR:** All parts are combined using the bitwise XOR operation. This is very fast and naturally stays within bounds if the parts are sized correctly.
- **Pros:** It ensures that all parts of a long key (beginning, middle, and end) contribute to the final hash value, making it highly distributive for long, structured keys.
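A minimal sketch of the addition variant, assuming the key is split into 3-digit chunks:

```cpp
// Folding method (addition variant): split the key into 3-digit
// chunks, add them together, then reduce modulo the table size.
int foldingHash(long long key, int tableSize) {
    int sum = 0;
    while (key > 0) {
        sum += (int)(key % 1000); // take the lowest 3-digit chunk
        key /= 1000;              // shift to the next chunk
    }
    return sum % tableSize;
}
```

For example, the key 123456789 folds into the chunks 123, 456, and 789, whose sum is 1368.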

# 2. Collision Handling

A **Collision** occurs when two or more different keys produce **the same hash value** (index). For example, "Budi" and "Dina" are both hashed to **row #5**. We can't overwrite Budi's data. We must have a strategy to handle this.

## 2.1. Separate Chaining

![](https://learn.digilabdte.com/uploads/images/gallery/2025-11/image-1762272337304.png)

This is the most common strategy to avoid collisions.

* **Concept:** Instead of each table row storing a single value, each row stores a pointer to another data structure, usually a **Linked List**.
* **How it Works:**
    1.  "Budi" hashes to row 5. We create a linked list at row 5 and add "Budi".
    2.  "Dina" hashes to row 5. We add "Dina" to the linked list that already exists at row 5.
    3.  Row 5 now contains: `[Budi] -> [Dina] -> NULL`
* **Search:** To find "Dina," we compute its hash (5), go to row 5, and then traverse the linked list at that row until we find "Dina." The worst-case search time becomes `O(k)` where `k` is the number of elements in the chain.

### 2.1.1. Manual Implementation: Insertion

The `insert` function handles adding a new key-value pair. Its logic is:
1.  Hash the `key` to find the correct bucket `index`.
2.  Get the linked list (the chain) at that `index`.
3.  Traverse the list:
    * If the `key` is already in the list, **update** its `value` and stop.
    * If the end of the list is reached and the `key` was not found, **add** the new key-value pair to the end of the list.

```cpp
void insert(std::string key, int value) {
    // Hash the key to get the bucket index
    int index = hashFunction(key);

    // Get the chain (linked list) at that index
    std::list<KeyValuePair>& chain = table[index];

    // Traverse the chain
    for (auto& pair : chain) {
        if (pair.key == key) {
            // Key found: update the value and return
            pair.value = value;
            return;
        }
    }

    // Key not found: add new pair to the end of the list
    chain.push_back(KeyValuePair(key, value));
}
```

### 2.1.2. Manual Implementation: Searching

The `search` function finds the value associated with a key. Its logic is:
1. Hash the `key` to find the bucket `index`.
2. Get the linked list at that `index`.
3. Traverse the list:
    * If the `key` is found, return its `value`.
    * If the end of the list is reached and the `key` was not found, return a sentinel value (like -1) or throw an exception.

```cpp
int search(std::string key) {
    // Hash the key to get the bucket index
    int index = hashFunction(key);

    // Get the chain at that index
    std::list<KeyValuePair>& chain = table[index];

    // Traverse the chain
    for (auto& pair : chain) {
        if (pair.key == key) {
            // Key found: return the value
            return pair.value;
        }
    }

    // Key not found
    return -1; 
}
```

### 2.1.3. Manual Implementation: Deletion

The remove function deletes a key-value pair. Its logic is:
1. Hash the `key` to find the bucket `index`.
2. Get the linked list at that `index`.
3. Traverse the list using an **iterator**. We must use an iterator because we need to know the position of the element to delete it.
4. If the `key` is found:
    * Call the list's `erase()` method, passing the iterator.
    * Stop and return.

```cpp
void remove(std::string key) {
    // Hash the key to get the bucket index
    int index = hashFunction(key);

    // Get a reference to the chain
    auto& chain = table[index];
    
    // Traverse the chain using an iterator
    for (auto it = chain.begin(); it != chain.end(); ++it) {
        // 'it' is an iterator (position pointer)
        // 'it->key' accesses the key of the element at that position
        if (it->key == key) {
            // Key found: erase the element at this position
            chain.erase(it); 
            return;
        }
    }
    // If loop finishes, key was not found. Do nothing.
}
```

## 2.2. Open Addressing (Probing)

This is an alternative strategy where all data is stored within the table itself. No linked lists are used.
- **Concept:** If a collision occurs (the slot is full), we "probe" for the next empty slot according to a specific rule and place the item there.
- **Types of Probing:**
    1. **Linear Probing:** This is the simplest rule. If slot `h(key)` is full, try `h(key) + 1`, then `h(key) + 2`, `h(key) + 3`, and so on, wrapping around the table if necessary. The **problem** with this approach is that it tends to **create clusters of occupied slots** (primary clustering), which degrades performance as the table fills up.
    2. **Quadratic Probing:** If slot `h(key)` is full, try `h(key) + 1^2`, then `h(key) + 2^2`, `h(key) + 3^2`, and so on. This helps spread out elements better than linear probing.
    3. **Double Hashing:** Uses a second hash function to determine the "step size" for probing, which is the most effective at avoiding clusters.
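A minimal sketch of a double-hashing probe sequence. The secondary function `1 + key % (tableSize - 1)` is one common textbook choice, not the only option; the `+ 1` guarantees the step size is never zero:

```cpp
// Double hashing: the i-th probe position for a numeric key.
// h1 picks the starting slot, h2 picks the per-key step size.
int doubleHashProbe(int key, int i, int tableSize) {
    int h1 = key % tableSize;
    int h2 = 1 + (key % (tableSize - 1)); // step size, never 0
    return (h1 + i * h2) % tableSize;
}
```

Because the step size depends on the key itself, two keys that collide at `h1` still follow different probe sequences, which is why this method avoids the clustering seen with linear probing.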

# 3. Load Factor and Rehashing

## 3.1. Load Factor (λ)
        
The **Load Factor (λ)** is a measure of how full the hash table is. It is a critical metric for performance.
- **Formula:** `λ = n/m`
    - `n` = total number of items stored in the table.
    - `m` = total size of the hash table (number of buckets).
- **Impact on Performance:**
    - **Separate Chaining:** λ can be greater than 1. A higher λ means longer linked lists; since the average chain length is λ, search time degrades toward `O(1 + λ)` on average, or `O(n)` in the worst case.
    - **Open Addressing:** λ cannot exceed 1. As λ approaches 1, finding an empty slot becomes extremely difficult and slow, and performance grinds to a halt.

## 3.2. Rehashing

To maintain good performance (close to `O(1)`), we must keep the load factor low. When the load factor exceeds a certain threshold (e.g., **λ > 0.75**), the table is "rehashed." 
        
**Rehashing** is the process of creating a new, larger hash table and re-inserting all existing elements into it. Here is the **process** of rehashing:
- **Trigger:** The load factor exceeds a predefined threshold (e.g., 0.75).
- **Allocate:** Create a new, larger hash table (e.g., `2 * m`, or the next prime number).
- **Re-insert:** Iterate through **every element** in the old table.
- **Re-hash**: For each element, compute its **new hash value** based on the **new, larger table size**.
- **Insert**: Place the element in its new correct slot in the new table.
- **Deallocate:** Free the memory of the old table.

Rehashing is an `O(n)` operation. It is slow and computationally expensive, but it happens infrequently. Its cost is "amortized" over many `O(1)` insertions, so the average insertion time remains `O(1)`.
    
## 3.3 Rehashing Implementation
```cpp
// We must track item count 'n' and table size 'm'

int n = 0; // Number of items
int m = 10; // Initial table size
std::vector<std::list<KeyValuePair>> table;
const float MAX_LOAD_FACTOR = 0.75;

void rehash() {
    // Get old table and size
    std::vector<std::list<KeyValuePair>> oldTable = table;
    int oldSize = m;

    // Allocate new, larger table
    m = m * 2; 
    table.clear();
    table.resize(m);
    n = 0; // Reset item count

    // Re-insert all elements from old table
    for (int i = 0; i < oldSize; ++i) {
        for (auto& pair : oldTable[i]) {
            // Re-hash and insert into new table
            insert(pair.key, pair.value); 
        }
    }
}

void insert(std::string key, int value) {
    // Check load factor before inserting (count the new item)
    if ((float)(n + 1) / m > MAX_LOAD_FACTOR) {
        rehash();
    }

    // Hash the key (uses the new 'm' if rehashed)
    int index = hashFunction(key); 

    // Traverse chain to find or update
    for (auto& pair : table[index]) {
        if (pair.key == key) {
            pair.value = value;
            return;
        }
    }

    // Not found, add new pair and increment item count
    table[index].push_back(KeyValuePair(key, value));
    n++;
}
```

# 4. Example: Full Manual Implementation

This chapter combines all the concepts from **Chapters 2 and 3** into a single, complete `MyHashMap` class. This class handles insertion, searching, deletion, and automatic rehashing.

### 4.1. Manual Implementation Using Separate Chaining

```cpp
#include <iostream>
#include <string>
#include <vector>
#include <list>

// Data Node: Struct to store Key-Value pairs
struct KeyValuePair {
    std::string key;
    int value;
    KeyValuePair(std::string k, int v) : key(k), value(v) {}
};

class MyHashMap {
private:
    int n; // Number of items
    int m; // Number of buckets (table size)
    std::vector<std::list<KeyValuePair>> table;
    const float MAX_LOAD_FACTOR = 0.75;

    // Hash Function (Simple Division Method)
    int hashFunction(std::string key) {
        int hash = 0;
        // A simple hash: sum ASCII values
        for (char c : key) {
            hash = (hash + (int)c);
        }
        return hash % m; // Use current table size 'm'
    }

    // Rehashing Function
    void rehash() {  
        // Get old table and size
        std::vector<std::list<KeyValuePair>> oldTable = table;
        int oldSize = m;

        // Allocate new, larger table
        m = m * 2;
        table.clear();
        table.resize(m);
        n = 0; // Reset item count

        std::cout << ". New size: " << m << std::endl;

        // Re-insert all elements from old table
        for (int i = 0; i < oldSize; ++i) {
            for (auto& pair : oldTable[i]) {
                // Re-hash and insert into new table
                insert(pair.key, pair.value);
            }
        }
    }

public:
    // Constructor: Initialize table
    MyHashMap(int size = 10) { // Default size of 10
        n = 0;
        m = size;
        table.resize(m);
    }

    // Insert Function (with rehashing)
    void insert(std::string key, int value) {
        // Check load factor before inserting
        if ((float)(n + 1) / m > MAX_LOAD_FACTOR) {
            rehash();
        }

        // Hash the key (uses the new 'm' if rehashed)
        int index = hashFunction(key);

        // Traverse chain to find or update
        for (auto& pair : table[index]) {
            if (pair.key == key) {
                pair.value = value;
                return;
            }
        }

        // Not found, add new pair and increment item count
        table[index].push_back(KeyValuePair(key, value));
        n++;
    }

    // Search Function
    int search(std::string key) {
        int index = hashFunction(key);
        for (auto& pair : table[index]) {
            if (pair.key == key) {
                return pair.value;
            }
        }
        return -1; // Not found
    }

    // Remove Function
    void remove(std::string key) {
        int index = hashFunction(key);
        auto& chain = table[index];
        
        for (auto it = chain.begin(); it != chain.end(); ++it) {
            if (it->key == key) {
                chain.erase(it); // Remove element at position 'it'
                n--; // Decrement item count
                return;
            }
        }
    }
};
```

### 4.2. Manual Implementation Using Open Addressing

```cpp
#include <iostream>
#include <string>
#include <vector>

// Define the state for each slot
enum class SlotState {
    EMPTY,    // The slot has never been used
    OCCUPIED, // The slot contains an active key-value pair
    DELETED   // The slot used to have data, but it was removed
};

struct HashSlot {
    std::string key;
    int value;
    SlotState state;

    // Constructor to initialize new slots as EMPTY
    HashSlot() : key(""), value(0), state(SlotState::EMPTY) {}
};

class MyHashMap {
private:
    int n; // Number of items
    int m; // Number of buckets (table size)
    std::vector<HashSlot> table; 
    const float MAX_LOAD_FACTOR = 0.75; 

    // Hash Function (Simple Division Method)
    int hashFunction(std::string key) {
        int hash = 0;
        for (char c : key) {
            hash = (hash + (int)c);
        }
        // Ensure hash is positive before modulo
        return (hash & 0x7FFFFFFF) % m;
    }

    // Rehashing Function
    void rehash() {
        std::cout << "[Info] Rehashing triggered. Old size: " << m;

        // Save the old table
        std::vector<HashSlot> oldTable = table;
        int oldSize = m;

        // Create a new, larger table.
        m = m * 2;
        table.clear(); 
        table.resize(m); 
        n = 0; // Reset item count

        std::cout << ". New size: " << m << std::endl;

        // Re-insert all active elements from the old table
        for (int i = 0; i < oldSize; ++i) {
            // Only re-insert items that are currently occupied
            if (oldTable[i].state == SlotState::OCCUPIED) {
                insert(oldTable[i].key, oldTable[i].value);
            }
        }
    }

public:
    // Constructor: Initialize table
    MyHashMap(int size = 10) { // Default size of 10
        n = 0;
        m = size;
        table.resize(m); 
    }

    // Insert Function
    void insert(std::string key, int value) {
        // Check load factor before inserting
        if ((float)(n + 1) / m > MAX_LOAD_FACTOR) {
            rehash();
        }

        // Get the initial hash index
        int index = hashFunction(key);
        int firstDeletedSlot = -1; // To store the index of the first DELETED slot

        // Start probing (loop max 'm' times)
        for (int i = 0; i < m; ++i) {
            int probedIndex = (index + i) % m;
            auto& slot = table[probedIndex]; // Get a reference to the slot

            // Case 1: Slot is OCCUPIED and the key matches
            if (slot.state == SlotState::OCCUPIED && slot.key == key) {
                // Key already exists, just update the value
                slot.value = value;
                return;
            }

            // Case 2: Slot is EMPTY
            if (slot.state == SlotState::EMPTY) {
                // We found an empty slot, so the key is not in the table.
                // We should insert it here, OR in the first DELETED slot we found.
                int insertIndex = (firstDeletedSlot != -1) ? firstDeletedSlot : probedIndex;
                
                table[insertIndex].key = key;
                table[insertIndex].value = value;
                table[insertIndex].state = SlotState::OCCUPIED;
                n++; // Increment item count
                return;
            }

            // Case 3: Slot is DELETED
            if (slot.state == SlotState::DELETED) {
                // We can insert here, but we must keep searching
                // to see if the key already exists further down the probe chain.
                if (firstDeletedSlot == -1) {
                    firstDeletedSlot = probedIndex;
                }
            }
        }

        // Probed the entire table without finding an EMPTY slot or the key.
        // If a DELETED slot was seen along the way, reuse it so the
        // insertion is not silently lost.
        if (firstDeletedSlot != -1) {
            table[firstDeletedSlot].key = key;
            table[firstDeletedSlot].value = value;
            table[firstDeletedSlot].state = SlotState::OCCUPIED;
            n++;
        }
    }

    // Search Function
    int search(std::string key) {
        // Get the initial hash index
        int index = hashFunction(key);

        // Start probing
        for (int i = 0; i < m; ++i) {
            int probedIndex = (index + i) % m;
            auto& slot = table[probedIndex];

            // Case 1: Slot is OCCUPIED and key matches
            if (slot.state == SlotState::OCCUPIED && slot.key == key) {
                return slot.value; // Item found
            }

            // Case 2: Slot is EMPTY
            if (slot.state == SlotState::EMPTY) {
                // If we hit an EMPTY slot, the key cannot be
                // further down the probe chain.
                return -1; // Not found
            }

            // Case 3: Slot is DELETED
            // We must continue probing, as the key we want
            // might have collided and be after this deleted slot.
            if (slot.state == SlotState::DELETED) {
                continue;
            }
        }

        // We looped through the entire table and didn't find it
        return -1;
    }

    // Remove Function
    void remove(std::string key) {
        int index = hashFunction(key);

        // Start probing to find the key
        for (int i = 0; i < m; ++i) {
            int probedIndex = (index + i) % m;
            auto& slot = table[probedIndex];

            // Case 1: Slot is OCCUPIED and key matches
            if (slot.state == SlotState::OCCUPIED && slot.key == key) {
                // Found it. Set state to DELETED.
                slot.state = SlotState::DELETED;
                n--; // Decrement item count
                return;
            }

            // Case 2: Slot is EMPTY
            if (slot.state == SlotState::EMPTY) {
                // Key is not in the table
                return;
            }

            // Case 3: Slot is DELETED
            if (slot.state == SlotState::DELETED) {
                // Keep probing
                continue;
            }
        }
    }
};
```

# 5. Hashing Implementation with C++ STL

In C++, instead of creating a Hash Table manually, you can use the **Standard Template Library (STL)**. The STL implementation **automatically** handles the hash function, collisions, and **rehashing** when the load factor gets too high.
    
## 5.1. `std::unordered_map` (Key-Value)

`std::unordered_map` is a Hash Map implementation for **Key-Value pairs**. The purpose of this template is to map Keys to Values. It acts like a **dictionary**, where you look up a word (key) to get its definition (value).
    
```cpp
#include <iostream>
#include <string>
#include <unordered_map>

int main() {
    // Key: string (name), Value: int (grade)
    std::unordered_map<std::string, int> studentGrades;

    studentGrades["Budi"] = 90;
    studentGrades["Ani"] = 85;

    // Access value using key
    std::cout << "Budi's Grade: " << studentGrades["Budi"] << std::endl;
}
```

## 5.2. `std::unordered_set` (Unique Keys)

`std::unordered_set` is a Hash Set implementation that only stores **unique keys**. This template is used to check whether a **key is on the list or not**, like a **guest book**.

```cpp
#include <iostream>
#include <string>
#include <unordered_set>

int main() {
    std::unordered_set<std::string> attendanceList;

    attendanceList.insert("Budi");
    attendanceList.insert("Ani");
    attendanceList.insert("Budi"); // Ignored, because "Budi" already exists

    // Check existence
    if (attendanceList.count("Budi") > 0) {
        std::cout << "Budi is present." << std::endl;
    }
}
```

## 5.3. Comparison: `unordered_map` vs. `unordered_set`

Aspect | `unordered_map` | `unordered_set`
:----: | :---: | :---:
**Declaration** | `std::unordered_map<Key, Value>` | `std::unordered_set<Key>`
**What is Stored** | Key-Value pairs | Key only
**Element Type** | `std::pair<const Key, Value>` | `Key`
**Main Purpose** | Mapping Keys to Values | Storing unique Keys
**Data Access** | `map[key]` (get/set value) | `set.count(key)` (check existence)

# 6. Custom Struct Keys

A notable limitation of the C++ STL's `std::unordered_map` and `std::unordered_set` is their inability to natively support user-defined types (such as a `struct` or `class`) as keys. An attempt to instantiate a map with a custom struct, as shown below, will result in a compile-time error.

```cpp
// This declaration will fail to compile
std::unordered_map<Mahasiswa, int> nilaiUjian;
```

This failure occurs because the STL hash containers have two fundamental requirements for their key types, which C++ cannot automatically generate for a custom `struct`:

- **A Hashing Function**: The container must have a mechanism to convert a `Key` object (in this case, `Mahasiswa`) into a `size_t` value. This resulting "hash code" is what determines the bucket where the element will be stored.
- **An Equality Function**: The container must know how to compare two Key objects for equality. This is not for sorting, but specifically for handling hash collisions. The map must be able to determine if the key it's looking for is truly present.

The compiler cannot, for instance, assume whether two `Mahasiswa` objects are equal if only their `nim` matches, or if both `nim` and `nama` must match. To resolve this, the programmer must explicitly "teach" C++ how to perform these two operations for the custom type.
    
## 6.1. Defining the Equality Operation

The first requirement is met by **overloading the equality operator** (`operator==`) for the custom struct. This is the standard C++ mechanism for defining identity and is used directly by the `unordered_map` to compare keys within a bucket.
    
```cpp
struct Mahasiswa {
    std::string nama;
    int nim;
};

// This function defines what makes two Mahasiswa objects
// "equal" in the eyes of the hash map.
bool operator==(const Mahasiswa& a, const Mahasiswa& b) {
    return a.nama == b.nama && a.nim == b.nim;
}
```

## 6.2. Providing a Hashing Function via `std::hash` Specialization

The second requirement is met by providing a custom hash function. The standard mechanism for this is to create a **template specialization** of the `std::hash` struct, which resides in the `std` namespace.

This specialization "injects" our custom hashing logic into the STL, allowing `unordered_map` to find and use it automatically whenever it needs to hash a `Mahasiswa` object. This is achieved by defining `operator()` within the specialized struct.
    
```cpp
#include <functional> // Required for std::hash

// We "inject" our hashing logic into the std namespace
namespace std {

// We declare a specialization of the 'hash' template for our type
template<>
struct hash<Mahasiswa> {
    
    // operator() is what the unordered_map will call.
    // It must be 'const' and return a 'size_t'.
    size_t operator()(const Mahasiswa& m) const {
        
        // Get the hash for each member individually.
        size_t hashNama = std::hash<std::string>{}(m.nama);
        size_t hashNim = std::hash<int>{}(m.nim);

        // Combine the individual hashes into one final hash.
        return hashNama ^ (hashNim << 1);
    }
};

} // End of namespace std
```

## 6.3. Usage Example

With both the equality operator and the `std::hash` specialization provided, the `std::unordered_map` now has all the components necessary to manage the `Mahasiswa` key. The following code will now compile and function as expected:

```cpp
int main() {
    std::unordered_map<Mahasiswa, int> nilaiUjian;
    
    Mahasiswa budi = {"Budi", 12345};
    nilaiUjian[budi] = 90;
    
    // The map can now hash "budi" to find a bucket
    // and use operator== to check for collisions.
}
```

# Module 8 - Graph, Stack, and Queue

# 1. Graphs Concept

![](https://learn.digilabdte.com/uploads/images/gallery/2025-11/scaled-1680-/image-1762861479660.png)

A **Graph** is a non-linear data structure consisting of **nodes** and **edges**. The nodes are formally called vertices, and the edges are lines or arcs that connect any two nodes in the graph.

Graphs are used to solve many real-world problems. They are used to represent networks, such as social networks, transportation networks (roads and cities), or computer networks.

## 1.1 Graph Terminology

- **Vertex (or Node):** The fundamental unit of the graph.
- **Edge:** A line that connects two vertices.
- **Path:** A sequence of vertices connected by edges.
- **Cycle:** A path that starts and ends at the same vertex. A graph with no cycles is acyclic.
- **Connected Graph:** An undirected graph where there is a path between any two vertices.
- **Weighted Graph:** Each edge is assigned a numerical value or "weight," often representing cost or distance.
- **Unweighted Graph:** Edges do not have weights.

## 1.2 Graph Analogy

Imagine a social network like Facebook or Twitter.
- Each **person** is a **Vertex (Node)**.
- The **friendship** or **connection** between two people is an **Edge**.
- If the friendship is mutual (i.e. Facebook), it's an **undirected** graph.
- If you can "follow" someone without them following you back (i.e. Twitter), it's a **directed** graph.

## 1.3 Types of Graph

### 1.3.1 Undirected Graph
![](https://learn.digilabdte.com/uploads/images/gallery/2025-11/scaled-1680-/image-1762861563847.png)

In an **Undirected Graph**, edges represent a mutual, bidirectional relationship. An edge `(u, v)` is identical to an edge `(v, u)`. There is no "direction" to the connection.

- **Analogy:** A friendship on Facebook or a two-way street. If `A` is friends with `B`, then `B` is automatically friends with `A`.
- **Degree:** The "degree" of a vertex is simply the total number of edges connected to it.

### 1.3.2 Directed Graph
![](https://learn.digilabdte.com/uploads/images/gallery/2025-11/scaled-1680-/image-1762861622418.png)

In a **Directed Graph (or Digraph)**, edges are one-way streets. An edge `(u, v)` represents a connection from `u` to `v`, but it does not imply a connection from `v` to `u`.

- **Analogy:** Following someone on Twitter or a one-way street. `A` can follow `B` (an edge `A -> B`), but `B` does not have to follow `A` back. An edge `B -> A` would be a separate, distinct edge.
- **Degree:** Degree is split into two types:
    - **In-degree:** The number of edges pointing to the vertex.
    - **Out-degree:** The number of edges pointing away from the vertex.

# 2. Graph Representations

A graph is an abstract concept. To store one in a computer's memory, we must translate this idea of vertices and edges into a concrete data structure. The choice of representation is critical, as it dictates the performance and space efficiency of our graph algorithms.

A good representation must efficiently answer two fundamental questions:
1. "**Is vertex `u` connected to vertex `v`?**"
2. "**What are all the neighbors of vertex `u`?**"

The two most common methods to solve this are the Adjacency Matrix and the Adjacency List.

## 2.1. Adjacency Matrix

An **Adjacency Matrix** is a 2D array, typically of size `V x V`, where `V` is the number of vertices. Each cell `adj[u][v]` in the matrix stores information about the edge between vertex `u` and vertex `v`.

### 2.1.1. Concept

Imagine a flight chart. The rows represent the "source" vertex and the columns represent the "destination" vertex.
- For an **unweighted graph**, we store a `1` (or `true`) at `adj[u][v]` if an edge exists, and `0` (or `false`) if it does not.
- For a **weighted graph**, instead of storing `0` or `1`, we store the weight of the edge.
- For an **undirected graph**, the matrix will be symmetric about the diagonal (i.e., `adj[u][v] == adj[v][u]`).
- For a **directed graph**, the matrix is not necessarily symmetric: `adj[u][v]` can be `1` while `adj[v][u]` is `0`.

### 2.1.2. Example 
Consider this simple undirected graph with 4 vertices:
```
(0) --- (1)
| \       |  
|  \      |
|   \     |
|    \    |
|     \   |
|      \  |
|       \ |
(3) --- (2)
```
**Edges:** (0,1), (0,2), (0,3), (1,2), (2,3)

The **Adjacency Matrix** for this graph would be:
```
  0 1 2 3
0[0,1,1,1]
1[1,0,1,0]
2[1,1,0,1]
3[1,0,1,0]
```

### 2.1.3 Implementation (C++)

In C++, you would implement this using a `vector` of `vector`s.

```cpp
#include <vector>

// Assume V is the number of vertices
const int V = 4;

// 1. A V x V matrix, initialized to all zeros
std::vector<std::vector<int>> adjMatrix(
    V, 
    std::vector<int>(V, 0)
);

// 2. Add an edge (e.g., between 0 and 2)
void addEdge(int u, int v) {
    adjMatrix[u][v] = 1;
    adjMatrix[v][u] = 1; // For undirected
}

int main() {
    // Build the example graph from 2.1.2
    addEdge(0, 1); addEdge(0, 2); addEdge(0, 3);
    addEdge(1, 2); addEdge(2, 3);

    // 3. To check for an edge (e.g., between 1 and 3):
    bool hasEdge = (adjMatrix[1][3] == 1); // false

    // 4. To find all neighbors of a vertex (e.g., vertex 0):
    for (int j = 0; j < V; ++j) {
        if (adjMatrix[0][j] == 1) {
            // j is a neighbor of 0
            // This will find j=1, j=2, and j=3
        }
    }
}
```

### 2.1.4. Pros and Cons

**Pros:**
- **Fast Edge Lookup:** Checking if an edge `(u, v)` exists is `O(1)`. You just access the `adjMatrix[u][v]` cell.
- **Simple to Implement:** The logic is straightforward, as it's just a 2D array lookup.

**Cons:**
- **Space Inefficient:** This is the biggest drawback. The matrix always requires `O(V^2)` space, regardless of how many edges the graph has.
- **Slow Neighbor Iteration:** Finding all neighbors of a vertex requires iterating through its entire row, which is an `O(V)` operation, even if the vertex only has one neighbor.


## 2.2. Adjacency List

An **Adjacency List** is an array (or `std::vector`) of lists. The main array has `V` entries, where each entry `adj[u]` corresponds to vertex `u`. The entry `adj[u]` then holds a list of all other vertices `v` that are neighbors of `u`.

### 2.2.1. Concept

Imagine a phone's contacts list. The main array (of size `V`) acts like your address book index. Each entry in this index (e.g., `adj[u]`) doesn't hold a single value, but rather points to a list of all that vertex's direct neighbors.
- For an **unweighted graph**, the list for `adj[u]` simply stores the indices (like `v1`, `v2`) of all its neighbors.
- For a **weighted graph**, the list for `adj[u]` stores pairs, where each pair contains the neighbor's index and the edge's weight (e.g., `{v1, w1}`, `{v2, w2}`).
- For an **undirected graph**, when you add an edge `(u, v)`, you must add `v` to `adj[u]`'s list and add `u` to `adj[v]`'s list.
- For a **directed graph**, when you add an edge `(u, v)`, you only add `v` to `adj[u]`'s list.


### 2.2.2. Example

Using the same 4-vertex graph:
```
(0) --- (1)
| \       |  
|  \      |
|   \     |
|    \    |
|     \   |
|      \  |
|       \ |
(3) --- (2)
```

The **Adjacency List** for this graph would be:
```
0: -> [1, 2, 3]
1: -> [0, 2]
2: -> [0, 1, 3]
3: -> [0, 2]
```

### 2.2.3. Implementation (C++)

In C++, you would implement this using a `vector` of `list`s (or a `vector` of `vector`s). 

```cpp
#include <list>
#include <vector>

// Assume V is the number of vertices
const int V = 4;

// 1. A vector of size V. Each element is an empty list.
std::vector<std::list<int>> adjList(V);

// 2. Add an edge (e.g., between 0 and 2)
void addEdge(int u, int v) {
    adjList[u].push_back(v);
    adjList[v].push_back(u); // For undirected
}

int main() {
    // Build the example graph from 2.2.2
    addEdge(0, 1); addEdge(0, 2); addEdge(0, 3);
    addEdge(1, 2); addEdge(2, 3);

    // 3. To check for an edge (e.g., between 1 and 3):
    // This is slower than the matrix.
    bool hasEdge = false;
    for (int neighbor : adjList[1]) {
        if (neighbor == 3) {
            hasEdge = true;
            break;
        }
    } // hasEdge is false

    // 4. To find all neighbors of a vertex (e.g., vertex 0):
    // This is extremely fast.
    for (int neighbor : adjList[0]) {
        // neighbor is a neighbor of 0
        // This will find neighbor=1, neighbor=2, and neighbor=3
    }
}
```

### 2.2.4. Handling Weighted Graphs

To store weights, the list cannot just be of `int`s. It must store pairs (or a custom `struct`).
- `std::vector<std::list<std::pair<int, int>>> adjList(V);`
- The `pair<int, int>` would store `{neighbor, weight}`.
- `addEdge` would look like: `adjList[u].push_back({v, weight});`

### 2.2.5. Pros and Cons

**Pros:**
- **Space Efficient:** This is the primary advantage. The total space required is `O(V + E)`.
- **Fast Neighbor Iteration:** Finding all neighbors of a vertex u is `O(deg(u))` (where `deg(u)` is the degree, or number of neighbors). This is optimally fast.

**Cons:**
- **Slow Edge Lookup:** Checking if a specific edge `(u, v)` exists is `O(deg(u))` in the worst case, because we must iterate through the list `adj[u]` to see if `v` is in it.

## 2.3. Comparison: Matrix vs. List

| **Operation** | **Adjacency Matrix** | **Adjacency List** |
|---------------|----------------------|------------------|
| **Space** | `O(V²)` | `O(V + E)` |
| **Add Edge** | `O(1)` | `O(1)` |
| **Check Edge (u, v)** | `O(1)` (Fast) | `O(deg(u))` (Slow) |
| **Find Neighbors of u** | `O(V)` (Slow) | `O(deg(u))` (Fast) |
| **Remove Edge (u, v)** | `O(1)` | `O(deg(u))` |
| **Best For** | **Dense Graphs** (where `E` ≈ `V²`) | **Sparse Graphs** (most common case) |

# 3. Stack and Queue

Before diving into graph traversal algorithms, we must understand the two key data structures that power them: `std::stack` (for DFS) and `std::queue` (for BFS).

## 3.1 Introduction to `std::stack`

![Stack](https://learn.digilabdte.com/uploads/images/gallery/2025-11/image-1762861155473.png)

`std::stack` is a **container adapter** in the C++ STL. It's not a container itself, but a "wrapper" that provides a specific **LIFO (Last-In, First-Out)** interface on top of another container (like `std::deque` by default). It's like a stack of plates. You add new plates to the top and remove plates from the top.

**Key Operations:**
| Operation | Description |
| :--- | :--- |
| `push(item)` | Adds an item to the top of the stack. |
| `pop()` | Removes the item from the top of the stack. |
| `top()` | Returns a reference to the item at the top. |
| `empty()` | Returns true if the stack is empty. |
| `size()` | Returns the number of items in the stack. |

**Example:**

```cpp
#include <stack>
#include <iostream>

int main() {
    std::stack<int> myStack;
    myStack.push(10); // Stack: [10]
    myStack.push(20); // Stack: [10, 20]
    myStack.push(30); // Stack: [10, 20, 30]
    
    std::cout << "Top: " << myStack.top() << std::endl; // Prints 30
    myStack.pop();   // Removes 30. Stack: [10, 20]
    std::cout << "Top: " << myStack.top() << std::endl; // Prints 20
  
    return 0;
}
```

## 3.2 Introduction to `std::queue`

![Queue](https://learn.digilabdte.com/uploads/images/gallery/2025-11/image-1762861324380.png)

`std::queue` is also a **container adapter**. It provides a **FIFO (First-In, First-Out)** interface. It's like a line at a ticket counter. The first person to get in line is the first person to be served.

**Key Operations:**
| Operation | Description |
| :--- | :--- |
| `push(item)` | Adds an item to the back of the queue. |
| `pop()` | Removes the item from the front of the queue. |
| `front()` | Returns a reference to the item at the front. |
| `back()` | Returns a reference to the item at the back. |
| `empty()` | Returns true if the queue is empty. |
| `size()` | Returns the number of items in the queue. |

**Example:**

```cpp
#include <queue>
#include <iostream>

int main() {
    std::queue<int> myQueue;
    myQueue.push(10); // Queue: [10]
    myQueue.push(20); // Queue: [10, 20]
    myQueue.push(30); // Queue: [10, 20, 30]
    
    std::cout << "Front: " << myQueue.front() << std::endl; // Prints 10
    myQueue.pop();    // Removes 10. Queue: [20, 30]
    std::cout << "Front: " << myQueue.front() << std::endl; // Prints 20

    return 0;
}
```

# 4. Graph Traversal: Breadth-First Search (BFS)

**Breadth-First Search (BFS)** is a traversal algorithm that explores the graph "**level by level**." It starts at a source vertex, explores all of its immediate neighbors, then all of their neighbors, and so on.

## 4.1 BFS Analogy and Illustration

Imagine a **ripple effect in a pond**. When you drop a stone (the **source vertex**), the ripple expands:

- It first hits all the water molecules immediately next to it (Level 1 Neighbors).
- Then, the ripple expands to hit the next layer of molecules (Level 2 Neighbors).

BFS works exactly this way. It is excellent for finding the **shortest path** in an **unweighted graph**. 

**Example:** 
We want to trace BFS starting from node `0` on the following graph:

```
      0
     / \
    1---2
   /     \
  3       4
```

The traversal order will be `0, 1, 2, 3, 4`.

## 4.2 The Role of `std::queue`

**The FIFO (First-In, First-Out)** nature of a queue is perfect for BFS.

1. We add the `source` (Level 0) to the queue.
2. We dequeue the `source`, and add all its neighbors (Level 1) to the queue
3. We dequeue all the Level 1 nodes one by one, and as we do, we add their neighbors (Level 2) to the **back** of the queue.

This ensures we process all nodes at the current level **before** moving to the next level, just like the ripple effect.

## 4.3 BFS Implementation

BFS is implemented using an **iterative approach** with a `std::queue`. This approach is a natural fit for the algorithm's "level-by-level" logic: the queue ensures that nodes are processed in the order they are discovered. We also use a `visited` array (or `std::vector<bool>`) to keep track of nodes we have already processed or added to the queue, which is crucial for preventing infinite loops in graphs with cycles.

```cpp
void BFS(int s) { // 's' is the source vertex
    // 1. Create a visited array, initialized to all false
    std::vector<bool> visited(V, false);

    // 2. Create a Queue for BFS
    std::queue<int> q;

    // 3. Mark the source vertex as visited and enqueue it
    visited[s] = true;
    q.push(s);

    std::cout << "BFS Traversal: ";

    // 4. Loop as long as the queue is not empty
    while (!q.empty()) {
        // 5. Dequeue a vertex from the front and print it
        int u = q.front();
        q.pop();
        std::cout << u << " ";

        // 6. Get all neighbors of the dequeued vertex 'u'
        for (auto& neighbor : adj[u]) {
            
            // 7. If a neighbor has not been visited:
            if (!visited[neighbor]) {
                // Mark it as visited
                visited[neighbor] = true;
                // Enqueue it (add to the back of the queue)
                q.push(neighbor);
            }
        }
    }
    std::cout << std::endl;
}
```

**Code Explanation:**

- Initialize a `visited` array (all `false`) and an empty `queue`.
- Mark the starting node `s` as visited and add it to the queue.
- While the queue is **not empty**:
    - Dequeue a node `u` (this is the **current node**).
    - Print `u`.
    - Iterate through all neighbors of `u`.
    - If a neighbor has not been visited, mark it as visited and enqueue it.

# 5. Graph Traversal: Depth-First Search (DFS)

**Depth-First Search (DFS)** is a traversal algorithm that explores the graph by going as "**deep**" as possible down one path before backtracking.

## 5.1 DFS Analogy and Illustration

Imagine exploring a **maze** with only one path.

- You pick a direction (an **edge**) and follow it.
- You keep going, taking the **first available turn** at each junction (visiting the first unvisited neighbor).
- You continue until you hit a **dead end** (a node with no unvisited neighbors).
- At the dead end, you **backtrack** to the previous junction and **try a different path**.

DFS works this way. It is excellent for detecting cycles, topological sorting, and solving puzzles.

**Example:**
We want to trace DFS starting from node `0` on the following graph:

```
      0
     / \
    1---2
   /     \
  3       4
```

The traversal order will be `0, 1, 2, 4, 3`.

## 5.2 The Role of `std::stack`

**The LIFO (Last-In, First-Out)** nature of a stack is perfect for DFS.

- **Recursive DFS:** The system's call stack is used implicitly. When you make a **recursive call** `DFSUtil(neighbor)`, you **"push" the new function** onto the stack. When you hit a dead end, the function returns, **"popping" itself off** the stack and automatically "backtracking" to the previous node.
- **Iterative DFS:** We manually use a `std::stack`. We **push a node**, then **push its neighbors**. The last neighbor pushed is on **top**, so it will be the first one we explore next, perfectly mimicking the "go deep" behavior of recursion.

## 5.3 DFS Implementation

### 5.3.1 Iterative Approach

This approach uses an **explicit `std::stack`** and avoids recursion.

```cpp
void DFS(int s) {
    // 1. Create a visited array, initialized to all false
    std::vector<bool> visited(V, false);
    
    // 2. Create a Stack for DFS
    std::stack<int> stack;

    // 3. Push the source vertex onto the stack
    stack.push(s);

    std::cout << "DFS Traversal (Iterative): ";

    // 4. Loop as long as the stack is not empty
    while (!stack.empty()) {
        // 5. Pop a vertex from the top of the stack
        int u = stack.top();
        stack.pop();

        // 6. If it hasn't been visited yet:
        if (!visited[u]) {
            visited[u] = true;
            std::cout << u << " ";
        }
        
        // 7. Get all neighbors of 'u'
        for (auto& neighbor : adj[u]) {
            // 8. If the neighbor is unvisited, push it
            if (!visited[neighbor]) {
                stack.push(neighbor);
            }
        }
    }
    std::cout << std::endl;
}
```

**Code Explanation:**

- Initialize a `visited` array (all `false`) and an empty `stack`.
- Push the starting node `s` onto the stack.
- While the stack is not empty:
    - Pop a node `u` from the stack.
    - If `u` has not been visited:
        - Mark `u` as visited and print it.
        - Iterate through all neighbors of `u`.
        - If a neighbor has not been visited, push it onto the stack (to be processed next).

### 5.3.2 Recursive Approach

This is the most common and intuitive way to implement DFS. It uses a **helper function**.

```cpp
// Private helper function for recursive DFS
void DFSUtil(int u, std::vector<bool>& visited) {
    // 1. Mark the current node as visited and print it
    visited[u] = true;
    std::cout << u << " ";

    // 2. Iterate over all neighbors
    for (auto& neighbor : adj[u]) {
        // 3. If a neighbor is unvisited:
        if (!visited[neighbor]) {
            // 4. Make a recursive call to "go deeper"
            DFSUtil(neighbor, visited);
        }
    }
    // 5. When all neighbors are visited, the function returns.
    //    This is the implicit "backtracking" step.
}

// Public wrapper function to start the DFS
void DFSRecursive(int s) {
    // 1. Create a visited array
    std::vector<bool> visited(V, false);
    std::cout << "DFS Traversal (Recursive): ";
    
    // 2. Call the helper function to start the recursion
    DFSUtil(s, visited);
    std::cout << std::endl;
}
```

**Code Explanation:**

- The `DFSRecursive` function initializes the `visited` array.
- It calls the helper function `DFSUtil` with the starting node `s`.
- Inside `DFSUtil` (the recursive part):
    - Marks the current node `u` as visited and prints it.
    - Iterates through all neighbors of `u`.
    - If a neighbor has not been visited, it immediately calls `DFSUtil` on that neighbor, "**going deeper**."
    - When a node has no unvisited neighbors, its function call finishes and "**backtracks**" to the previous node's loop.

### 5.3.3 Comparison: Iterative vs. Recursive

Both recursive and iterative DFS **accomplish the same goal**, but they differ in implementation and performance characteristics.

|**Aspect**|**Recursive DFS**|**Iterative DFS**|
|---|---|---|
|**Logic**|Intuitive and clean. Backtracking is handled **implicitly** by function returns.|Backtracking is handled **explicitly** using a `std::stack`.|
|**Data Structure**|Uses the **Call Stack** (managed by the system).|Uses an explicit `std::stack` (managed by the programmer on the heap).|
|**Memory Usage**|Can cause a **Stack Overflow** error if the graph is very deep (long paths).|Safer for very deep graphs, as heap memory is much larger than stack memory.|
|**Code Simplicity**|Often shorter and easier to write and understand the core "go deep" logic.|Can be more complex to write, but the process is more transparent.|

**In summary:**
- Use **Recursive DFS** for simplicity and clarity, especially when the graph depth is **known to be manageable**.
- Use **Iterative DFS** when you need to avoid stack overflow on **very deep graphs** or when you need **finer control over the traversal**.

# 6. Example: Full Code Implementation

This chapter combines **all the concepts** into a **single Graph class** using the C++ STL for the adjacency list.

```cpp
#include <iostream>
#include <vector>   // For the adjacency list and visited tracker
#include <list>     // For Adjacency List
#include <queue>    // For BFS
#include <stack>    // For Iterative DFS

class Graph {
private:
    int V; // Number of vertices
    // Adjacency List: A vector of lists
    std::vector<std::list<int>> adj;

    // Helper for recursive DFS
    void DFSUtil(int u, std::vector<bool>& visited) {
        visited[u] = true;
        std::cout << u << " ";
        for (auto& neighbor : adj[u]) {
            if (!visited[neighbor]) {
                DFSUtil(neighbor, visited);
            }
        }
    }

public:
    // Constructor
    Graph(int V) {
        this->V = V;
        adj.resize(V); // Resize the vector to hold V lists
    }

    // Add an edge (for an undirected graph)
    void addEdge(int u, int v) {
        adj[u].push_back(v);
        adj[v].push_back(u); // Comment this line for a directed graph
    }

    // Breadth-First Search (Iterative)
    void BFS(int s) {
        std::vector<bool> visited(V, false);
        std::queue<int> q;
        visited[s] = true;
        q.push(s);

        std::cout << "BFS Traversal: ";
        while (!q.empty()) {
            int u = q.front();
            q.pop();
            std::cout << u << " ";

            for (auto& neighbor : adj[u]) {
                if (!visited[neighbor]) {
                    visited[neighbor] = true;
                    q.push(neighbor);
                }
            }
        }
        std::cout << std::endl;
    }

    // Depth-First Search (Iterative)
    void DFS(int s) {
        std::vector<bool> visited(V, false);
        std::stack<int> stack;
        stack.push(s);

        std::cout << "DFS Traversal (Iterative): ";
        while (!stack.empty()) {
            int u = stack.top();
            stack.pop();

            if (!visited[u]) {
                visited[u] = true;
                std::cout << u << " ";
            }
            
            for (auto& neighbor : adj[u]) {
                if (!visited[neighbor]) {
                    stack.push(neighbor);
                }
            }
        }
        std::cout << std::endl;
    }

    // Depth-First Search (Recursive)
    void DFSRecursive(int s) {
        std::vector<bool> visited(V, false);
        std::cout << "DFS Traversal (Recursive): ";
        DFSUtil(s, visited);
        std::cout << std::endl;
    }
};
```
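
The class above has no driver, so here is a short `main` sketch. It assumes the `Graph` class is in scope and builds the 5-node example graph from the traversal sections:

```cpp
int main() {
    Graph g(5);
    g.addEdge(0, 1);
    g.addEdge(0, 2);
    g.addEdge(1, 2);
    g.addEdge(1, 3);
    g.addEdge(2, 4);

    g.BFS(0);          // BFS Traversal: 0 1 2 3 4
    g.DFS(0);          // DFS Traversal (Iterative): 0 2 4 1 3
    g.DFSRecursive(0); // DFS Traversal (Recursive): 0 1 2 4 3
    return 0;
}
```

Note that the two DFS variants visit the vertices in different (but equally valid) orders: the iterative version explores the most recently pushed neighbor first.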

# Module 9 - Advanced Graph

# 1. Introduction

<span style="color:yellow;">**Author:** YP</span>  

## **Learning Objectives**

After completing this module, students are expected to be able to:  
1. Explain the basic concepts of **Graph** (including adjacency list and adjacency matrix representation).  
2. Implement graph traversal using **DFS (Depth First Search)** and **BFS (Breadth First Search)**.  
3. Understand and implement the **Shortest Path algorithm (Dijkstra)**.  
4. Implement **Minimum Spanning Tree (MST)** algorithms: **Prim** and **Kruskal**.  
5. Compare the efficiency of manual graph algorithms with STL libraries such as **priority_queue** and **disjoint set union (DSU)**.  

---

# 2. Graph Concept

## **2.1 Theory**

### 2.1.1 Definition of Graph
A **Graph** is a data structure consisting of:  
- **Vertex (node)** → represents an object.  
- **Edge** → represents the relationship between objects.  

Graphs can be:  
- **Directed** or **Undirected**.  
- **Weighted** or **Unweighted**.  

---

### 2.1.2 Graph Representation
There are two main representations of graphs in programming:  

1. **Adjacency Matrix**  
   A 2D matrix of size `[V][V]`, where `V` is the number of vertices. Each element indicates whether an edge exists between two nodes (commonly `1` for an edge, `0` for no edge).  
   - Suitable for **dense graphs**.  
   - Edge lookup is efficient with **O(1)** time complexity.  

    ```cpp
    int graph[5][5] = {
        {0, 1, 0, 0, 1},
        {1, 0, 1, 1, 0},
        {0, 1, 0, 1, 0},
        {0, 1, 1, 0, 1},
        {1, 0, 0, 1, 0}
    };
    ```

2. **Adjacency List**  
   Each vertex stores a list of its adjacent vertices (neighbors).  
   - Memory-efficient for **sparse graphs**, as only existing edges are stored.  
   - Edge lookup may take up to **O(V)** in the worst case.  

    ```cpp
    vector<int> adj[5];
    
    adj[0].push_back(1);
    adj[0].push_back(4);
    adj[1].push_back(0);
    adj[1].push_back(2);
    adj[1].push_back(3);
    ```

---

### 2.1.3 Graph Traversal

Graph traversal can be performed in two main ways:  

1. **Breadth First Search (BFS)**  
   - Uses a queue.  
   - Explores nodes level by level (broad exploration).  
   - Effective in unweighted graphs to find the shortest path from one node to another.  

    ```cpp
    void BFS(int start, vector<int> adj[], int V) {
        vector<bool> visited(V, false);
        queue<int> q;
    
        visited[start] = true;
        q.push(start);
    
        while (!q.empty()) {
            int u = q.front();
            q.pop();
            cout << u << " ";
    
            for (int v : adj[u]) {
                if (!visited[v]) {
                    visited[v] = true;
                    q.push(v);
                }
            }
        }
    }
    ```

2. **Depth First Search (DFS)**  
   - Uses recursion or a stack.  
   - Explores as deep as possible before backtracking.  
   - Useful for detecting cycles, performing topological sort, or finding connected components.  

    ```cpp
    void DFSUtil(int u, vector<int> adj[], vector<bool>& visited) {
        visited[u] = true;
        cout << u << " ";
        for (int v : adj[u]) {
            if (!visited[v]) {
                DFSUtil(v, adj, visited);
            }
        }
    }
    
    void DFS(int start, vector<int> adj[], int V) {
        vector<bool> visited(V, false);
        DFSUtil(start, adj, visited);
    }
    ```

---

### 2.1.4 Shortest Path Algorithm: Dijkstra

- Finds the shortest path from a source node to all other nodes in a **positively weighted graph**.  
- Uses a priority queue (min-heap) to efficiently select edges with the smallest weights.  
- Time complexity: **O((V + E) log V)**.  

```cpp
void dijkstra(int V, vector<pair<int,int>> adj[], int src) {
    vector<int> dist(V, INT_MAX);
    dist[src] = 0;

    // Min-heap ordered by {distance, vertex}
    priority_queue<pair<int,int>, vector<pair<int,int>>, greater<pair<int,int>>> pq;
    pq.push({0, src});

    while (!pq.empty()) {
        auto [d, u] = pq.top();
        pq.pop();

        // Skip stale entries left over from earlier relaxations
        if (d > dist[u]) continue;

        for (auto [v, w] : adj[u]) {
            if (dist[u] + w < dist[v]) {
                dist[v] = dist[u] + w;
                pq.push({dist[v], v});
            }
        }
    }

    cout << "Shortest distances from " << src << ":\n";
    for (int i = 0; i < V; i++) {
        cout << "To " << i << " = " << dist[i] << endl;
    }
}
```

---

### 2.1.5 Minimum Spanning Tree (MST)

#### Prim’s Algorithm
- Starts from a single node, repeatedly adding the minimum weight edge that connects a new node to the tree.  
- Efficient with **O(E log V)** complexity using a priority queue.  
- Suitable for **dense graphs**.  

```cpp=
#include <bits/stdc++.h>
using namespace std;

const int INF = 1e9;

void primMST(int n, vector<vector<pair<int, int>>> &adj) {
    vector<int> key(n, INF);
    vector<bool> inMST(n, false);
    vector<int> parent(n, -1);

    // Start from node 0
    key[0] = 0;
    priority_queue<pair<int, int>, vector<pair<int, int>>, greater<>> pq;
    pq.push({0, 0}); // {weight, node}

    while (!pq.empty()) {
        int u = pq.top().second;
        pq.pop();

        if (inMST[u]) continue;
        inMST[u] = true;

        for (auto &[v, w] : adj[u]) {
            if (!inMST[v] && w < key[v]) {
                key[v] = w;
                pq.push({key[v], v});
                parent[v] = u;
            }
        }
    }

    cout << "Edges in MST (Prim):\n";
    for (int i = 1; i < n; i++) {
        cout << parent[i] << " - " << i << "\n";
    }
}

int main() {
    int n = 5;
    vector<vector<pair<int, int>>> adj(n);

    // example weighted graph (undirected)
    adj[0].push_back({1, 2});
    adj[1].push_back({0, 2});

    adj[0].push_back({3, 6});
    adj[3].push_back({0, 6});

    adj[1].push_back({2, 3});
    adj[2].push_back({1, 3});

    adj[1].push_back({3, 8});
    adj[3].push_back({1, 8});

    adj[1].push_back({4, 5});
    adj[4].push_back({1, 5});

    adj[2].push_back({4, 7});
    adj[4].push_back({2, 7});

    primMST(n, adj);
}

```

#### Kruskal’s Algorithm
- Sorts all edges by weight, then adds them one by one as long as they do not form a cycle.  
- Uses **Disjoint Set Union (DSU)** to detect and prevent cycles.  
- Time complexity: **O(E log E)**.  
- More efficient for **sparse graphs**.  

```cpp
#include <bits/stdc++.h>
using namespace std;

struct Edge {
    int u, v, w;
    bool operator<(const Edge &other) const {
        return w < other.w;
    }
};

struct DSU {
    vector<int> parent, rank;
    DSU(int n) {
        parent.resize(n);
        rank.resize(n, 0);
        iota(parent.begin(), parent.end(), 0);
    }

    int find(int x) {
        if (x != parent[x])
            parent[x] = find(parent[x]);
        return parent[x];
    }

    bool unite(int a, int b) {
        a = find(a);
        b = find(b);
        if (a == b) return false;
        if (rank[a] < rank[b]) swap(a, b);
        parent[b] = a;
        if (rank[a] == rank[b]) rank[a]++;
        return true;
    }
};

void kruskalMST(int n, vector<Edge> &edges) {
    sort(edges.begin(), edges.end());
    DSU dsu(n);

    cout << "Edges in MST (Kruskal):\n";
    for (auto &e : edges) {
        if (dsu.unite(e.u, e.v)) {
            cout << e.u << " - " << e.v << "\n";
        }
    }
}

int main() {
    int n = 5;
    vector<Edge> edges = {
        {0, 1, 2}, {0, 3, 6}, {1, 2, 3},
        {1, 3, 8}, {1, 4, 5}, {2, 4, 7}
    };

    kruskalMST(n, edges);
}

```

---

## **2.2 Example of Graph Traversal**

Example of an undirected weighted graph:  

![image](https://hackmd.io/_uploads/By8S6lNFex.png)  

- **BFS (starting from node 0)**  
  Level-by-level (breadth-first) traversal:  
  **0 → 1 → 2 → 3**  

- **DFS (starting from node 0)**  
  Traversal as deep as possible before backtracking:  
  **0 → 1 → 3 → 2**  
  *(order may vary depending on implementation, e.g., 0 → 1 → 2 → 3)*  

- **Dijkstra (shortest path from node 0)**  
  - Distance to **0** = 0  
  - Distance to **1** = 4  
  - Distance to **2** = 1  
  - Distance to **3** = 6

# Module 10 - Dynamic Programming

# 1. Introduction to DP

### What is DP?

Dynamic programming (DP) is defined as a powerful design technique that successfully combines the correctness of brute force algorithms with the efficiency of greedy algorithms.

The fundamental idea of dynamic programming is to remember the solutions to subproblems and reuse them to solve larger problems. DP involves identifying and solving all relevant subproblems.

**Common use cases for dynamic programming include**:
1. Finding an optimal solution, such as determining a minimum or maximum answer to a question.
2. Counting the total number of solutions that a problem may have.  

A primary challenge when designing a DP solution is figuring out which subproblems can help solve the main problem. For simpler problems, the subproblems often have the same form as the actual problem.

### DP Methodologies

Dynamic programming problems can generally be solved using one of two approaches:

1. **Top-Down Approach (Memoization)**  
This approach begins with the final problem and recursively computes the subproblems as they are needed.
    - Memoization is the technique used here; the name comes from writing results in a "memo". The idea is to remember the result of each computation (such as a Fibonacci number) and compute it at most once.
    - This is typically achieved by storing the computed value in a dictionary or array, often called a memo.
    - When the function is called, it first checks whether the value is already in the memo; if so, the entire recursive computation is replaced by a single lookup. This significantly speeds up solutions, for example converting an exponential-time recursive Fibonacci solution into a linear-time memoized one.

<br/>

2. **Bottom-Up Approach (Tabulation)**  
This approach involves computing the subproblems first, starting from the base case and working up to the final solution.
    - Most people prefer the bottom-up approach because it typically uses iteration (such as a for loop) instead of recursion, so it avoids function-call overhead and is often more efficient in practice.
    - It may also allow memory savings in some cases (e.g., when solving the Fibonacci problem, only the last two values need to be remembered).
    - When using the bottom-up method, attention must be paid to the order in which subproblems are solved: a subproblem's dependencies must be solved before it. If each subproblem is viewed as a node with edges representing dependencies, the subproblems must be solved in topological sort order.  

    <br/>
      
    While the bottom-up approach has a lower constant factor and is slightly shorter to implement, thinking about DP problems in terms of recursive functions is often easier for the conceptualization phase.

# Code & Examples 1

### DP for Fibonacci Problem

To illustrate Dynamic Programming, let's look at the classic **Fibonacci Sequence**. The rule is simple: each number is the sum of the two preceding ones ($F(n) = F(n-1) + F(n-2)$), starting with 0 and 1.

#### 1. The Original Solution (Naive Recursion)  
This is the most direct translation of the mathematical formula into code using recursion.

```cpp
#include <iostream>
using namespace std;

long long fibonacci(int n) {
    if (n <= 1)
        return n;
    return fibonacci(n - 1) + fibonacci(n - 2);
}

int main() {
    int n;
    cout << "Enter n: ";
    cin >> n;
    cout << "Fibonacci(" << n << ") = " << fibonacci(n) << endl;
    return 0;
}
```

**The Problem with Naive Recursion**

While the code above looks clean, it is incredibly inefficient for larger inputs (exponential time complexity: O(2^n)).

![picture 0](https://i.imgur.com/u8G5a0E.png)

The problem is Overlapping Subproblems. The algorithm calculates the same values over and over again.

- To calculate fib(5), it calculates fib(4) and fib(3).
- To calculate fib(4), it calculates fib(3) and fib(2).

Here, fib(3) is computed twice from scratch. As n grows, the number of redundant calculations explodes, making the program unusably slow.

#### Solution A. Top-Down Approach (Memoization)

In this approach, we still use recursion, but we create a "memo" (a dictionary or array) to store results we've already found. Before calculating fib(n), we check if we already have the answer.

```cpp
#include <iostream>
#include <vector>

long long fibonacci(int n, std::vector<long long> &memo) {
    if (n <= 1)
        return n;

    if (memo[n] != -1)
        return memo[n];

    memo[n] = fibonacci(n - 1, memo) + fibonacci(n - 2, memo);
    return memo[n];
}

int main() {
    int n;
    std::cout << "Enter n: ";
    std::cin >> n;

    std::vector<long long> memo(n + 1, -1);

    std::cout << "Fibonacci(" << n << ") = " << fibonacci(n, memo) << std::endl;
    return 0;
}
```
Result: Time complexity drops to Linear Time O(n).

#### Solution B. Bottom-Up Approach (Tabulation)

In this approach, we abandon recursion. We start from the base cases (0 and 1) and iteratively fill a table (list) until we reach n. This avoids the overhead of function call stacks.

```cpp
#include <iostream>
#include <vector>

long long fibonacci(int n) {
    if (n <= 1)
        return n;

    std::vector<long long> dp(n + 1);
    dp[0] = 0;
    dp[1] = 1;

    for (int i = 2; i <= n; i++) {
        dp[i] = dp[i - 1] + dp[i - 2];
    }

    return dp[n];
}

int main() {
    int n;
    std::cout << "Enter n: ";
    std::cin >> n;

    std::cout << "Fibonacci(" << n << ") = " << fibonacci(n) << std::endl;
    return 0;
}
```

Result: Linear Time O(n) and often faster in practice due to no recursion overhead.

# Code & Examples 2

# Knapsack Problem

Given n items, each with a weight and a profit, and a bag with capacity W (i.e., the bag can hold at most W total weight), the task is to put items into the bag so that the sum of their profits is as large as possible. The constraint is that each item must either be put into the bag completely or not at all (it is not possible to put part of an item into the bag).

![maxresdefault.jpg](https://learn.digilabdte.com/attachments/2)

**Solution:**  
```cpp
#include <bits/stdc++.h>
using namespace std;

// Function to find the maximum profit in a knapsack
int knapsack(int W, vector<int> &val, vector<int> &wt) {

    // Initialize dp array with initial value 0
    vector<int> dp(W + 1, 0);

    // Iterate through each item, starting from the 1st item to the nth item
    for (int i = 1; i <= wt.size(); i++) {
        
        // Start from the back, so we also have data from
        // previous calculations of i-1 items and avoid duplication
        for (int j = W; j >= wt[i - 1]; j--) {
            dp[j] = max(dp[j], dp[j - wt[i - 1]] + val[i - 1]);
        }
    }
    return dp[W];
}

int main() {
    vector<int> val = {1, 2, 3};
    vector<int> wt = {4, 5, 1};
    int W = 4;

    cout << knapsack(W, val, wt) << endl; // prints 3 (item of weight 1, value 3)
    return 0;
}
```

**Explanation:**  
The code above is already optimized: it uses a single one-dimensional dp array, saving both memory and execution time. The analogy for this program is as follows:
- We have a bag (knapsack) with a certain capacity (W).
- We have several items, each with a value (val) and a weight (wt).
- Our goal is to fill the bag so that the total value of the items inside is maximized without exceeding the bag's capacity.
- We use a dynamic programming approach to solve this problem efficiently.
- We iterate through each item, and for each item we iterate the bag capacity from back to front (from W down to the item's weight).
- Iterating backwards guarantees that dp[j - wt[i - 1]] still holds the result computed for the previous i-1 items, so the same item is never counted twice.
- At each capacity, the code compares leaving the item out with putting it in and using the remaining capacity for the earlier items.