---
comments: true
---

# 8.1 Heap

A heap is a complete binary tree that satisfies specific conditions and can be mainly divided into two types, as shown in Figure 8-1.

- Min heap: The value of any node $\leq$ the values of its child nodes.
- Max heap: The value of any node $\geq$ the values of its child nodes.

![Min heap and max heap](heap.assets/min_heap_and_max_heap.png){ class="animation-figure" }
Figure 8-1 Min heap and max heap
As a special case of a complete binary tree, heaps have the following characteristics:

- The bottom layer nodes are filled from left to right, and nodes in other layers are fully filled.
- The root node of the binary tree is called the "heap top," and the bottom-rightmost node is called the "heap bottom."
- For max heaps (min heaps), the value of the heap top element (root node) is the largest (smallest).

## 8.1.1 Common operations on heaps

It should be noted that many programming languages provide a priority queue, which is an abstract data structure defined as a queue with priority sorting.

In fact, **heaps are often used to implement priority queues, with max heaps equivalent to priority queues where elements are dequeued in descending order**. From a usage perspective, we can consider "priority queue" and "heap" as equivalent data structures. Therefore, this book does not make a special distinction between the two, uniformly referring to them as "heap."

Common operations on heaps are shown in Table 8-1, and the method names depend on the programming language.

Table 8-1 Efficiency of Heap Operations
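As one concrete illustration of these operations, the sketch below uses Python's standard-library `heapq` module, which operates on a plain list and provides a min heap only. Simulating a max heap by pushing negated values is an assumption of this sketch, not something the module does for you.

```python
import heapq

# Min heap: heapq keeps the smallest element at index 0 of a plain list
min_heap: list[int] = []
for val in [1, 3, 2, 5, 4]:
    heapq.heappush(min_heap, val)          # push

smallest = min_heap[0]                     # peek (does not remove)
ascending = [heapq.heappop(min_heap) for _ in range(len(min_heap))]  # pop until empty

# Max heap: simulated by storing negated values (assumption of this sketch)
max_heap: list[int] = []
for val in [1, 3, 2, 5, 4]:
    heapq.heappush(max_heap, -val)

largest = -max_heap[0]
descending = [-heapq.heappop(max_heap) for _ in range(len(max_heap))]

# Build a min heap in place from an existing list
nums = [1, 3, 2, 5, 4]
heapq.heapify(nums)
```

Here `ascending` comes out in increasing order and `descending` in decreasing order, which matches the description of a max heap as a priority queue that dequeues elements in descending order.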
Figure 8-2 Representation and storage of heaps
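To make the list representation concrete, here is a minimal sketch that checks the max-heap property directly on a list. It assumes zero-based indexing, where the children of index $i$ sit at $2i + 1$ and $2i + 2$ and the parent sits at $(i - 1) / 2$ rounded down. The helper name `is_max_heap` and the sample values are hypothetical and not part of the book's `MaxHeap` class.

```python
def is_max_heap(nums: list[int]) -> bool:
    """Check the max-heap property on a list storing a complete binary tree level by level."""
    n = len(nums)
    for i in range(n):
        left, right = 2 * i + 1, 2 * i + 2   # child indices of node i
        # An index beyond the end of the list means the child does not exist
        if left < n and nums[left] > nums[i]:
            return False
        if right < n and nums[right] > nums[i]:
            return False
    return True


# Sample values chosen for illustration only
heap = [9, 8, 6, 6, 7, 5, 2, 1, 4, 3, 6, 2]
valid = is_max_heap(heap)            # every node >= its children
invalid = is_max_heap([1, 2, 3])     # root is smaller than its children
parent_of_4 = (4 - 1) // 2           # node at index 4 has its parent at index 1
```

Because the tree is complete, the list has no gaps, so an out-of-range index is enough to detect a missing child.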
We can encapsulate the index mapping formula into functions for convenient later use:

=== "Python"

    ```python title="my_heap.py"
    def left(self, i: int) -> int:
        """Get index of left child node"""
        return 2 * i + 1

    def right(self, i: int) -> int:
        """Get index of right child node"""
        return 2 * i + 2

    def parent(self, i: int) -> int:
        """Get index of parent node"""
        return (i - 1) // 2  # Floor division
    ```

=== "C++"

    ```cpp title="my_heap.cpp"
    [class]{MaxHeap}-[func]{left}

    [class]{MaxHeap}-[func]{right}

    [class]{MaxHeap}-[func]{parent}
    ```

=== "Java"

    ```java title="my_heap.java"
    /* Get index of left child node */
    int left(int i) {
        return 2 * i + 1;
    }

    /* Get index of right child node */
    int right(int i) {
        return 2 * i + 2;
    }

    /* Get index of parent node */
    int parent(int i) {
        return (i - 1) / 2; // Integer division (rounds toward zero in Java, so parent(0) is 0)
    }
    ```

=== "C#"

    ```csharp title="my_heap.cs"
    [class]{MaxHeap}-[func]{Left}

    [class]{MaxHeap}-[func]{Right}

    [class]{MaxHeap}-[func]{Parent}
    ```

=== "Go"

    ```go title="my_heap.go"
    [class]{maxHeap}-[func]{left}

    [class]{maxHeap}-[func]{right}

    [class]{maxHeap}-[func]{parent}
    ```

=== "Swift"

    ```swift title="my_heap.swift"
    [class]{MaxHeap}-[func]{left}

    [class]{MaxHeap}-[func]{right}

    [class]{MaxHeap}-[func]{parent}
    ```

=== "JS"

    ```javascript title="my_heap.js"
    [class]{MaxHeap}-[func]{left}

    [class]{MaxHeap}-[func]{right}

    [class]{MaxHeap}-[func]{parent}
    ```

=== "TS"

    ```typescript title="my_heap.ts"
    [class]{MaxHeap}-[func]{left}

    [class]{MaxHeap}-[func]{right}

    [class]{MaxHeap}-[func]{parent}
    ```

=== "Dart"

    ```dart title="my_heap.dart"
    [class]{MaxHeap}-[func]{_left}

    [class]{MaxHeap}-[func]{_right}

    [class]{MaxHeap}-[func]{_parent}
    ```

=== "Rust"

    ```rust title="my_heap.rs"
    [class]{MaxHeap}-[func]{left}

    [class]{MaxHeap}-[func]{right}

    [class]{MaxHeap}-[func]{parent}
    ```

=== "C"

    ```c title="my_heap.c"
    [class]{MaxHeap}-[func]{left}

    [class]{MaxHeap}-[func]{right}

    [class]{MaxHeap}-[func]{parent}
    ```

=== "Kotlin"

    ```kotlin title="my_heap.kt"
    [class]{MaxHeap}-[func]{left}

    [class]{MaxHeap}-[func]{right}

    [class]{MaxHeap}-[func]{parent}
    ```

=== "Ruby"

    ```ruby title="my_heap.rb"
    [class]{MaxHeap}-[func]{left}

    [class]{MaxHeap}-[func]{right}

    [class]{MaxHeap}-[func]{parent}
    ```

=== "Zig"

    ```zig title="my_heap.zig"
    [class]{MaxHeap}-[func]{left}

    [class]{MaxHeap}-[func]{right}

    [class]{MaxHeap}-[func]{parent}
    ```

### 2. Accessing the top element of the heap

The top element of the heap is the root node of the binary tree, which is also the first element of the list:

=== "Python"

    ```python title="my_heap.py"
    def peek(self) -> int:
        """Access heap top element"""
        return self.max_heap[0]
    ```

=== "C++"

    ```cpp title="my_heap.cpp"
    [class]{MaxHeap}-[func]{peek}
    ```

=== "Java"

    ```java title="my_heap.java"
    /* Access heap top element */
    int peek() {
        return maxHeap.get(0);
    }
    ```

=== "C#"

    ```csharp title="my_heap.cs"
    [class]{MaxHeap}-[func]{Peek}
    ```

=== "Go"

    ```go title="my_heap.go"
    [class]{maxHeap}-[func]{peek}
    ```

=== "Swift"

    ```swift title="my_heap.swift"
    [class]{MaxHeap}-[func]{peek}
    ```

=== "JS"

    ```javascript title="my_heap.js"
    [class]{MaxHeap}-[func]{peek}
    ```

=== "TS"

    ```typescript title="my_heap.ts"
    [class]{MaxHeap}-[func]{peek}
    ```

=== "Dart"

    ```dart title="my_heap.dart"
    [class]{MaxHeap}-[func]{peek}
    ```

=== "Rust"

    ```rust title="my_heap.rs"
    [class]{MaxHeap}-[func]{peek}
    ```

=== "C"

    ```c title="my_heap.c"
    [class]{MaxHeap}-[func]{peek}
    ```

=== "Kotlin"

    ```kotlin title="my_heap.kt"
    [class]{MaxHeap}-[func]{peek}
    ```

=== "Ruby"

    ```ruby title="my_heap.rb"
    [class]{MaxHeap}-[func]{peek}
    ```

=== "Zig"

    ```zig title="my_heap.zig"
    [class]{MaxHeap}-[func]{peek}
    ```

### 3. Inserting an element into the heap

Given an element `val`, we first add it to the bottom of the heap. After the addition, since `val` may be larger than other elements in the heap, the heap property may be violated, **so it is necessary to repair the nodes on the path from the inserted node to the root node**. This operation is called heapifying.
Starting from the inserted node, **perform heapify from bottom to top**. As shown in Figure 8-3, we compare the value of the inserted node with its parent node, and if the inserted node is larger, we swap them. We then repeat this operation, repairing each node in the heap from bottom to top until we pass the root node or encounter a node that does not need to be swapped.

=== "<1>"
    ![Steps of element insertion into the heap](heap.assets/heap_push_step1.png){ class="animation-figure" }

=== "<2>"
    ![heap_push_step2](heap.assets/heap_push_step2.png){ class="animation-figure" }

=== "<3>"
    ![heap_push_step3](heap.assets/heap_push_step3.png){ class="animation-figure" }

=== "<4>"
    ![heap_push_step4](heap.assets/heap_push_step4.png){ class="animation-figure" }

=== "<5>"
    ![heap_push_step5](heap.assets/heap_push_step5.png){ class="animation-figure" }

=== "<6>"
    ![heap_push_step6](heap.assets/heap_push_step6.png){ class="animation-figure" }

=== "<7>"
    ![heap_push_step7](heap.assets/heap_push_step7.png){ class="animation-figure" }

=== "<8>"
    ![heap_push_step8](heap.assets/heap_push_step8.png){ class="animation-figure" }

=== "<9>"
    ![heap_push_step9](heap.assets/heap_push_step9.png){ class="animation-figure" }

Figure 8-3 Steps of element insertion into the heap
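The bottom-to-top repair can be traced on a small list. The following is a minimal standalone sketch, not the book's `MaxHeap` class: the function name `push_with_trace` and the sample values are hypothetical, and the loop stops at index 0 rather than testing for a negative parent index.

```python
def push_with_trace(heap: list[int], val: int) -> list[tuple[int, int]]:
    """Append val to a max heap stored in a list, sift it up,
    and return the (child index, parent index) pairs that were swapped."""
    heap.append(val)
    i = len(heap) - 1
    swaps: list[tuple[int, int]] = []
    while i > 0:
        p = (i - 1) // 2              # parent index of node i
        if heap[i] <= heap[p]:
            break                     # node needs no repair
        heap[i], heap[p] = heap[p], heap[i]
        swaps.append((i, p))
        i = p                         # continue from the parent's position
    return swaps


heap = [9, 8, 6, 6, 7, 5, 2]          # sample max heap, for illustration only
swaps = push_with_trace(heap, 10)
# swaps is [(7, 3), (3, 1), (1, 0)]
# heap is now [10, 9, 6, 8, 7, 5, 2, 6]
```

Inserting a value larger than every existing element is the worst case: the new node travels the full path from the bottom of the heap to the root, one swap per level.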
Given a total of $n$ nodes, the height of the tree is $O(\log n)$. Hence, the heapify operation loops at most $O(\log n)$ times, **making the time complexity of the element insertion operation $O(\log n)$**. The code is as follows:

=== "Python"

    ```python title="my_heap.py"
    def push(self, val: int):
        """Push the element into heap"""
        # Add node
        self.max_heap.append(val)
        # Heapify from bottom to top
        self.sift_up(self.size() - 1)

    def sift_up(self, i: int):
        """Start heapifying node i, from bottom to top"""
        while True:
            # Get parent node of node i
            p = self.parent(i)
            # When "crossing the root node" or "node does not need repair", end heapification
            if p < 0 or self.max_heap[i] <= self.max_heap[p]:
                break
            # Swap two nodes
            self.swap(i, p)
            # Continue heapifying upward
            i = p
    ```

=== "C++"

    ```cpp title="my_heap.cpp"
    [class]{MaxHeap}-[func]{push}

    [class]{MaxHeap}-[func]{siftUp}
    ```

=== "Java"

    ```java title="my_heap.java"
    /* Push the element into heap */
    void push(int val) {
        // Add node
        maxHeap.add(val);
        // Heapify from bottom to top
        siftUp(size() - 1);
    }

    /* Start heapifying node i, from bottom to top */
    void siftUp(int i) {
        while (true) {
            // Get parent node of node i
            int p = parent(i);
            // When "crossing the root node" or "node does not need repair", end heapification
            if (p < 0 || maxHeap.get(i) <= maxHeap.get(p))
                break;
            // Swap two nodes
            swap(i, p);
            // Continue heapifying upward
            i = p;
        }
    }
    ```

=== "C#"

    ```csharp title="my_heap.cs"
    [class]{MaxHeap}-[func]{Push}

    [class]{MaxHeap}-[func]{SiftUp}
    ```

=== "Go"

    ```go title="my_heap.go"
    [class]{maxHeap}-[func]{push}

    [class]{maxHeap}-[func]{siftUp}
    ```

=== "Swift"

    ```swift title="my_heap.swift"
    [class]{MaxHeap}-[func]{push}

    [class]{MaxHeap}-[func]{siftUp}
    ```

=== "JS"

    ```javascript title="my_heap.js"
    [class]{MaxHeap}-[func]{push}

    [class]{MaxHeap}-[func]{siftUp}
    ```

=== "TS"

    ```typescript title="my_heap.ts"
    [class]{MaxHeap}-[func]{push}

    [class]{MaxHeap}-[func]{siftUp}
    ```

=== "Dart"

    ```dart title="my_heap.dart"
    [class]{MaxHeap}-[func]{push}

    [class]{MaxHeap}-[func]{siftUp}
    ```

=== "Rust"

    ```rust title="my_heap.rs"
    [class]{MaxHeap}-[func]{push}

    [class]{MaxHeap}-[func]{sift_up}
    ```

=== "C"

    ```c title="my_heap.c"
    [class]{MaxHeap}-[func]{push}

    [class]{MaxHeap}-[func]{siftUp}
    ```

=== "Kotlin"

    ```kotlin title="my_heap.kt"
    [class]{MaxHeap}-[func]{push}

    [class]{MaxHeap}-[func]{siftUp}
    ```

=== "Ruby"

    ```ruby title="my_heap.rb"
    [class]{MaxHeap}-[func]{push}

    [class]{MaxHeap}-[func]{sift_up}
    ```

=== "Zig"

    ```zig title="my_heap.zig"
    [class]{MaxHeap}-[func]{push}

    [class]{MaxHeap}-[func]{siftUp}
    ```

### 4. Removing the top element from the heap

The top element of the heap is the root node of the binary tree, that is, the first element of the list. If we directly removed the first element from the list, all node indexes in the binary tree would change, making it difficult to repair the heap with heapify afterward. To minimize changes in element indexes, we use the following steps.

1. Swap the top element with the bottom element of the heap (swap the root node with the rightmost leaf node).
2. After swapping, remove the bottom of the heap from the list (note that, because of the swap, what is actually removed is the original top element).
3. Starting from the root node, **perform heapify from top to bottom**.

As shown in Figure 8-4, **the direction of "heapify from top to bottom" is opposite to "heapify from bottom to top"**. We compare the value of the root node with its two children and swap it with the larger child. We then repeat this operation until we pass a leaf node or encounter a node that does not need to be swapped.
=== "<1>"
    ![Steps of removing the top element from the heap](heap.assets/heap_pop_step1.png){ class="animation-figure" }

=== "<2>"
    ![heap_pop_step2](heap.assets/heap_pop_step2.png){ class="animation-figure" }

=== "<3>"
    ![heap_pop_step3](heap.assets/heap_pop_step3.png){ class="animation-figure" }

=== "<4>"
    ![heap_pop_step4](heap.assets/heap_pop_step4.png){ class="animation-figure" }

=== "<5>"
    ![heap_pop_step5](heap.assets/heap_pop_step5.png){ class="animation-figure" }

=== "<6>"
    ![heap_pop_step6](heap.assets/heap_pop_step6.png){ class="animation-figure" }

=== "<7>"
    ![heap_pop_step7](heap.assets/heap_pop_step7.png){ class="animation-figure" }

=== "<8>"
    ![heap_pop_step8](heap.assets/heap_pop_step8.png){ class="animation-figure" }

=== "<9>"
    ![heap_pop_step9](heap.assets/heap_pop_step9.png){ class="animation-figure" }

=== "<10>"
    ![heap_pop_step10](heap.assets/heap_pop_step10.png){ class="animation-figure" }

Figure 8-4 Steps of removing the top element from the heap
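To see why the swap-then-remove order matters, the sketch below compares the two approaches on a plain list. It is a standalone illustration under assumed sample values; the helper names `is_max_heap` and `pop_max` are hypothetical and not part of the book's `MaxHeap` class.

```python
def is_max_heap(nums: list[int]) -> bool:
    """Every node except the root must be no larger than its parent."""
    return all(nums[(i - 1) // 2] >= nums[i] for i in range(1, len(nums)))


def pop_max(heap: list[int]) -> int:
    """Swap root and last element, remove the last, then sift the new root down."""
    if not heap:
        raise IndexError("Heap is empty")
    heap[0], heap[-1] = heap[-1], heap[0]
    val = heap.pop()
    i, n = 0, len(heap)
    while True:
        l, r, ma = 2 * i + 1, 2 * i + 2, i
        if l < n and heap[l] > heap[ma]:
            ma = l
        if r < n and heap[r] > heap[ma]:
            ma = r
        if ma == i:                   # node i already dominates its children
            break
        heap[i], heap[ma] = heap[ma], heap[i]
        i = ma
    return val


heap = [9, 8, 6, 6, 7, 5, 2]          # sample max heap, for illustration only

# Naive approach: dropping index 0 shifts every index and breaks the parent/child mapping
naive = heap[1:]                      # [8, 6, 6, 7, 5, 2]
naive_ok = is_max_heap(naive)         # False: 7 at index 3 now has parent 6 at index 1

# Swap-then-remove approach: only the root-to-leaf path needs repair
top = pop_max(heap)                   # 9
# heap is now [8, 7, 6, 6, 2, 5]
```

In this example the naive removal leaves a list that is no longer a heap, while the swap-based removal restores the heap property with two swaps along a single downward path.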
Similar to the element insertion operation, the time complexity of the top element removal operation is also $O(\log n)$. The code is as follows:

=== "Python"

    ```python title="my_heap.py"
    def pop(self) -> int:
        """Remove and return the heap top element"""
        # Empty handling
        if self.is_empty():
            raise IndexError("Heap is empty")
        # Swap the root node with the rightmost leaf node (swap the first element with the last element)
        self.swap(0, self.size() - 1)
        # Remove node
        val = self.max_heap.pop()
        # Heapify from top to bottom
        self.sift_down(0)
        # Return the removed heap top element
        return val

    def sift_down(self, i: int):
        """Start heapifying node i, from top to bottom"""
        while True:
            # Determine the largest node among i, l, r, noted as ma
            l, r, ma = self.left(i), self.right(i), i
            if l < self.size() and self.max_heap[l] > self.max_heap[ma]:
                ma = l
            if r < self.size() and self.max_heap[r] > self.max_heap[ma]:
                ma = r
            # If node i is the largest or indices l, r are out of bounds, no further heapification needed, break
            if ma == i:
                break
            # Swap two nodes
            self.swap(i, ma)
            # Continue heapifying downward
            i = ma
    ```

=== "C++"

    ```cpp title="my_heap.cpp"
    [class]{MaxHeap}-[func]{pop}

    [class]{MaxHeap}-[func]{siftDown}
    ```

=== "Java"

    ```java title="my_heap.java"
    /* Remove and return the heap top element */
    int pop() {
        // Empty handling
        if (isEmpty())
            throw new IndexOutOfBoundsException();
        // Swap the root node with the rightmost leaf node (swap the first element with the last element)
        swap(0, size() - 1);
        // Remove node
        int val = maxHeap.remove(size() - 1);
        // Heapify from top to bottom
        siftDown(0);
        // Return the removed heap top element
        return val;
    }

    /* Start heapifying node i, from top to bottom */
    void siftDown(int i) {
        while (true) {
            // Determine the largest node among i, l, r, noted as ma
            int l = left(i), r = right(i), ma = i;
            if (l < size() && maxHeap.get(l) > maxHeap.get(ma))
                ma = l;
            if (r < size() && maxHeap.get(r) > maxHeap.get(ma))
                ma = r;
            // If node i is the largest or indices l, r are out of bounds, no further heapification needed, break
            if (ma == i)
                break;
            // Swap two nodes
            swap(i, ma);
            // Continue heapifying downward
            i = ma;
        }
    }
    ```

=== "C#"

    ```csharp title="my_heap.cs"
    [class]{MaxHeap}-[func]{Pop}

    [class]{MaxHeap}-[func]{SiftDown}
    ```

=== "Go"

    ```go title="my_heap.go"
    [class]{maxHeap}-[func]{pop}

    [class]{maxHeap}-[func]{siftDown}
    ```

=== "Swift"

    ```swift title="my_heap.swift"
    [class]{MaxHeap}-[func]{pop}

    [class]{MaxHeap}-[func]{siftDown}
    ```

=== "JS"

    ```javascript title="my_heap.js"
    [class]{MaxHeap}-[func]{pop}

    [class]{MaxHeap}-[func]{siftDown}
    ```

=== "TS"

    ```typescript title="my_heap.ts"
    [class]{MaxHeap}-[func]{pop}

    [class]{MaxHeap}-[func]{siftDown}
    ```

=== "Dart"

    ```dart title="my_heap.dart"
    [class]{MaxHeap}-[func]{pop}

    [class]{MaxHeap}-[func]{siftDown}
    ```

=== "Rust"

    ```rust title="my_heap.rs"
    [class]{MaxHeap}-[func]{pop}

    [class]{MaxHeap}-[func]{sift_down}
    ```

=== "C"

    ```c title="my_heap.c"
    [class]{MaxHeap}-[func]{pop}

    [class]{MaxHeap}-[func]{siftDown}
    ```

=== "Kotlin"

    ```kotlin title="my_heap.kt"
    [class]{MaxHeap}-[func]{pop}

    [class]{MaxHeap}-[func]{siftDown}
    ```

=== "Ruby"

    ```ruby title="my_heap.rb"
    [class]{MaxHeap}-[func]{pop}

    [class]{MaxHeap}-[func]{sift_down}
    ```

=== "Zig"

    ```zig title="my_heap.zig"
    [class]{MaxHeap}-[func]{pop}

    [class]{MaxHeap}-[func]{siftDown}
    ```

## 8.1.3 Common applications of heaps

- **Priority Queue**: Heaps are often the preferred data structure for implementing priority queues. Both enqueue and dequeue operations have a time complexity of $O(\log n)$, and building a queue from $n$ elements has a time complexity of $O(n)$, all of which are very efficient.
- **Heap Sort**: Given a set of data, we can create a heap from it and then repeatedly remove the top element to obtain ordered data. However, we usually use a more elegant method to implement heap sort, as detailed in the "Heap Sort" section.
- **Finding the Largest $k$ Elements**: This is a classic algorithm problem with many practical uses, such as selecting the top 10 trending topics for Weibo hot search or picking the 10 best-selling products.
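
As a rough illustration of the last application, one common approach keeps a min heap of size $k$ whose top is the smallest of the current $k$ largest values. The sketch below uses Python's standard-library `heapq`; the function name `top_k` and the sample data are hypothetical.

```python
import heapq


def top_k(nums: list[int], k: int) -> list[int]:
    """Return the k largest values of nums in descending order, using a min heap of size k."""
    if k <= 0:
        return []
    heap = nums[:k]                   # the first k values form the initial candidate set
    heapq.heapify(heap)               # min heap: heap[0] is the smallest candidate
    for val in nums[k:]:
        if val > heap[0]:
            # Replace the smallest candidate with the larger value in one step
            heapq.heapreplace(heap, val)
    return sorted(heap, reverse=True)


result = top_k([1, 7, 6, 3, 2], 3)
# result is [7, 6, 3]
```

Each of the $n$ values triggers at most one $O(\log k)$ heap operation, so the scan takes $O(n \log k)$ time. For everyday use, `heapq.nlargest(k, nums)` from the standard library gives the same result.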