7.1 KiB
comments |
---|
true |
8.3 Top-k problem
!!! question
Given an unordered array `nums` of length $n$, return the largest $k$ elements in the array.
For this problem, we will first introduce two straightforward solutions, then explain a more efficient heap-based method.
8.3.1 Method 1: Iterative selection
We can perform k
rounds of iterations as shown in Figure 8-6, extracting the 1^{st}
, 2^{nd}
, \dots
, k^{th}
largest elements in each round, with a time complexity of O(nk)
.
This method is only suitable when k \ll n
, as the time complexity approaches O(n^2)
when k
is close to n
, which is very time-consuming.
Figure 8-6 Iteratively finding the largest k elements
!!! tip
When $k = n$, we can obtain a complete ordered sequence, which is equivalent to the "selection sort" algorithm.
8.3.2 Method 2: Sorting
As shown in Figure 8-7, we can first sort the array nums
and then return the last k
elements, with a time complexity of O(n \log n)
.
Clearly, this method "overachieves" the task, as we only need to find the largest k
elements, without the need to sort the other elements.
Figure 8-7 Sorting to find the largest k elements
8.3.3 Method 3: Heap
We can solve the Top-k problem more efficiently based on heaps, as shown in the following process.
- Initialize a min heap, where the top element is the smallest.
- First, insert the first
k
elements of the array into the heap. - Starting from the
k + 1^{th}
element, if the current element is greater than the top element of the heap, remove the top element of the heap and insert the current element into the heap. - After completing the traversal, the heap contains the largest
k
elements.
=== "<1>" { class="animation-figure" }
=== "<2>" { class="animation-figure" }
=== "<3>" { class="animation-figure" }
=== "<4>" { class="animation-figure" }
=== "<5>" { class="animation-figure" }
=== "<6>" { class="animation-figure" }
=== "<7>" { class="animation-figure" }
=== "<8>" { class="animation-figure" }
=== "<9>" { class="animation-figure" }
Figure 8-8 Find the largest k elements based on heap
Example code is as follows:
=== "Python"
```python title="top_k.py"
def top_k_heap(nums: list[int], k: int) -> list[int]:
"""Using heap to find the largest k elements in an array"""
# Initialize min-heap
heap = []
# Enter the first k elements of the array into the heap
for i in range(k):
heapq.heappush(heap, nums[i])
# From the k+1th element, keep the heap length as k
for i in range(k, len(nums)):
# If the current element is larger than the heap top element, remove the heap top element and enter the current element into the heap
if nums[i] > heap[0]:
heapq.heappop(heap)
heapq.heappush(heap, nums[i])
return heap
```
=== "C++"
```cpp title="top_k.cpp"
/* Using heap to find the largest k elements in an array */
priority_queue<int, vector<int>, greater<int>> topKHeap(vector<int> &nums, int k) {
// Initialize min-heap
priority_queue<int, vector<int>, greater<int>> heap;
// Enter the first k elements of the array into the heap
for (int i = 0; i < k; i++) {
heap.push(nums[i]);
}
// From the k+1th element, keep the heap length as k
for (int i = k; i < nums.size(); i++) {
// If the current element is larger than the heap top element, remove the heap top element and enter the current element into the heap
if (nums[i] > heap.top()) {
heap.pop();
heap.push(nums[i]);
}
}
return heap;
}
```
=== "Java"
```java title="top_k.java"
/* Using heap to find the largest k elements in an array */
Queue<Integer> topKHeap(int[] nums, int k) {
// Initialize min-heap
Queue<Integer> heap = new PriorityQueue<Integer>();
// Enter the first k elements of the array into the heap
for (int i = 0; i < k; i++) {
heap.offer(nums[i]);
}
// From the k+1th element, keep the heap length as k
for (int i = k; i < nums.length; i++) {
// If the current element is larger than the heap top element, remove the heap top element and enter the current element into the heap
if (nums[i] > heap.peek()) {
heap.poll();
heap.offer(nums[i]);
}
}
return heap;
}
```
=== "C#"
```csharp title="top_k.cs"
[class]{top_k}-[func]{TopKHeap}
```
=== "Go"
```go title="top_k.go"
[class]{}-[func]{topKHeap}
```
=== "Swift"
```swift title="top_k.swift"
[class]{}-[func]{topKHeap}
```
=== "JS"
```javascript title="top_k.js"
[class]{}-[func]{pushMinHeap}
[class]{}-[func]{popMinHeap}
[class]{}-[func]{peekMinHeap}
[class]{}-[func]{getMinHeap}
[class]{}-[func]{topKHeap}
```
=== "TS"
```typescript title="top_k.ts"
[class]{}-[func]{pushMinHeap}
[class]{}-[func]{popMinHeap}
[class]{}-[func]{peekMinHeap}
[class]{}-[func]{getMinHeap}
[class]{}-[func]{topKHeap}
```
=== "Dart"
```dart title="top_k.dart"
[class]{}-[func]{topKHeap}
```
=== "Rust"
```rust title="top_k.rs"
[class]{}-[func]{top_k_heap}
```
=== "C"
```c title="top_k.c"
[class]{}-[func]{pushMinHeap}
[class]{}-[func]{popMinHeap}
[class]{}-[func]{peekMinHeap}
[class]{}-[func]{getMinHeap}
[class]{}-[func]{topKHeap}
```
=== "Kotlin"
```kotlin title="top_k.kt"
[class]{}-[func]{topKHeap}
```
=== "Ruby"
```ruby title="top_k.rb"
[class]{}-[func]{top_k_heap}
```
=== "Zig"
```zig title="top_k.zig"
[class]{}-[func]{topKHeap}
```
A total of n
rounds of heap insertions and deletions are performed, with the maximum heap size being k
, hence the time complexity is O(n \log k)
. This method is very efficient; when k
is small, the time complexity tends towards O(n)
; when k
is large, the time complexity will not exceed O(n \log n)
.
Additionally, this method is suitable for scenarios with dynamic data streams. By continuously adding data, we can maintain the elements within the heap, thereby achieving dynamic updates of the largest k
elements.