Separate the build_heap from the heap.

2024-12-25 01:16:31 +08:00 · 2023-02-26 20:12:17 +08:00 · 2023-02-26 20:12:17 +08:00 · 5b44ff5397
commit 5b44ff5397
parent 23cda5e225
3 changed files with 120 additions and 108 deletions
--- a/docs/chapter_heap/build_heap.md
+++ b/docs/chapter_heap/build_heap.md
@@ -0,0 +1,119 @@
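+# Building a heap *
+
+If we want to generate a heap from an input list, the operation is called "building a heap".
+
+## Two ways to build a heap
+
+### Building with the push operation
+
+The most direct approach uses the "push element onto heap" operation: create an empty heap, **then push the list elements onto it one by one**.
+
+### Building with the heapify operation
+
+However, **a more efficient way to build a heap exists**. Let the number of nodes be $n$. We first add all list elements to the heap unchanged, **then iterate over the nodes and apply "top-to-bottom heapify" to each**, visiting them in reverse order from the last non-leaf node back to the root so that both subtrees of a node are already heaps when it is processed. Of course, **leaf nodes need no heapify**, because they have no children.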
+
+=== "Java"
+
+    ```java title="my_heap.java"
+    [class]{MaxHeap}-[func]{MaxHeap}
+    ```
+
+=== "C++"
+
+    ```cpp title="my_heap.cpp"
+    [class]{MaxHeap}-[func]{MaxHeap}
+    ```
+
+=== "Python"
+
+    ```python title="my_heap.py"
+    [class]{MaxHeap}-[func]{__init__}
+    ```
+
+=== "Go"
+
+    ```go title="my_heap.go"
+    [class]{maxHeap}-[func]{newMaxHeap}
+    ```
+
+=== "JavaScript"
+
+    ```javascript title="my_heap.js"
+    [class]{MaxHeap}-[func]{constructor}
+    ```
+
+=== "TypeScript"
+
+    ```typescript title="my_heap.ts"
+    [class]{MaxHeap}-[func]{constructor}
+    ```
+
+=== "C"
+
+    ```c title="my_heap.c"
+    [class]{maxHeap}-[func]{newMaxHeap}
+    ```
+
+=== "C#"
+
+    ```csharp title="my_heap.cs"
+    [class]{MaxHeap}-[func]{MaxHeap}
+    ```
+
+=== "Swift"
+
+    ```swift title="my_heap.swift"
+    [class]{MaxHeap}-[func]{init}
+    ```
+
+=== "Zig"
+
+    ```zig title="my_heap.zig"
+    [class]{MaxHeap}-[func]{init}
+    ```
+
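Review note: the per-language tabs pull the constructor from each `my_heap` source file, which this diff does not show. As a rough illustration of the two approaches described in this section, the following Python sketch builds a max heap both ways. The function names (`build_by_push`, `build_by_heapify`, `sift_up`, `sift_down`, `is_max_heap`) are hypothetical and are not taken from the repository's `my_heap.py`.

```python
def sift_up(heap, i):
    """Bottom-to-top heapify: move heap[i] up while it is larger than its parent."""
    while i > 0:
        parent = (i - 1) // 2
        if heap[parent] >= heap[i]:
            break
        heap[parent], heap[i] = heap[i], heap[parent]
        i = parent


def sift_down(heap, i):
    """Top-to-bottom heapify: move heap[i] down while a child is larger."""
    n = len(heap)
    while True:
        left, right, largest = 2 * i + 1, 2 * i + 2, i
        if left < n and heap[left] > heap[largest]:
            largest = left
        if right < n and heap[right] > heap[largest]:
            largest = right
        if largest == i:
            break
        heap[i], heap[largest] = heap[largest], heap[i]
        i = largest


def build_by_push(nums):
    """Method 1: start from an empty heap and push the elements one by one."""
    heap = []
    for x in nums:
        heap.append(x)
        sift_up(heap, len(heap) - 1)
    return heap


def build_by_heapify(nums):
    """Method 2: copy the list as-is, then sift down every non-leaf node,
    from the last non-leaf node back to the root (leaves are skipped)."""
    heap = list(nums)
    for i in range(len(heap) // 2 - 1, -1, -1):
        sift_down(heap, i)
    return heap


def is_max_heap(heap):
    """Check that every node is no larger than its parent."""
    return all(heap[(i - 1) // 2] >= heap[i] for i in range(1, len(heap)))
```

Both functions return a valid max heap over the same elements, although the two arrays may be laid out differently. Only the second method skips the leaves entirely, which is where its efficiency gain comes from.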
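+## Complexity analysis
+
+For the first method, pushing an element takes $O(\log n)$ time, and the heap holds $\frac{n}{2}$ elements on average over the $n$ pushes, so the overall time complexity of this method is $O(n \log n)$.
+
+So what is the time complexity of the second method? Let us work it out.
+
+- In a complete binary tree with $n$ nodes in total, the number of leaf nodes is $(n + 1) / 2$, where $/$ denotes floor division. After excluding the leaves, the number of nodes to heapify is $n / 2$, which is $O(n)$;
+- In top-to-bottom heapify, each node sinks at most to a leaf, so the maximum number of iterations is the height of the binary tree, $O(\log n)$.
+
+Multiplying the two gives a time complexity of $O(n \log n)$. This estimate is still not tight, however, because it ignores the fact that **a binary tree has far more nodes near the bottom than near the top**.
+
+Let us carry out the calculation. To simplify it, assume the tree is a "perfect binary tree"; this assumption does not affect the correctness of the result. Let the binary tree (that is, the heap) have $n$ nodes and height $h$. As noted above, **the maximum number of heapify iterations for a node equals its distance to the leaf level, which is exactly the "node height"**. Summing "number of nodes $\times$ node height" over all levels therefore gives the total number of heapify iterations across all nodes.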
+
+$$
+T(h) = 2^0h + 2^1(h-1) + 2^2(h-2) + \cdots + 2^{(h-1)}\times1
+$$
+
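+![Number of nodes at each level of a perfect binary tree](heap.assets/heapify_operations_count.png)
+
+Simplifying this expression takes some secondary-school knowledge of sequences. First multiply $T(h)$ by $2$, which gives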
+
+$$
+\begin{aligned}
+T(h) & = 2^0h + 2^1(h-1) + 2^2(h-2) + \cdots + 2^{h-1}\times1 \newline
+2 T(h) & = 2^1h + 2^2(h-1) + 2^3(h-2) + \cdots + 2^{h}\times1 \newline
+\end{aligned}
+$$
+
+**Using the shifted-subtraction method**, subtract the upper equation $T(h)$ from the lower equation $2 T(h)$ to obtain
+
+$$
+2T(h) - T(h) = T(h) = -2^0h + 2^1 + 2^2 + \cdots + 2^{h-1} + 2^h
+$$
+
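+In the expression above, the terms $2^1 + 2^2 + \cdots + 2^h$ form a geometric series, so the summation formula applies directly and gives the time complexity
+
+$$
+\begin{aligned}
+T(h) & = 2 \frac{1 - 2^h}{1 - 2} - h \newline
+& = 2^{h+1} - h - 2 \newline
+& = O(2^h)
+\end{aligned}
+$$
+
+Furthermore, a perfect binary tree of height $h$ has $n = 2^{h+1} - 1$ nodes, so the complexity is $O(2^h) = O(n)$. This derivation shows that **building a heap from an input list takes $O(n)$ time, which is very efficient**.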
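Review note: the closed form can be sanity-checked numerically. The following is a minimal sketch, assuming a perfect binary tree whose root has height $h$ and whose leaves have height $0$; `total_sift_down_steps` and `closed_form` are hypothetical helper names. It sums "number of nodes $\times$ node height" level by level and compares the result with $2^{h+1} - h - 2$, which is the geometric sum $2^1 + \cdots + 2^h$ minus $h$.

```python
def total_sift_down_steps(h):
    """Sum of (nodes at depth d) * (node height h - d) over all levels d = 0..h."""
    return sum((2 ** d) * (h - d) for d in range(h + 1))


def closed_form(h):
    """Value of (2^1 + 2^2 + ... + 2^h) - h."""
    return 2 ** (h + 1) - h - 2


if __name__ == "__main__":
    for h in range(11):
        n = 2 ** (h + 1) - 1  # node count of a perfect binary tree of height h
        t = total_sift_down_steps(h)
        # The total number of sift-down steps never exceeds the node count.
        print(f"h={h:2d}  n={n:5d}  T(h)={t:5d}  matches={t == closed_form(h)}")
```

Since $T(h) = n - h - 1$ for a perfect binary tree, the total work stays below $n$, in line with the $O(n)$ bound.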
--- a/docs/chapter_heap/heap.md
+++ b/docs/chapter_heap/heap.md
@@ -708,114 +708,6 @@
    [class]{MaxHeap}-[func]{siftDown}
    ```

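-### Building a heap from input data *
-
-What if we want to take a list as input and build a heap from it directly? The most direct approach uses the "push element onto heap" operation, pushing the list elements one by one. Pushing an element takes $O(\log n)$ time, and the heap holds $\frac{n}{2}$ elements on average, so the overall time complexity of this method is $O(n \log n)$.
-
-However, a more elegant way to build a heap exists. Let the number of nodes be $n$. We first add all list elements to the heap unchanged, **then iterate over the nodes and apply "top-to-bottom heapify" to each**. Of course, **leaf nodes need no heapify**, because they have no children.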
-
-=== "Java"
-
-    ```java title="my_heap.java"
-    [class]{MaxHeap}-[func]{MaxHeap}
-    ```
-
-=== "C++"
-
-    ```cpp title="my_heap.cpp"
-    [class]{MaxHeap}-[func]{MaxHeap}
-    ```
-
-=== "Python"
-
-    ```python title="my_heap.py"
-    [class]{MaxHeap}-[func]{__init__}
-    ```
-
-=== "Go"
-
-    ```go title="my_heap.go"
-    [class]{maxHeap}-[func]{newMaxHeap}
-    ```
-
-=== "JavaScript"
-
-    ```javascript title="my_heap.js"
-    [class]{MaxHeap}-[func]{constructor}
-    ```
-
-=== "TypeScript"
-
-    ```typescript title="my_heap.ts"
-    [class]{MaxHeap}-[func]{constructor}
-    ```
-
-=== "C"
-
-    ```c title="my_heap.c"
-    [class]{maxHeap}-[func]{newMaxHeap}
-    ```
-
-=== "C#"
-
-    ```csharp title="my_heap.cs"
-    [class]{MaxHeap}-[func]{MaxHeap}
-    ```
-
-=== "Swift"
-
-    ```swift title="my_heap.swift"
-    [class]{MaxHeap}-[func]{init}
-    ```
-
-=== "Zig"
-
-    ```zig title="my_heap.zig"
-    [class]{MaxHeap}-[func]{init}
-    ```
-
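-So what is the time complexity of the second method? Let us do a quick estimate.
-
- In a complete binary tree with $n$ nodes in total, the number of leaf nodes is $(n + 1) / 2$, where $/$ denotes floor division. After excluding the leaves, the number of nodes to heapify is $n / 2$, which is $O(n)$;
- In top-to-bottom heapify, each node sinks at most to a leaf, so the maximum number of iterations is the height of the binary tree, $O(\log n)$.
-
-Multiplying the two gives a time complexity of $O(n \log n)$. This estimate is still not tight, however, because it ignores the fact that **a binary tree has far more nodes near the bottom than near the top**.
-
-Let us carry out the calculation. To simplify it, assume the tree is a "perfect binary tree"; this assumption does not affect the correctness of the result. Let the binary tree (that is, the heap) have $n$ nodes and height $h$. As noted above, **the maximum number of heapify iterations for a node equals its distance to the leaf level, which is exactly the "node height"**. Summing "number of nodes $\times$ node height" over all levels therefore gives the total number of heapify iterations across all nodes.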
-
-$$
-T(h) = 2^0h + 2^1(h-1) + 2^2(h-2) + \cdots + 2^{(h-1)}\times1
-$$
-
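-![Number of nodes at each level of a perfect binary tree](heap.assets/heapify_operations_count.png)
-
-Simplifying this expression takes some secondary-school knowledge of sequences. First multiply $T(h)$ by $2$, which gives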
-
-$$
-\begin{aligned}
-T(h) & = 2^0h + 2^1(h-1) + 2^2(h-2) + \cdots + 2^{h-1}\times1 \newline
-2 T(h) & = 2^1h + 2^2(h-1) + 2^3(h-2) + \cdots + 2^{h}\times1 \newline
-\end{aligned}
-$$
-
-**Using the shifted-subtraction method**, subtract the upper equation $T(h)$ from the lower equation $2 T(h)$ to obtain
-
-$$
-2T(h) - T(h) = T(h) = -2^0h + 2^1 + 2^2 + \cdots + 2^{h-1} + 2^h
-$$
-
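-In the expression above, the terms $2^1 + 2^2 + \cdots + 2^h$ form a geometric series, so the summation formula applies directly and gives the time complexity
-
-$$
-\begin{aligned}
-T(h) & = 2 \frac{1 - 2^h}{1 - 2} - h \newline
-& = 2^{h+1} - h - 2 \newline
-& = O(2^h)
-\end{aligned}
-$$
-
-Furthermore, a perfect binary tree of height $h$ has $n = 2^{h+1} - 1$ nodes, so the complexity is $O(2^h) = O(n)$. This derivation shows that **building a heap from an input list takes $O(n)$ time, which is very efficient**.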
-
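 ## Common applications of heaps

 - **Priority queue**. The heap is usually the data structure of choice for implementing a priority queue: enqueue and dequeue take $O(\log n)$ time and building the queue takes $O(n)$, all of which are very efficient.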
--- a/mkdocs.yml
+++ b/mkdocs.yml
@@ -164,6 +164,7 @@ nav:
    - 7.5. &nbsp; Summary: chapter_tree/summary.md
  - 8. &nbsp; &nbsp; Heap:
    - 8.1. &nbsp; Heap: chapter_heap/heap.md
+    - 8.2. &nbsp; Building a heap *: chapter_heap/build_heap.md
  - 9. &nbsp; &nbsp; Graph:
    - 9.1. &nbsp; Graph: chapter_graph/graph.md
    - 9.2. &nbsp; Basic graph operations: chapter_graph/graph_operations.md