Compare commits

...

9 commits

3 changed files with 31 additions and 31 deletions


@ -6,9 +6,9 @@
Given a sorted array `nums` of length $n$, which may contain duplicate elements, return the index of the leftmost element `target`. If the element is not present in the array, return $-1$.
Recall the method of binary search for an insertion point, after the search is completed, $i$ points to the leftmost `target`, **thus searching for the insertion point is essentially searching for the index of the leftmost `target`**.
Recall the binary search for an insertion point: after the search completes, index $i$ points to the leftmost occurrence of `target`. Therefore, **searching for the insertion point is essentially the same as finding the index of the leftmost `target`**.
Consider implementing the search for the left boundary using the function for finding an insertion point. Note that the array might not contain `target`, which could lead to the following two results:
We can use the function for finding an insertion point to find the left boundary of `target`. Note that the array might not contain `target`, which could lead to the following two results:
- The index $i$ of the insertion point is out of bounds.
- The element `nums[i]` is not equal to `target`.
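The two failure cases above can be checked right after the insertion point is found. The following is a minimal sketch, not the repository's implementation: `binary_search_insertion` is written out here as an assumed closed-interval helper so that the example is self-contained.

```python
def binary_search_insertion(nums: list[int], target: int) -> int:
    """Return the leftmost insertion point of target (assumed helper)."""
    i, j = 0, len(nums) - 1  # closed interval [i, j]
    while i <= j:
        m = (i + j) // 2
        if nums[m] < target:
            i = m + 1  # the insertion point is in [m + 1, j]
        else:
            j = m - 1  # nums[m] >= target: keep looking to the left
    return i


def binary_search_left_edge(nums: list[int], target: int) -> int:
    """Return the index of the leftmost target, or -1 if it is absent."""
    i = binary_search_insertion(nums, target)
    # Case 1: i is out of bounds. Case 2: nums[i] is not target.
    if i == len(nums) or nums[i] != target:
        return -1
    return i
```

For example, with `nums = [1, 3, 6, 6, 6, 6, 6, 10, 12, 15]`, searching for `6` yields index `2`, while searching for `7` or `20` yields `-1`.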
@ -21,27 +21,27 @@ In these cases, simply return $-1$. The code is as follows:
## Find the right boundary
So how do we find the rightmost `target`? The most straightforward way is to modify the code, replacing the pointer contraction operation in the case of `nums[m] == target`. The code is omitted here, but interested readers can implement it on their own.
How do we find the rightmost occurrence of `target`? The most straightforward way is to modify the traditional binary search logic by changing how we adjust the search boundaries in the case of `nums[m] == target`. The code is omitted here. If you are interested, try to implement the code on your own.
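One possible way to make that modification is sketched below. It is only an illustration under the closed-interval convention, and the function name `binary_search_right_edge_direct` is hypothetical. When `nums[m] == target`, the left pointer moves right instead of the right pointer moving left, so $j$ ends on the rightmost `target`.

```python
def binary_search_right_edge_direct(nums: list[int], target: int) -> int:
    """Return the index of the rightmost target, or -1 if it is absent."""
    i, j = 0, len(nums) - 1  # closed interval [i, j]
    while i <= j:
        m = (i + j) // 2
        if nums[m] > target:
            j = m - 1  # the rightmost target is in [i, m - 1]
        else:
            i = m + 1  # nums[m] <= target: keep moving right, even on equality
    # After the loop, j points to the last element <= target (or is -1).
    if j < 0 or nums[j] != target:
        return -1
    return j
```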
Below we introduce two more cunning methods.
Below, we introduce two more ingenious methods.
### Reusing the search for the left boundary
### Reuse the left boundary search
In fact, we can use the function for finding the leftmost element to find the rightmost element, specifically by **transforming the search for the rightmost `target` into a search for the leftmost `target + 1`**.
To find the rightmost occurrence of `target`, we can reuse the function for finding the leftmost occurrence. Specifically, **we transform the search for the rightmost `target` into a search for the leftmost `target + 1`**: the element immediately before that position is the rightmost `target`.
As shown in the figure below, after the search is completed, the pointer $i$ points to the leftmost `target + 1` (if it exists), while $j$ points to the rightmost `target`, **thus returning $j$ is sufficient**.
As shown in the figure below, after the search is complete, pointer $i$ points to the leftmost `target + 1` (if it exists), and pointer $j$ points to the rightmost `target`. Therefore, returning $j$ gives us the right boundary.
![Transforming the search for the right boundary into the search for the left boundary](binary_search_edge.assets/binary_search_right_edge_by_left_edge.png)
Please note, the insertion point returned is $i$, therefore, it should be subtracted by $1$ to obtain $j$:
Note that the insertion point returned is $i$, so we subtract $1$ from it to obtain $j$:
```src
[file]{binary_search_edge}-[class]{}-[func]{binary_search_right_edge}
```
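Since the referenced source is not shown here, the following is a hedged sketch of what such a function might look like, assuming integer elements. The helper `binary_search_insertion` is an assumed closed-interval implementation included only to keep the example self-contained.

```python
def binary_search_insertion(nums: list[int], target: int) -> int:
    """Return the leftmost insertion point of target (assumed helper)."""
    i, j = 0, len(nums) - 1
    while i <= j:
        m = (i + j) // 2
        if nums[m] < target:
            i = m + 1
        else:
            j = m - 1
    return i


def binary_search_right_edge(nums: list[int], target: int) -> int:
    """Return the index of the rightmost target, or -1 if it is absent."""
    # Search for the leftmost target + 1; this works even if it is not present.
    i = binary_search_insertion(nums, target + 1)
    j = i - 1  # j is the candidate position of the rightmost target
    if j == -1 or nums[j] != target:
        return -1
    return j
```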
### Transforming into an element search
### Transform into an element search
We know that when the array does not contain `target`, $i$ and $j$ will eventually point to the first element greater and smaller than `target` respectively.
When the array does not contain `target`, $i$ and $j$ will eventually point to the first element greater than `target` and the first element smaller than `target`, respectively.
Thus, as shown in the figure below, we can construct an element that does not exist in the array, to search for the left and right boundaries.
@ -50,7 +50,7 @@ Thus, as shown in the figure below, we can construct an element that does not ex
![Transforming the search for boundaries into the search for an element](binary_search_edge.assets/binary_search_edge_by_element.png)
The code is omitted here, but two points are worth noting.
The code is omitted here, but two points about this approach are worth noting.
- The given array does not contain decimals, meaning we do not need to worry about how to handle equal situations.
- Since this method introduces decimals, the variable `target` in the function needs to be changed to a floating point type (no change needed in Python).
- The given array `nums` does not contain decimals, so handling equal cases is not a concern.
- However, because this approach introduces decimals, the `target` variable must be changed to a floating-point type (no change needed in Python).
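A minimal Python sketch of the element-search idea follows. It assumes `nums` contains only integers and uses an offset of `0.5` to construct the non-existent elements; both the offset and the function names are illustrative assumptions rather than the omitted reference code.

```python
def search_missing(nums: list[int], x: float) -> tuple[int, int]:
    """Binary search for a value x that is known not to be in nums.

    Returns (i, j): i is the index of the first element greater than x,
    and j is the index of the first element smaller than x.
    """
    i, j = 0, len(nums) - 1
    while i <= j:
        m = (i + j) // 2
        if nums[m] < x:
            i = m + 1
        else:
            j = m - 1  # nums[m] > x; equality cannot occur
    return i, j


def left_edge(nums: list[int], target: int) -> int:
    """Leftmost index of target, found by searching for target - 0.5."""
    i, _ = search_missing(nums, target - 0.5)
    if i == len(nums) or nums[i] != target:
        return -1
    return i


def right_edge(nums: list[int], target: int) -> int:
    """Rightmost index of target, found by searching for target + 0.5."""
    _, j = search_missing(nums, target + 0.5)
    if j < 0 or nums[j] != target:
        return -1
    return j
```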


@ -6,21 +6,21 @@ Binary search is not only used to search for target elements but also to solve m
!!! question
Given an ordered array `nums` of length $n$ and an element `target`, where the array has no duplicate elements. Now insert `target` into the array `nums` while maintaining its order. If the element `target` already exists in the array, insert it to its left side. Please return the index of `target` in the array after insertion. See the example shown in the figure below.
Given a sorted array `nums` of length $n$ with unique elements and an element `target`, insert `target` into `nums` while maintaining its sorted order. If `target` already exists in the array, insert it to the left of the existing element. Return the index of `target` in the array after insertion. See the example shown in the figure below.
![Example data for binary search insertion point](binary_search_insertion.assets/binary_search_insertion_example.png)
If you want to reuse the binary search code from the previous section, you need to answer the following two questions.
**Question one**: When the array contains `target`, is the insertion point index the index of that element?
**Question one**: If the array already contains `target`, is the insertion point the index of the existing element?
The requirement to insert `target` to the left of equal elements means that the newly inserted `target` replaces the original `target` position. Thus, **when the array contains `target`, the insertion point index is the index of that `target`**.
The requirement to insert `target` to the left of equal elements means that the newly inserted `target` will replace the original `target` position. In other words, **when the array contains `target`, the insertion point is indeed the index of that `target`**.
**Question two**: When the array does not contain `target`, what is the index of the insertion point?
**Question two**: When the array does not contain `target`, at which index would it be inserted?
Further consider the binary search process: when `nums[m] < target`, pointer $i$ moves, meaning that pointer $i$ is approaching an element greater than or equal to `target`. Similarly, pointer $j$ is always approaching an element less than or equal to `target`.
Let's further consider the binary search process: when `nums[m] < target`, pointer $i$ moves, meaning that pointer $i$ is approaching an element greater than or equal to `target`. Similarly, pointer $j$ is always approaching an element less than or equal to `target`.
Therefore, at the end of the binary search, it is certain that: $i$ points to the first element greater than `target`, and $j$ points to the first element less than `target`. **It is easy to see that when the array does not contain `target`, the insertion index is $i$**. The code is as follows:
Therefore, at the end of the binary search, it is certain that $i$ points to the first element greater than `target`, and $j$ points to the first element less than `target`. **It is easy to see that when the array does not contain `target`, the insertion point is $i$**. The code is as follows:
```src
[file]{binary_search_insertion}-[class]{}-[func]{binary_search_insertion_simple}
@ -32,21 +32,21 @@ Therefore, at the end of the binary, it is certain that: $i$ points to the first
Based on the previous question, assume the array may contain duplicate elements, all else remains the same.
Suppose there are multiple `target`s in the array, ordinary binary search can only return the index of one of the `target`s, **and it cannot determine how many `target`s are to the left and right of that element**.
When there are multiple occurrences of `target` in the array, a regular binary search can only return the index of one occurrence of `target`, **and it cannot determine how many occurrences of `target` are to the left and right of that position**.
The task requires inserting the target element to the very left, **so we need to find the index of the leftmost `target` in the array**. Initially consider implementing this through the steps shown in the figure below.
The problem requires inserting the target element to the very left, **so we need to find the index of the leftmost `target` in the array**. As a first attempt, consider implementing this through the steps shown in the figure below.
1. Perform a binary search, get an arbitrary index of `target`, denoted as $k$.
2. Start from index $k$, and perform a linear search to the left until the leftmost `target` is found and return.
1. Perform a binary search to find any index of `target`, say $k$.
2. Starting from index $k$, conduct a linear traversal to the left until the leftmost occurrence of `target` is found, then return this index.
![Linear search for the insertion point of duplicate elements](binary_search_insertion.assets/binary_search_insertion_naive.png)
Although this method is feasible, it includes linear search, so its time complexity is $O(n)$. This method is inefficient when the array contains many duplicate `target`s.
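The two-step approach described above can be sketched as follows. This is only an illustration, and the function name `binary_search_insertion_naive` is hypothetical. The inner `while` loop is the linear part that degrades the running time to $O(n)$ when there are many duplicates.

```python
def binary_search_insertion_naive(nums: list[int], target: int) -> int:
    """Insertion point via binary search plus a linear walk to the left."""
    i, j = 0, len(nums) - 1
    while i <= j:
        m = (i + j) // 2
        if nums[m] < target:
            i = m + 1
        elif nums[m] > target:
            j = m - 1
        else:
            # Step 1: found some occurrence of target at index k = m.
            k = m
            # Step 2: walk left until the leftmost occurrence is reached.
            while k > 0 and nums[k - 1] == target:
                k -= 1
            return k
    # target is absent: i is the insertion point.
    return i
```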
Now consider extending the binary search code. As shown in the figure below, the overall process remains the same, each round first calculates the midpoint index $m$, then judges the size relationship between `target` and `nums[m]`, divided into the following cases.
Now consider extending the binary search code. As shown in the figure below, the overall process remains the same. In each round, we first calculate the middle index $m$, then compare the value of `target` with `nums[m]`, which results in the following cases.
- When `nums[m] < target` or `nums[m] > target`, it means `target` has not been found yet, thus use the normal binary search interval reduction operation, **thus making pointers $i$ and $j$ approach `target`**.
- When `nums[m] == target`, it indicates that the elements less than `target` are in the interval $[i, m - 1]$, therefore use $j = m - 1$ to narrow the interval, **thus making pointer $j$ approach elements less than `target`**.
- When `nums[m] < target` or `nums[m] > target`, `target` has not been found yet, so we narrow the search range as in a normal binary search, **bringing pointers $i$ and $j$ closer to `target`**.
- When `nums[m] == target`, the elements less than `target` are in the range $[i, m - 1]$, so we use $j = m - 1$ to narrow the range, **bringing pointer $j$ closer to the elements less than `target`**.
After the loop, $i$ points to the leftmost `target`, and $j$ points to the first element less than `target`, **therefore index $i$ is the insertion point**.
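The three cases can be written out as a short Python sketch. This is an assumed closed-interval version with the branches kept separate, not necessarily identical to the repository's code.

```python
def binary_search_insertion(nums: list[int], target: int) -> int:
    """Return the insertion point of target in a sorted list with duplicates."""
    i, j = 0, len(nums) - 1  # closed interval [i, j]
    while i <= j:
        m = (i + j) // 2
        if nums[m] < target:
            i = m + 1  # target is in [m + 1, j]
        elif nums[m] > target:
            j = m - 1  # target is in [i, m - 1]
        else:
            j = m - 1  # the first element less than target is in [i, m - 1]
    # i points to the leftmost target, or to the insertion point if absent.
    return i
```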
@ -74,9 +74,9 @@ After the loop, $i$ points to the leftmost `target`, and $j$ points to the first
=== "<8>"
![binary_search_insertion_step8](binary_search_insertion.assets/binary_search_insertion_step8.png)
Observe the code, the operations of the branch `nums[m] > target` and `nums[m] == target` are the same, so the two can be combined.
Observe the following code. The operations in the branches `nums[m] > target` and `nums[m] == target` are the same, so these two branches can be merged.
Even so, we can still keep the conditions expanded, as their logic is clearer and more readable.
Even so, we can still keep the conditions expanded, as it makes the logic clearer and improves readability.
```src
[file]{binary_search_insertion}-[class]{}-[func]{binary_search_insertion}
@ -84,8 +84,8 @@ Even so, we can still keep the conditions expanded, as their logic is clearer an
!!! tip
The code in this section uses "closed intervals". Readers interested can implement the "left-closed right-open" method themselves.
The code in this section uses "closed intervals". If you are interested in the "left-closed, right-open" approach, try to implement the code on your own.
In summary, binary search is merely about setting search targets for pointers $i$ and $j$, which might be a specific element (like `target`) or a range of elements (like elements less than `target`).
In summary, binary search essentially involves setting search targets for pointers $i$ and $j$, which might be a specific element (like `target`) or a range of elements (like elements less than `target`).
In the continuous loop of binary search, pointers $i$ and $j$ gradually approach the predefined target. Ultimately, they either find the answer or stop after crossing the boundary.


@ -117,7 +117,7 @@ nav:
# [icon: material/text-search]
- chapter_searching/index.md
- 10.1 Binary search: chapter_searching/binary_search.md
- 10.2 Binary search insertion: chapter_searching/binary_search_insertion.md
- 10.2 Binary search for insertion point: chapter_searching/binary_search_insertion.md
- 10.3 Binary search boundaries: chapter_searching/binary_search_edge.md
- 10.4 Hashing optimization strategies: chapter_searching/replace_linear_by_hashing.md
- 10.5 Search algorithms revisited: chapter_searching/searching_algorithm_revisited.md