When I was learning binary search and saw other people’s implementation, I was confused, I’m sure you did as well. Let’s clear it up. Binary search in the first glance is used to find a value in a sorted sequence, you may write something like this:
def bs(arr, target)
lo, hi = 0, arr.size-1
while lo <= hi
mid = lo + (hi - lo) / 2
if arr[mid] == target
return mid
else if arr[mid] < target
lo = mid + 1
else
hi = mid - 1
end
// target not found
end
Easy right? However sometimes you need more than finding a target. Sometimes you may not have a sequence like an array. Let’s take it further, array lookup is essentially a function: given x, f(x) is the search target, and the equality check for finding the target is also just a predicate over f(x): given predicate p, we want to find the first x such that p(f(x)) is true:
def bs(f, p, lo, hi)
while lo < hi
mid = lo + (hi - lo) / 2
if p(f(mid))
hi = mid
else
lo = mid + 1
end
end
return f(lo) if p(f(lo))
end
The two crucial lines are hi = mid
and lo = mid+1
. When p(mid)
is true, we can discard the second half of the search space, since the predicate is true for all elements in it. However, we can not discard mid itself, since it may well be the first element for which p is true. In a similar vein, if p(mid)
is false, we can discard the first half of the search space, but this time including mid. p(mid)
is false so we don’t need it in our search space. This effectively means we can move the lower bound to mid+1. Also notice that we use lo < hi
stead of lo <= hi
.
For an example, see Leetcode problem 74:
def search_matrix_a(matrix, target)
return false if matrix[0].nil? || matrix[0].empty?
sub = bsearch(matrix) { |sub| target <= sub[-1] }
found = bsearch(sub) { |x| x >= target }
!found.nil? && found == target
end
def bsearch(objs)
return if objs.nil? || objs.empty?
lo, hi = 0, objs.length-1
while lo < hi
mid = lo + (hi-lo) / 2
if yield(objs[mid])
hi = mid
else
lo = mid + 1
end
end
return objs[lo] if yield(objs[lo])
end
A similar problem is Leetcode 378. Given a matrix: [[1, 5, 9], [10, 11, 13], [12, 13, 15]]. When k=8
, mid=(1+15)/2=8
, for each row we find the index of the first element that is larger than mid. Then sum up all the indices and compare it with k: 2+0+0 < k
, which means mid is too big, we need to go ‘left’. If it equals k, it means that we have k numbers that smaller or equal than that number, which is exactly what we want – the kth smallest number. Here we use ruby’s build-in bsearch
method.
def kth_smallest(matrix, k)
lo, hi = matrix[0][0], matrix[-1][-1]
(lo..hi).bsearch do |m|
matrix.map do |x|
(0..x.length).bsearch { |i| i == x.length || x[i] > m }
end.sum >= k
end
end
For a more interesting problem see leetcode 1014, we want to find the least weight capacity(the search space) of the ship such that packages can be shipped within D days(the predicate):
def ship_within_days(weights, d)
lo, hi = weights.max, weights.sum
(lo..hi).bsearch do |load|
days_needed(weights, load) <= d
end
end
def days_needed(weights, load)
current_load, days= 0, 1
weights.each do |w|
current_load += w
if current_load > load
days += 1
current_load = w
end
end
days
end
In conclusion, binary search is much more than something for searching elements in a sorted array. Just keep these steps in mind:
- Find the search space and determine the lower and upper bounds.
- Design the predicate.
- Deal with edge cases.
- Write and test your binary search.