A summary of this video.
Solution bags are an alternative way of thinking about dynamic programming that lends itself to optimizations using data structures. Using solution bags we can solve the shortest subarray with sum at least k problem in O(N) time.
Here is the problem description: return the length of the shortest, non-empty, contiguous subarray of an array a with sum at least k. If a only contained positive numbers this problem would be quite simple and could be solved in O(N) time using the two pointers technique, but that’s not the case here: a may contain negative numbers. The O(N^2) solution is obvious: just iterate through all the subarrays and take advantage of precomputation, which in this case is the prefix sum. If we want to optimize further, we must think differently.
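For reference, a minimal sketch of that O(N^2) baseline might look like this (the function name and the prefix array are my own, added for illustration):

def shortest_subarray_brute(a, k)
  prefix = [0]                          # prefix[i] = a[0] + ... + a[i-1]
  a.each { |x| prefix << prefix[-1] + x }
  min = Float::INFINITY
  (0...a.size).each do |i|
    (i...a.size).each do |j|
      # sum(i, j) = prefix[j + 1] - prefix[i]
      min = [min, j - i + 1].min if prefix[j + 1] - prefix[i] >= k
    end
  end
  min == Float::INFINITY ? -1 : min
end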
Let’s first define what a bag is. You can think of a bag as a space which contains the answer to our problem; the space could be ordered like a stack/queue, or it could be a generic ‘bag’ which holds no special meaning yet. In this problem I’ll define a bag as follows: for an array a of size m+1, the bag b is {[0, a[0]+...+a[m]], ..., [i, a[i]+...+a[m]], ..., [m, a[m]]}. In this single bag we can easily find the minimum length which makes a[i]+...+a[m] >= k (or sum(i,m) >= k for short). When adding a new item x to a, to get the next bag we only need to increment every sum in the bag by x (b[i][1] += x for all i) and append the new pair [m+1, x]. Given an array we can generate all the bags by adding one item at a time; each bag then contains {sum(0,j), ..., sum(i,j), ..., sum(j,j)}, and in it we can find the minimum length which makes sum(i,j) >= k.
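A literal, unoptimized translation of this bag idea could look like the following sketch (still O(N^2), since every sum in the bag is incremented for each new item):

def shortest_subarray_bags(a, k)
  b, min = [], Float::INFINITY
  a.size.times do |j|
    b.each { |pair| pair[1] += a[j] }   # b[i][1] += x for all i
    b << [j, a[j]]                      # append the new pair [j, sum(j, j)]
    b.each { |i, s| min = [min, j - i + 1].min if s >= k }
  end
  min == Float::INFINITY ? -1 : min
end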
So far the method above using our fancy bags is still O(N^2)
, nothing special. But if for each bag we can find some invariant to reduce its size, we can have better time complexity overall:
- If sum(i1,j) < sum(i2,j) and i1 < i2, then there is no point in keeping sum(i1,j), because sum(i2,j) is closer to our goal and gives a shorter subarray. In other words, we have this invariant: if i1 < i2, then sum(i1,j) >= sum(i2,j).
- If we find that sum(i,j) >= k, then there is no point in keeping it in the bag for later use, because extending j makes j-i bigger and we already have sum(i,j) >= k.
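To see the pruning in action, here is a small illustrative trace on a = [2, -1, 2] with k = 3, writing each entry as [i, sum(i,j)]:

j = 0: bag = {[0, 2]}                 nothing reaches k yet
j = 1: bag = {[0, 1], [1, -1]}        sums are decreasing from the front
j = 2: [1, 1] is popped (invariant 1: sum(1,2) = 1 < sum(2,2) = 2),
       then [0, 3] is shifted out (invariant 2: 3 >= k), recording length 3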
With these two invariants we can achieve much better time complexity already. There is a slight problem though: to get the next bag we have to do b[i][1] += x for all i, and that is pretty slow. To get rid of that we use a technique called delting: we keep an auxiliary variable delta outside of the bag to represent the overall change to the bag. Each time we take something out of the bag we add delta to that value, and each time before we put something in the bag, we subtract delta from that value.
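Here is a tiny standalone sketch of that bookkeeping (the numbers are only for illustration):

bag, delta = [], 0
bag << (5 - delta)           # put 5 in while delta is 0
delta += 10                  # "add 10 to every element" in O(1)
bag << (7 - delta)           # 7 arrives after the shift, so it must not receive it
bag.map { |v| v + delta }    # => [15, 7]: the 5 got the +10, the 7 did not

Putting the invariants and delting together: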
def shortest_subarray(a, k)
  delta, b, min = 0, [], Float::INFINITY
  a.size.times do |j|
    delta += a[j]                                  # delta = a[0] + ... + a[j]
    # First invariant: pop entries whose sum is smaller than sum(j, j).
    b.pop while !b.empty? && b[-1][1] + delta < a[j]
    b << [j, a[j] - delta]                         # store sum(j, j), compensated by delta
    # Second invariant: consume entries that already reach k.
    while !b.empty? && b[0][1] + delta >= k
      i, _ = b.shift
      min = [min, j - i + 1].min
    end
  end
  min == Float::INFINITY ? -1 : min
end
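For example:

shortest_subarray([2, -1, 2], 3)   # => 3
shortest_subarray([1, 2], 4)       # => -1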
Each item is added to the bag once and removed at most once, so the total time complexity is O(N).
Let’s use a solution bag to solve an easier problem: Sliding Window Maximum. This time we’ll keep a bag of numbers in non-increasing order.
- If i < j and nums[i] < nums[j], then no matter what window nums[i] is in, it won’t contribute to the maximum of that window, because any such window also contains nums[j]. In other words we only keep numbers with i < j && nums[i] >= nums[j].
- While sliding to the right, we need to remove the front of the bag if the number leaving the window equals it (the current maximum).
def max_sliding_window(nums, k)
  q, ans, i = [], [], 0
  while i < nums.size
    # First invariant: pop smaller numbers, they can never be a window maximum.
    q.pop while !q.empty? && q[-1] < nums[i]
    q << nums[i]
    # Second invariant: drop the front if the number leaving the window is the maximum.
    q.shift if i >= k && nums[i - k] == q[0]
    ans << q[0] if i >= k - 1                      # window [i-k+1, i] is complete
    i += 1
  end
  ans
end
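For example:

max_sliding_window([1, 3, -1, -3, 5, 3, 6, 7], 3)   # => [3, 3, 5, 5, 6, 7]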
Anyway, I hope you get the idea.