During a coding challenge I encountered a variation of Leetcode 209. Minimum Size Subarray Sum, where the array contains not only positive numbers but negative ones, in which case, sliding window solution is no longer applicable as the structure is no longer monotonic, and I read that prefix sum with bisect can be used to solve it by complexity of O(nlogn).
However, I failed to figure out where can bisect come into play. If I make no mistake, our goal is to, for each index i, find the largest j >= i, such that sum( nums[j: i + 1] >= target )
, we can translate it into prefixSum[i] - prefixSum[j] >= target
hence get a constant overhead, but bisect how? shouldn't bisect ask for a monotonic structure too? I am totally lost here albeit think really hard.
Please note this question is not a duplicate to this question, where the array contains only positive values, i.e., the original Leetcode question.
UPDATE:
I managed to find an example below and did some tests using randomly generated testcases and found it might be correct:
import bisect
def shortest_subarray_sum_at_least_k(nums, target):
n = len(nums)
prefix = [0] * (n + 1)
for i in range(n):
prefix[i + 1] = prefix[i] + nums[i]
min_len = float('inf')
sorted_prefix = []
for i in range(n + 1):
target_prefix = prefix[i] - target
# Find the largest j where prefix[j] <= target_prefix
left = 0
right = len(sorted_prefix)
while left < right:
mid = (left + right) // 2
if sorted_prefix[mid][0] <= target_prefix:
left = mid + 1
else:
right = mid
if left > 0:
j = sorted_prefix[left - 1][1]
min_len = min(min_len, i - j)
# Insert current prefix[i] into sorted_prefix
# To maintain sorted order and keep track of indices
# We can use a list of tuples (prefix_val, index)
bisect.insort(sorted_prefix, (prefix[i], i))
return min_len if min_len != float('inf') else -1
Shamingly I still cannot get how does it work, in particular, I don't get the usage of sorted_prefix
which I believe is the core of the solution that enables bisection. Could anyone kindly teach what is the logic behind it?
As noted, the "solution" that you found, isn't.
The trick you need is to keep track of the points in the array where the prefix sum was smaller than it has been since. This list can be maintained as you go through the list at a cost of O(1)
per element. You can search this list in O(log(n))
to find the shortest subarray ending at this element that meets the target sum condition. And then maintain a minimum of that to get your answer in O(n log(n))
.
Here is hopefully working code for this algorithm.
import bisect
def shortest_subarray_sum_exceeds_k(nums, k):
n = len(nums)
# Step 1: Compute prefix sums
prefix = [0]
for num in nums:
prefix.append(prefix[-1] + num)
# Step 2: Maintain two parallel lists for stack
prefix_sums = [] # strictly increasing
indices = [] # corresponding indices
min_len = float('inf')
for j, current_sum in enumerate(prefix):
# Binary search to find the earliest i where prefix_sum < current_sum - k
target = current_sum - k
i = bisect.bisect_left(prefix_sums, target + 1)
if i > 0:
min_len = min(min_len, j - indices[i - 1])
# Maintain increasing order of prefix_sums
while prefix_sums and prefix_sums[-1] >= current_sum:
prefix_sums.pop()
indices.pop()
prefix_sums.append(current_sum)
indices.append(j)
return min_len if min_len != float('inf') else -1