I'm stuck on this one problem from a competitive programming archive which says:
Given an array of integers A of length n, we need to find the minimum length
subarray such that it has a subsequence with an xor sum of 0. (Subarray is a
contiguous segment of an array and subsequence is a sequence that can be
derived from a subarray by deleting zero or more elements.) The resulting
subarray must have a length of at least 1. (Empty subarray cannot be an answer.)
1 <= n <= 10^5
1 <= A[i] <= 10^9
Since the constraints are huge, the typical Dynamic Programming approach doesn't seem to work. I've read about doing Gaussian Elimination mod 2, but I don't see how it is applicable to this question. Since there can be exponential number of solutions to that System of Equations, picking the minimum length subarray shouldn't be so easy to do. I've also considered doing a Sliding Window approach but I don't know how adding the next number to our data structure and removing the first numbers can be supported. A bitset approach similar to subset sum problem wouldn't work either as well because numbers are up to 10^9 and we can't use up that much memory.
These are all the things that I tried to solve this problem. Any ideas and resources are welcome. If there is an efficient algorithm for this task, please let me know.
Take any length-30 subarray. Every A[i] < 230, so there are two possibilities:
In either case, the subarray plus the next element has a subsequence that xors to 0. The length of the smallest subarray with such a subsequence is therefore always <= 31, unless the array is short and has no such subsequence at all.
With the constraint that N < 105, you've got plenty of time to try Gaussian elimination on every subarray with length < 31.
Remember: you don't actually have to find a subsequence that xors to zero. You just need to use Gaussian elimination to see if the integers in the subarray are linearly independent.
Here's an example implementation in python (for clarity, even though python is slow):
def smallestSubarrayLength(A):
rows = [0]*32
bestLength = None
# add rows starting at start to a Gaussian elimination
for start in range(len(A)):
n=1
rows[0] = A[start]
for i in range(1,min(32,len(A)-start)):
newrow = A[start+i]
for j in range(n):
# rows will be in descending order,
# and each entry will have a different high bit
oldrow = rows[j]
if (newrow > oldrow):
rows[j] = newrow
newrow = min(oldrow, newrow ^ oldrow)
else:
newrow = min(newrow, newrow ^ oldrow)
rows[n] = newrow
n += 1
if newrow == 0:
# The rows are not linearly independent
if bestLength == None or n < bestLength:
bestLength = n
break
return bestLength