algorithmdata-structuresxor

Minimum length subarray such that it has a subsequence with an xor sum of 0


I'm stuck on this one problem from a competitive programming archive which says:

Given an array of integers A of length n, we need to find the minimum length
subarray such that it has a subsequence with an xor sum of 0. (Subarray is a
contiguous segment of an array and subsequence is a sequence that can be
derived from a subarray by deleting zero or more elements.) The resulting
subarray must have a length of at least 1. (Empty subarray cannot be an answer.)

1 <= n <= 10^5
1 <= A[i] <= 10^9

Since the constraints are huge, the typical Dynamic Programming approach doesn't seem to work. I've read about doing Gaussian Elimination mod 2, but I don't see how it is applicable to this question. Since there can be exponential number of solutions to that System of Equations, picking the minimum length subarray shouldn't be so easy to do. I've also considered doing a Sliding Window approach but I don't know how adding the next number to our data structure and removing the first numbers can be supported. A bitset approach similar to subset sum problem wouldn't work either as well because numbers are up to 10^9 and we can't use up that much memory.

These are all the things that I tried to solve this problem. Any ideas and resources are welcome. If there is an efficient algorithm for this task, please let me know.


Solution

  • Take any length-30 subarray. Every A[i] < 230, so there are two possibilities:

    1. The 30 integers are linearly (with xor) independent, in which case you can make any 30-bit integer, including whichever one is next in the array, by xoring together the appropriate subsequence; or
    2. The 30 integers are not linearly independent, in which case you can make 0 by xoring together the appropriate non-empty subsequence.

    In either case, the subarray plus the next element has a subsequence that xors to 0. The length of the smallest subarray with such a subsequence is therefore always <= 31, unless the array is short and has no such subsequence at all.

    With the constraint that N < 105, you've got plenty of time to try Gaussian elimination on every subarray with length < 31.

    Remember: you don't actually have to find a subsequence that xors to zero. You just need to use Gaussian elimination to see if the integers in the subarray are linearly independent.

    Here's an example implementation in python (for clarity, even though python is slow):

    def smallestSubarrayLength(A):
        rows = [0]*32
        bestLength = None
    
        # add rows starting at start to a Gaussian elimination
        for start in range(len(A)):
            n=1
            rows[0] = A[start]
            for i in range(1,min(32,len(A)-start)):
                newrow = A[start+i]
                for j in range(n):
                    # rows will be in descending order,
                    # and each entry will have a different high bit
                    oldrow = rows[j]
                    if (newrow > oldrow):
                        rows[j] = newrow
                        newrow = min(oldrow, newrow ^ oldrow)
                    else:
                        newrow = min(newrow, newrow ^ oldrow)
                rows[n] = newrow
                n += 1
                if newrow == 0:
                    # The rows are not linearly independent
                    if bestLength == None or n < bestLength:
                        bestLength = n
                    break
        return bestLength