c++algorithmdata-structuresbit-manipulationarray-algorithms

Given an array find all the elements that appear thrice except one element appearing once: how does the optimal approach using bit-manipulation work


I came across LeetCode problem 137. Single Number II:

Given an integer array nums where every element appears three times except for one, which appears exactly once. Find the single element and return it.

You must implement a solution with a linear runtime complexity and use only constant extra space.

Example 1:

Input: nums = [2, 2, 3, 2]
Output: 3

Example 2:

Input: nums = [0,1,0,1,0,1,99]
Output: 99

Constraints:

  • 1 <= nums.length <= 3 * 104
  • āˆ’231 <= nums[i] <= 231āˆ’1

Solution using bit-manipulation

I came across an efficient solution using bit-manipulation. It has a time complexity of O(n), and a space complexity of O(1):

class Solution {
public:
    int singleNumber(vector<int>& nums) {        
        // Two buckets
        int ones = 0, twos = 0;
        // Traverse the array 
        for(int i=0; i < nums.size(); i++) {
            // Add the number to Ones, it is not in Twos
            ones = (ones ^ nums[i]) & ~twos;
            // Add the number to Twos, if it is already in Ones
            twos = (twos ^ nums[i]) & ~ones;
        }
        return ones;
    }
};

It uses a concept of buckets, i.e. grouping elements as you iterate through the array and putting them into their corresponding buckets, such that when you come across the first occurrence of an element it will go into the first bucket, and on the second occurrence it goes into the second bucket while removing it from the first.

Question

The above insight is as far as I got -- I don't fully understand the logic behind that algorithm.

Suppose there's a number 3 that we come across in the array, and initially ones=0, twos=0. Then for its first occurrence we would get:

((0^nums[i]) & ~twos)

which would put this element into the ones bucket.

Now suppose we come across the second occurrence of an element 3 in the array. Then we get the following:

 ((twos^nums[i]) & ~ones)

where twos=0, nums[i]=3.

Problem: If it's present in the array wouldn't the corresponding bit in the ones bucket be set to one?

The negation of ones would give 0

~1=0

And we know for any number x:

(x & 0)==0

Since any number ANDED with F gives F, wouldn't that mean that on the second occurrence the number 3 is not added to the twos bucket?

What am I missing in my understanding of the problem?

Note that I don't look for yet another solution to the problem itself, I just want to fully understand the bit-manipulation algorithm.


Solution

  • Here are some observations:

    This means that if we can solve this for one bit (like 20), we can do it for all.

    Let's do it just for one specific bit: the value 1. The code starts to make more sense as we consider nums[i] to be just 1, and ones and twos are by consequence either 0 or 1 (the two possibilities for that particular bit).

    The assignment to ones will be 1 if twos is 0 (its initial value). If however, twos is 1, then ones remains unchanged (0). Note that ~1 is a bit pattern that has all bits set to 1, except the least-significant bit.

    The assignment to twos will be 0 if ones is 1, but will be 1 if ones was in the mean time back to 0. The latter will occur at a second occurrence of our value.

    At the third occurrence of the value, both ones and twos will be back to 0, i.e. their initial values.

    This sequence of changes at each processing of a 1-value, leads to the following cycle that has a period of 3:

    occurrence of 1 effect on ones effect on twos
    1st 1 0
    2nd 0 1
    3rd 0 0
    4th 1 0
    5th 0 1
    6th 0 0
    ... ... ...

    We can see that ones is 1 when we have one more occurrence than a multiple of 3. This is exactly what we need to know.

    Finally, as we do this for all bit positions in parallel, the 1-bits that are found in ones represent the 1-bits of the number that occurred just once.

    I hope this clarifies it.