I'm trying to understand why the following solution for LeetCode's Single Number II works in Python:
class Solution:
def singleNumber(self, nums: List[int]) -> int:
number = 0
for i in range(32):
count = 0
for num in nums:
if num & (1 << i):
count += 1
if count % 3:
number |= (1 << i)
if number & (1 << 31):
number -= (1 << 32)
return number
But I'm confused about a few things:
In Python, integers are arbitrary precision, so they're not stored in 32-bit like in C or C++. sys.getsizeof() even shows 28 bytes for small integers. So how can we assume that the number fits in 32 bits and that bit 1 << 31 is actually the sign bit?
Why do we loop only from i in range(32)? Why not less
If I input small integers (like 2, 3, etc.), they don't "fill" 32 bits — so how can checking the 31st bit be reliable?
Basically, since Python ints grow as needed and aren’t stored in 32-bit containers, how does this approach still work correctly when simulating 32-bit signed behavior?
I tried understanding similar questions and answers (like on CodeReview.SE), and I get the general idea — Python ints are arbitrary precision and we simulate 32-bit behavior using bitmasking and shifting. But I'm still confused why this actually works reliably in Python.
My Questions:
Why can we safely assume a 32-bit simulation works in Python?
Why is checking the 31st bit (1 << 31) meaningful in Python?
Why doesn’t the arbitrary-size nature of Python integers break this logic?
Think of positive ints having infinitely many zero-bits in front and of negative ints having infinitely many one-bits in front:
3 = ...00000011
2 = ...00000010
1 = ...00000001
0 = ...00000000
-1 = ...11111111
-2 = ...11111110
-3 = ...11111101
-4 = ...11111100
That's how Python treats them, which is why for example (-3) & (1<<1000)
is 21000.
LeetCode's problem has numbers in the range -2^31 to 2^31-1. Let's say the 31 were 2 instead. Then the above -4 to 3 would be all possible values. You can see that for each number, all but the last two bits are always the same. Either they're all 0, or they're all 1. So it suffices to just find out the answer number's last three bits and duplicate the third-to-last bit to the infinite bits before.
Let's say the correct answer for some input were -3. Then we'd first find that the last three bits are 101
. Which represents the number 5. Then we check whether the third-to-last is 1
. Since it is, we subtract 23=8 and end up with 5-8 = -3.
With LeetCode's limit using 31, it likewise suffices to find out the answer number's last 32 bits (that's what the for i in range(32):
block does) and duplicate the 32nd-to-last bit to the infinite bits before (that's what the if number & (1 << 31): number -= (1 << 32)
does).