Let's take the example of sign-extending a 16-bit signed number into a 32-bit register, such as mov $+/-5, %ax
movswl %ax, %ebx
.
There are two possible cases:
High bit is zero (number is positive). This is very easy to understand and intuitive. For example, if I have the number 5
, left-padding with zeros is very easy to understand. For example:
00000000 00000101 # 5 (represented in 16 bits)
00000000 00000000 00000000 00000101 # 5 (represented in 32 bits)
However, the tricky thing for me to understand is when it's a negative number and we sign-extend. Example:
11111111 11111011 # -5 (represented in 16 bits)
11111111 11111111 11111111 11111011 # -5 (represented in 32 bits)
Yes, I know that we just fill the upper bits with 1
. But what makes that work? Perhaps sort of an explanation on what 'properties' of the binary number makes that possible would help me better understand this.
For an n+1
-bit 2's complement number:
-(2^n)
2^(n-1)
, and so on (normal binary place value)For example, in 8-bit 2's complement, the bit-pattern with just the MSB set represents a value of -128 = -(2^7)
. With the top two bits set, it represents -128 + 64 = -64
.
When we extend by 1 bit, the original sign bit is now a "regular" bit with place value +(2^n)
instead of -(2^n)
, so the value represented by the existing bits is now 2^n + 2^n = 2^(n+1)
higher than the original value. (Or the same if that bit was zero).
The new sign bit has a place value of -(2^(n+1))
, so copying the original sign bit is exactly what we need to balance that change in place-value. (Or leaves it unchanged if zero).
The procedure for one bit of course generalizes by repetition for any number of bits.
(Normally we'd use n
= number of total bits, like n=8 instead of n=7 for 8-bit 2's complement. Then the MSB's place value is -2^(n-1)
, and changing its meaning by extending with a new sign bit adds 2^n
if it was set.)
See wikipedia for more about how the bits represent values: https://en.wikipedia.org/wiki/Two%27s_complement#Converting_from_two's_complement_representation - the 2's complement article is quite good but doesn't go into detail about why copying the sign bit works.
You could also try it on paper for some small examples, like sign-extending from 4 to 5 bits. -1
(all-ones) would be a good value to start with, making the math simple. Or 0b1000
(-8) is another good choice.
Google found https://andybargh.com/binary-sign-extension/ which works through one 8-bit example.