I'm reading Computer Systems: A Programmer's Perspective and the homework was to describe how this algorithm works.
C function:
void store_prod(__int128 *dest, int64_t x, int64_t y) {
*dest = x * (__int128)y;
}
Assembly:
movq %rdx, %rax
cqto
movq %rsi, %rcx
sarq $63, %rcx
imulq %rax, %rcx
imulq %rsi, %rdx
addq %rdx, %rcx
mulq %rsi
addq %rcx, %rdx
movq %rax, (%rdi)
movq %rdx, 8(%rdi)
ret
I don't know why it performs: xh * yl + yh * xl = value which we add after unsigned multiplication
What GCC is doing is using the property that signed multiplication can be done using the following formula.
(hi,lo) = unsigned(x*y)
hi -= ((x<0) ? y : 0) + ((y<0) ? x : 0)
Despite the fact that there is no need to do this since in this case the x86-64 instruction set has a signed 64-bit*64-bit to 128-bit instruction (imul
with one operand) this formula is useful in other cases. For example for implementing signed 128-bit multiplication with SSE2/AVX2/AVX512 or for implementing 256-bit multiplication when the instruction set only does 128-bit multiplication (such as with x86-64).
GCC implemented this formula a bit differently though. If we take the sign bit and extend it to the whole word, call this function sign_ext
, then the function returns -1
or 0
. Then what GCC did is:
hi += sign_ext(x)*y + sign_ext(y)*x
for example sign_ext(x)*y
in pseudo-instructions for 64-bit words is
sarq $63, x ; sign_ext(x)
imulq y, x ; sign_ext(x)*y
So now you ask (or meant to ask):
Why is this formula true?
That's a good qeustion. I asked this same question as well and njuffa wrote
@Zboson: It follows directly from two's complement complement representation. E.g. 32-bit integers
-n
and-m
are represented as unsigned numbersx=2**32-n, y=2**32-m
. If you multiply those you havex*y = 2**64 - 2**32*n - 2**32*m + n*m
. The middle terms indicate the necessary corrections to the upper half of the product. Working through a simple example using -1*-1 should prove very instructive.