Will modern (2008/2010) incarnations of Visual Studio or Visual C++ Express produce x86 MUL instructions (unsigned multiply) in the compiled code? I cannot find or contrive an example where they appear, even when using unsigned types.
If VS does not compile using MUL, is there a rationale for that?
`imul` (signed) and `mul` (unsigned) both have a one-operand form that does `edx:eax = eax * src`, i.e. a 32x32b => 64b full multiply (or 64x64b => 128b).
186 added an `imul dest(reg), src(reg/mem), immediate` form, and 386 added an `imul r32, r/m32` form, both of which compute only the lower half of the result. (According to NASM's appendix B; see also the x86 tag wiki.)
When multiplying two 32-bit values, the least significant 32 bits of the result are the same whether you consider the values to be signed or unsigned. In other words, the difference between a signed and an unsigned multiply becomes apparent only if you look at the "upper" half of the result, which one-operand `imul`/`mul` puts in `edx` and two- or three-operand `imul` puts nowhere. Thus, the multi-operand forms of `imul` can be used on signed and unsigned values, and there was no need for Intel to add new forms of `mul` as well. (They could have made multi-operand `mul` a synonym for `imul`, but that would make disassembly output not match the source.)
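The claim about the two halves can be checked in plain C. The following is only a sketch: the function names are made up for illustration, and the signed version assumes the usual two's-complement conversion from `uint32_t` to `int32_t` that mainstream x86 compilers perform.

```c
#include <assert.h>
#include <stdint.h>

/* Full 32x32 -> 64 unsigned product, modelling one-operand mul (edx:eax). */
uint64_t full_mul_unsigned(uint32_t a, uint32_t b) {
    return (uint64_t)a * b;
}

/* Full 32x32 -> 64 signed product, modelling one-operand imul.
   The same bit patterns are reinterpreted as signed values
   (assumes two's-complement conversion, as on x86 compilers). */
uint64_t full_mul_signed(uint32_t a, uint32_t b) {
    int64_t p = (int64_t)(int32_t)a * (int32_t)b;
    return (uint64_t)p;
}

uint32_t low_half(uint64_t p)  { return (uint32_t)p; }          /* "eax" */
uint32_t high_half(uint64_t p) { return (uint32_t)(p >> 32); }  /* "edx" */

/* Hypothetical self-check: 0xFFFFFFFF is 4294967295 unsigned, -1 signed. */
void check_halves(void) {
    uint32_t a = 0xFFFFFFFFu, b = 2u;
    /* Low halves agree regardless of signedness. */
    assert(low_half(full_mul_unsigned(a, b)) == low_half(full_mul_signed(a, b)));
    /* High halves differ: 1 for the unsigned product, all ones for -2. */
    assert(high_half(full_mul_unsigned(a, b)) == 1u);
    assert(high_half(full_mul_signed(a, b)) == 0xFFFFFFFFu);
}
```

Only `high_half` distinguishes the two products, which is the part the multi-operand `imul` forms discard.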
In C, results of arithmetic operations have the same type as the operands (after integer promotion for narrow integer types). If you multiply two `int` values together, you get an `int`, not a `long long`: the "upper half" is not retained. Hence, the C compiler only needs what `imul` provides, and since `imul` is easier to use than `mul`, the C compiler uses `imul` to avoid needing `mov` instructions to get data into / out of `eax`.
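As a small illustration of why the upper half never matters here, consider the two functions below (the names are hypothetical). Both return the operand type, so the product is truncated to the operand width. A compiler is free to use the same two-operand `imul` for both; to see what a given compiler actually emits, inspect its assembly listing (for example `/FA` with MSVC or `-S` with GCC).

```c
#include <assert.h>

/* unsigned * unsigned yields unsigned: the result wraps modulo 2^N,
   so only the low half of the full product is ever needed. */
unsigned mul_unsigned(unsigned a, unsigned b) {
    return a * b;
}

/* int * int yields int: again only the low half is kept
   (signed overflow is undefined behavior in C, so valid programs
   never observe the discarded upper half). */
int mul_signed(int a, int b) {
    return a * b;
}

/* Hypothetical self-check of the truncating behavior. */
void check_truncation(void) {
    assert(mul_unsigned(~0u, 2u) == ~0u - 1u);  /* wraps around */
    assert(mul_signed(-3, 7) == -21);
}
```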
In turn, since C compilers use the multiple-operand form of `imul` a lot, Intel and AMD invest effort into making it as fast as possible. It writes only one output register, not `e/rdx:e/rax`, so CPUs can optimize it more easily than the one-operand form. This makes `imul` even more attractive.
The one-operand form of `mul`/`imul` is useful when implementing big number arithmetic. In C, in 32-bit mode, you should get some `mul` invocations by multiplying `unsigned long long` values together. But, depending on the compiler and OS, those `mul` opcodes may be hidden in some dedicated function, so you will not necessarily see them. In 64-bit mode, `long long` has only 64 bits, not 128, and the compiler will simply use `imul`.
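To make the 32-bit case concrete, here is a sketch of how a 64x64 => 64 multiply decomposes into 32-bit pieces. The function names are invented for this example, and whether a particular compiler inlines this pattern or calls a runtime helper is not something the C source controls. Only the low-by-low term needs a full 32x32 => 64 product, which is where a one-operand `mul` is the natural fit; the cross terms need only their low 32 bits.

```c
#include <assert.h>
#include <stdint.h>

/* Widening multiply: the simplest C expression that asks for a full
   32x32 -> 64 unsigned product. */
uint64_t widen_mul(uint32_t a, uint32_t b) {
    return (uint64_t)a * b;
}

/* 64x64 -> 64 multiply built from 32-bit halves, roughly the work a
   32-bit target has to do for unsigned long long multiplication. */
uint64_t mul64_from_32bit_parts(uint64_t a, uint64_t b) {
    uint32_t a_lo = (uint32_t)a, a_hi = (uint32_t)(a >> 32);
    uint32_t b_lo = (uint32_t)b, b_hi = (uint32_t)(b >> 32);

    /* Needs all 64 bits of the product. */
    uint64_t lo_lo = widen_mul(a_lo, b_lo);

    /* Cross terms are shifted into the upper 32 bits, so only their low
       32 bits matter. Wraps modulo 2^32 (assumes uint32_t is unsigned int,
       as on mainstream x86 compilers). */
    uint32_t cross = a_lo * b_hi + a_hi * b_lo;

    /* a_hi * b_hi would be shifted left by 64 bits and vanishes entirely. */
    return lo_lo + ((uint64_t)cross << 32);
}

/* Hypothetical self-check against the native 64-bit multiply. */
void check_mul64(void) {
    uint64_t x = 0x123456789ABCDEF0ull, y = 0x0FEDCBA987654321ull;
    assert(mul64_from_32bit_parts(x, y) == x * y);
}
```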