When implementing bignums on x86, obviously the most efficient choice for digit size is 32 bits. However, you need arithmetic up to twice the digit size (i.e. 32+32=33, 32*32=64, 64/32=32). Fortunately, not only does x86 provide this, but it's also accessible from portable C (uint64_t
).
Similarly, on x64 it would be desirable to use 64-bit digits. This would require 128 bit arithmetic (i.e. 64+64=65, 64*64=128, 128/64=64). Fortunately, x64 provides this. Unfortunately, it's not accessible from portable C, though obviously one could dip into assembly.
So my question is whether it's accessible from non-portable C. Do any C compilers on x64 provide access to this, and if so, what's the syntax?
(Note that I'm not talking about 128-bit vectors that are strictly treated as collections of 32 or 64-bit words with no carry propagation between them, but about actual 128-bit integer operations.)
GCC 4.1 introduced initial 128-bit integer support with the __int128_t
and __uint128_t
built-in types but 128-bit type was officially released since GCC 4.6 as __int128
/ unsigned __int128
Clang also supports those types although I don't know since when. The first version on Godbolt (3.0.0) does support __int128_t
though
ICC gained the same support since version 13.0.0: 128-bit integers supporting +, -, *, /, and % in the Intel C Compiler?
See also
If you're on MSVC then there's no direct support for a 128-bit type but there are many intrinsics helping you do 128-bit operations:
64*64→128: _mul128()
, _umul128()
, __mulh()
, __umulh()
128/64→64: _div128()
, _udiv128()
64+64→65: The carry in an addition can be easily obtained by comparing the low part of the sum with any of the operands:
struct uint128 {
uint64_t H, L;
};
inline uint128 add(uint128 a, uint128 b)
{
uint128 c;
c.L = a.L + b.L; // add low parts
c.H = a.H + b.H + (c.L < a.L); // add high parts and carry
return c;
}
64-64→65: We can use the same method for subtraction as in addition
inline uint128 sub(uint128 a, uint128 b)
{
uint128 c;
c.L = a.L - b.L;
c.H = a.H - b.H - (c.L > a.L);
return c;
}
There are also intrinsics for shifting although implementing these is trivial: __shiftleft128()
, __shiftright128()
If you're on an unsupported compiler then just use some fixed-width types from many available libraries, that would be much faster. For example ttmath:UInt<4>
(a 128-bit int type with four 32-bit limbs), or (u)int128_t
in Boost.Multiprecision and calccrypto/uint128_t. An arbitrary-precision arithmetic library like GMP is just too costly for this. One example: Optimization story: Switching from GMP to gcc's __int128
reduced run time by 95%