Today I thought: well, even if there is great suspicion on RDRAND implementation of NIST SP 800-90A, it is still a hardware implementation of pseudo-random number generator (PRNG) that must be good enough for non-sensitive applications. So I thought of using it on my game instead of Mersenne Twister.
So, to see if there was any performance gain on using the instruction, I compared the time of the two following codes:
// test.cpp
#include <cstdio>
int main()
{
unsigned int rnd = 0;
for(int i = 0; i < 10000000; ++i) {
__builtin_ia32_rdrand32_step(&rnd);
}
printf("%x\n", rnd);
}
and
//test2.cpp
#include <cstdio>
#include <random>
int main()
{
unsigned int rnd = 0;
__builtin_ia32_rdrand32_step(&rnd);
std::mt19937 gen(rnd);
for(int i = 0; i < 10000000; ++i) {
rnd ^= gen();
}
printf("%x\n", rnd);
}
and by running the two I get:
$ time ./test
d230449a
real 0m0.361s
user 0m0.358s
sys 0m0.002s
$ time ./test2
bfc4e472
real 0m0.051s
user 0m0.050s
sys 0m0.002s
So, Mersenne Twister is much faster than RDRAND on my CPU. Well, I was disappointed, ruled out from my game. But RDRAND is a cryptographically secure PRNG (CSPRNG), so it does much behind the scenes... more fair would be compare it to other CSPRNG. So I took my Rabbit implementation (plain translation of the RFC to C, no fancy tricks for performance), and wrote the following test:
// test3.cpp
#include <cstdio>
extern "C"
{
#include "rabbit.h"
}
int main()
{
rabbit_state s;
unsigned long long buf[2];
__builtin_ia32_rdrand64_step(&buf[0]);
__builtin_ia32_rdrand64_step(&buf[1]);
rabbit_init_key(&s, (uint8_t*)&buf[0]);
for(int i = 0; i < 10000000; ++i) {
rabbit_extract(&s, (uint8_t*)&buf[0]);
}
printf("%llx\n", buf[0]);
}
And for my surprise, generating twice as much pseudo-random data as the first two of them, I got a better time than RDRAND:
$ time ./test3
8ef9772277b70aba
real 0m0.344s
user 0m0.341s
sys 0m0.002s
All three were compiled with optimization enabled.
So, we have a widespread paranoia that RDRAND was made to embed NSA backdoors into everybody's software cryptography. Also we have at least one software CSPRNG faster than RDRAND, and the most widely used decent PRNG, Mersenne Twister, is much faster than RDRAND. Finally, we have open-source auditable software entropy pools, like /dev/random
and /dev/urandom
, that are not hidden behind twofold scrambler layers of AES, like RDRAND.
So, the question: should people be using RDRAND? Is there any legitimate use for it? Or should we stop using it altogether?
As pointed out in the other answer, RDRAND is seeded with true randomness. In particular, it frequently reseeds its internal CSPRNG with 128 bits of hardware-generated randomness, guaranteeing a reseed at least once every 511 * 128 bits. See section 4.2.5 of this doc:
So in your examples, you used a single 128-bit seed to generate 10 million random draws from rabbit_extract. In the RDRAND version, you had the equivalent of 2.5 million 128-bit draws, meaning that the CSPRING was reseeded at least 2,500,000/511 = 4,892 times.
So instead of 128 bits of entropy going into your rabbit example, there were at least 4,892*128 = 626,176 bits of entropy going into the RDRAND example.
That's much, much more entropy than you're going to get in 0.361 seconds without hardware support. That could matter if you're doing stuff where lots of real randomness is important. One example is Shamir secret sharing of large quantities of data -- not sure if there are others.
So in conclusion -- it's not for speed, it's for high security. The question of whether it's backdoored is troubling, of course, but you can always XOR it with other sources, and at the very least it's not hurting you.