I have this simple C++ program with unexpected output:
#include<random>
#include<iostream>
#include "boost/random/mersenne_twister.hpp"
#include "boost/random/uniform_int_distribution.hpp"
int main() {
    std::cout << sizeof(std::mt19937) << std::endl;
    std::cout << sizeof(std::mt19937_64) << std::endl;
    std::cout << sizeof(boost::random::mt19937) << std::endl;
    std::cout << sizeof(boost::random::mt19937_64) << std::endl;
}
Output:
5000
2504
2504
2504
What I find interesting is that sizeof for the standard implementation of mt19937 (the 32-bit one) is around 2x that of the Boost version, while the 64-bit ones match exactly.
Since MT uses a lot of space, this is not a small difference.
It is also weird that implementations of a strictly specified algorithm would have such different sizes; we are not talking about std::string, where implementers might pick different SSO buffer sizes...
My best guess would be that Boost either has a bug or implements a slightly different version of mt19937, but Wikipedia says this, suggesting Boost might be right:
Relatively large state buffer, of 2.5 KiB,
Edit: both the Boost and std versions seem to satisfy the requirement that the 10000th generated value is 4123659995, so there seems to be no bug in Boost.
This is the standard definition:
mersenne_twister_engine<
uint_fast32_t, // element of the buffer
32,
624, // size of the buffer
397, 31,
0x9908b0df, 11,
0xffffffff, 7,
0x9d2c5680, 15,
0xefc60000, 18, 1812433253>
The issue is that GNU chose to make std::uint_fast32_t a 64-bit type on 64-bit systems (whether that is a good or bad choice is a separate discussion). Thus, the buffer is twice as big as one would expect if it contained 32-bit integers: 624 × 8 = 4992 bytes instead of 624 × 4 = 2496 bytes, with the remaining few bytes of each sizeof presumably going to the position index.
This is the Boost definition:
mersenne_twister_engine<
uint32_t,
32,
624,
397, 31,
0x9908b0df, 11,
0xffffffff, 7,
0x9d2c5680, 15,
0xefc60000, 18, 1812433253>
Which is identical, except for using a fixed-width element type, which is always the same size on all systems.
You can use std::mersenne_twister_engine directly with std::uint_least32_t as the element type to get around this issue. This alias is preferable to the fixed-width std::uint32_t because std::uint_least32_t is required to exist on all systems, whereas std::uint32_t is optional.