In comparing bernoulli_distribution's default constructor (a 50/50 chance of true/false) and uniform_int_distribution{0, 1} (an equally likely chance of 0 or 1), I find that bernoulli_distribution is at least 2x and up to 6x slower than uniform_int_distribution, despite the fact that they give equivalent results.
I would expect bernoulli_distribution to perform better, since it is specifically designed for the case of only two outcomes, true or false; yet it doesn't.
Given the above and the performance metrics below, are there practical uses of bernoulli_distribution over uniform_int_distribution?
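For reference, these are the two draws being compared (the same constructions used in the profiling code below):

#include <iostream>
#include <random>

int main() {
    auto gen = std::mt19937{std::random_device{}()};

    auto bd = std::bernoulli_distribution{};            // default: p = 0.5, yields true/false
    auto ud = std::uniform_int_distribution<int>{0, 1}; // yields 0 or 1, equally likely

    std::cout << std::boolalpha << bd(gen) << ' ' << ud(gen) << '\n';
}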
Results over 5 runs (Release mode, x64). (See the edit below for Release runs without the debugger attached.)
bernoulli: 58 ms
false: 500690
true: 499310
uniform: 9 ms
1: 499710
0: 500290
----------
bernoulli: 57 ms
false: 500921
true: 499079
uniform: 9 ms
0: 499614
1: 500386
----------
bernoulli: 61 ms
false: 500440
true: 499560
uniform: 9 ms
0: 499575
1: 500425
----------
bernoulli: 59 ms
true: 498798
false: 501202
uniform: 9 ms
1: 499485
0: 500515
----------
bernoulli: 58 ms
true: 500777
false: 499223
uniform: 9 ms
0: 500450
1: 499550
----------
Profiling code:
#include <chrono>
#include <random>
#include <iostream>
#include <unordered_map>

int main() {
    // Time 1,000,000 draws from a default-constructed bernoulli_distribution.
    auto gb = std::mt19937{std::random_device{}()};
    auto bd = std::bernoulli_distribution{};
    auto bhist = std::unordered_map<bool, int>{};
    auto start = std::chrono::steady_clock::now();
    for(int i = 0; i < 1'000'000; ++i) {
        bhist[bd(gb)]++;
    }
    auto end = std::chrono::steady_clock::now();
    auto dif = std::chrono::duration_cast<std::chrono::milliseconds>(end - start);
    std::cout << "bernoulli: " << dif.count() << " ms\n";
    std::cout << std::boolalpha;
    for(auto& b : bhist) {
        std::cout << b.first << ": " << b.second << '\n';
    }
    std::cout << std::noboolalpha;
    std::cout << '\n';

    // Time 1,000,000 draws from uniform_int_distribution{0, 1}.
    auto gu = std::mt19937{std::random_device{}()};
    auto u = std::uniform_int_distribution<int>{0, 1};
    auto uhist = std::unordered_map<int, int>{};
    start = std::chrono::steady_clock::now();
    for(int i = 0; i < 1'000'000; ++i) {
        uhist[u(gu)]++;
    }
    end = std::chrono::steady_clock::now();
    dif = std::chrono::duration_cast<std::chrono::milliseconds>(end - start);
    std::cout << "uniform: " << dif.count() << " ms\n";
    for(auto& b : uhist) {
        std::cout << b.first << ": " << b.second << '\n';
    }
    std::cout << '\n';
}
EDIT
I re-ran the test without the debugger attached and bernoulli_distribution still ran a good 4x slower:
bernoulli: 37 ms
false: 500250
true: 499750
uniform: 9 ms
0: 500433
1: 499567
-----
bernoulli: 36 ms
false: 500595
true: 499405
uniform: 9 ms
0: 499061
1: 500939
-----
bernoulli: 36 ms
false: 500988
true: 499012
uniform: 8 ms
0: 499596
1: 500404
-----
bernoulli: 36 ms
true: 500425
false: 499575
uniform: 8 ms
0: 499974
1: 500026
-----
bernoulli: 36 ms
false: 500847
true: 499153
uniform: 8 ms
0: 500082
1: 499918
-----
Some comments and answers suggest using uniform_real_distribution instead.
I tested uniform_real_distribution(0.0f, nextafter(1.0f, 20.f)) (to account for uniform_real_distribution producing values from a half-open range) against bernoulli_distribution, and bernoulli_distribution is faster by about 20%-25% regardless of the probability. It also gave more correct results: with a true probability of 1.0, my implementation using the above uniform_real_distribution values occasionally gave false negatives (granted, one or two out of five one-million-iteration runs), while bernoulli_distribution correctly gave none.
So, speed-wise: bernoulli_distribution is faster than uniform_real_distribution but slower than uniform_int_distribution.
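For reference, the uniform_real_distribution check I tested was along these lines (a reconstruction rather than the exact code; the helper name and comparison direction are illustrative):

#include <cmath>
#include <random>

// Rough sketch of a urd-based yes/no check; the exact implementation may have differed.
bool IsPercentChanceUrd(std::mt19937& gen, float probability) {
    // Upper bound nudged past 1.0f because uniform_real_distribution draws from the
    // half-open range [a, b). With this bound a value of exactly 1.0f can occur, and
    // "value < 1.0f" then fails, which would explain the rare false negatives above.
    auto urd = std::uniform_real_distribution<float>{0.0f, std::nextafter(1.0f, 20.f)};
    return urd(gen) < probability;
}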
Long story short: use the right tool for the job, don't reinvent the wheel, the STL is well built, etc., and depending on the use case one is better than the other.
For a yes/no probability check (IsPercentChance(float probability)), bernoulli_distribution is faster and better.
For a pure "give me a random bool value", uniform_int_distribution is faster and better.
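A rough sketch of both, with illustrative helper names (only IsPercentChance comes from the use case above):

#include <random>

// Yes/no with a given probability of true: what bernoulli_distribution is designed for.
bool IsPercentChance(std::mt19937& gen, float probability) {
    return std::bernoulli_distribution{probability}(gen);
}

// Plain 50/50 random bool: uniform_int_distribution{0, 1} was the faster option in the
// measurements above.
bool RandomBool(std::mt19937& gen) {
    return std::uniform_int_distribution<int>{0, 1}(gen) == 1;
}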