c++

Fast way to avoid modulo bias


I'm doing a shuffle and it gets done very often on a small array, anywhere from 1 to 10 elements.

I've tried the accepted answer in this question:

Is this C implementation of Fisher-Yates shuffle correct?

Unfortunately it's extremely slow.

I need a faster way of doing this and avoiding modulo bias which I'm seeing. Any suggestions?

EDIT: Sorry, I should point out that it's not the shuffle that's slow; it's the function used to generate a random int in a range, i.e. rand_int(). I'm using a Mersenne Twister algorithm, and RAND_MAX in my case is UINT_MAX to help out. This of course makes it slower when n is much smaller than RAND_MAX.

I've also found 2 implementations of a rand_int type function.

static int rand_int(int n) {
  int limit = RAND_MAX - RAND_MAX % n;
  int rnd;

  do {
    rnd = rand();
  } while (rnd >= limit);
  return rnd % n;
}

The following is much much faster. But, does it avoid the modulo bias problem?

int rand_int(int limit) {

    int divisor = RAND_MAX/(limit);
    int retval;

    do { 
        retval = rand() / divisor;
    } while (retval > limit);

    return retval;
}

Solution

  • Edit

    To address the basic question on avoiding the modulo bias with rand() see https://web.archive.org/web/20180801210127/http://eternallyconfuzzled.com/arts/jsw_art_rand.aspx.

    In short, you can't get a truly uniform result other than by skipping the random numbers that fall outside the usable domain [1]. The article lists some formulae that give a smaller bias (e.g. int r = rand() / (RAND_MAX / N + 1)) without sacrificing more performance.

    [1] See Java's implementation of Random.nextInt(int): http://web.archive.org/web/20111112121235/http://download.oracle.com/javase/1.4.2/docs/api/java/util/Random.html#nextInt(int)
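
    The "skip the out-of-domain numbers" idea can be sketched as below. This is only a minimal illustration, not the article's code: the name bounded_rand is hypothetical, and it assumes a 32-bit std::mt19937 engine instead of rand().

```cpp
#include <cstdint>
#include <random>

// Hypothetical helper (not from the linked article): returns a value in
// [0, n) with no modulo bias by rejecting the few draws that would make
// some remainders more likely than others. Requires n > 0.
std::uint32_t bounded_rand(std::mt19937 &rng, std::uint32_t n)
{
    // std::mt19937 produces 2^32 equally likely values.
    // (2^32 - n) % n equals 2^32 % n and is computed without overflow.
    const std::uint32_t threshold = static_cast<std::uint32_t>(0u - n) % n;

    for (;;) {
        const std::uint32_t r = static_cast<std::uint32_t>(rng());
        // Accepting only r >= threshold leaves a count of values that is
        // an exact multiple of n, so r % n is uniform.
        if (r >= threshold)
            return r % n;
    }
}
```

    With n at most 10, fewer than 10 of the 2^32 possible draws are rejected, so the loop almost always finishes on its first pass.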


    Using C++

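    You should be able to use std::random_shuffle (from the <algorithm> header). Note that it was deprecated in C++14 and removed in C++17; std::shuffle, which takes a generator object, replaces it.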

    If you must roll your own shuffle implementation, I suggest using the <random> library (TR1, C++0x or Boost). It comes with a number of generators and distributions, with varying performance characteristics.

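    #include <random>

    std::mt19937 rng(seed); // seed: any integer, e.g. from std::random_device
    // Both bounds are inclusive, so use N - 1 to get N distinct values (0 .. N-1).
    std::uniform_int_distribution<int> gen(0, N - 1); // uniform, unbiased

    int r = gen(rng);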
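
    If you do roll your own shuffle, a Fisher-Yates loop on top of <random> might look like the sketch below. The name fisher_yates is hypothetical, and the choice of std::vector and std::mt19937 is an assumption made for illustration.

```cpp
#include <cstddef>
#include <random>
#include <utility>
#include <vector>

// Hypothetical helper: Fisher-Yates shuffle driven by a <random> engine.
template <typename T>
void fisher_yates(std::vector<T> &v, std::mt19937 &rng)
{
    if (v.size() < 2)
        return; // nothing to shuffle

    using dist_t = std::uniform_int_distribution<std::size_t>;
    dist_t pick; // one distribution object, re-parameterised on each draw

    for (std::size_t i = v.size() - 1; i > 0; --i) {
        // Draw j uniformly from the closed range [0, i].
        const std::size_t j = pick(rng, dist_t::param_type(0, i));
        std::swap(v[i], v[j]);
    }
}
```

    Reusing a single engine across calls, rather than constructing and seeding a new one per shuffle, avoids paying the seeding cost each time.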
    

    Refer to the Boost documentation for a good overview of the Boost Random generator and distribution characteristics.

    Here is a sample of doing std::random_shuffle using Boost Random, directly:

    #include <algorithm>
    #include <functional>
    #include <vector>
    #include <boost/random.hpp>
    
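    // Adapter: std::random_shuffle calls rng(n) and expects a value in [0, n).
    struct Rng
    {
        Rng(boost::mt19937 &rng) : _rng(rng) {}

        unsigned operator()(unsigned i)
        {
            // Bounds are inclusive, hence i - 1.
            boost::uniform_int<> dist(0, i - 1);
            return dist(_rng);
        }

      private:
        boost::mt19937 &_rng; // held by reference so the engine state advances
    };

    // v is the container to shuffle, e.g. a std::vector<int>.
    boost::mt19937 state;
    std::random_shuffle(v.begin(), v.end(), Rng(state));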
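
    With a C++11 standard library, roughly the same thing can be sketched without Boost or an adapter struct, because std::shuffle accepts the engine directly. The name shuffle_in_place is hypothetical.

```cpp
#include <algorithm>
#include <random>
#include <vector>

// Hypothetical helper: shuffles v in place using the caller's engine.
void shuffle_in_place(std::vector<int> &v, std::mt19937 &rng)
{
    // std::shuffle draws the indices it needs from rng itself.
    std::shuffle(v.begin(), v.end(), rng);
}
```

    A caller would construct one std::mt19937, seed it once, and pass it to every call.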