pythonrandomseed

Replicating NumPy's SeedSequence-based random number generation in C++


I am working on converting a Gibbs sampler written in Python to C++. Though the code works fine, there is a reproducibility issue. The Python code produces the same output for a given set of seeds, whereas the C++ results are inconsistent across different runs.

The following snippet of code is used for generating the random numbers for different iterations in Python:

import numpy as np

def seq(seed):
    rng = np.random.default_rng(seed)
    for i in range(V):
        z = rng.multinomial(................)

def inference(seed):
    # one child seed per document
    child_states = seed.spawn(len(Docs))
    for i in range(len(Docs)):
        seq(child_states[i])

ss = np.random.SeedSequence()
child_states = ss.spawn(n)  # n is the number of iterations
for i in range(n):
    inference(child_states[i])

And the following is the snippet I tried in C++ using a pcg64 object:

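#include <random>
#include <gsl/gsl_rng.h>
#include <gsl/gsl_randist.h>
#include "pcg_random.hpp"

void inference() {
    // Seeded from std::random_device, so every run starts from a different seed.
    pcg_extras::seed_seq_from<std::random_device> seedSource;
    pcg64 rng(seedSource);
    for (int i = 0; i < m; i++) {
        gsl_rng_set(r, rng());  // r is the gsl_rng* allocated elsewhere
        for (int j = 0; j < V; j++) {
            gsl_ran_multinomial(..................)
        }
    }
}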

Is there a way to reproduce the same pseudo-random number generation steps in C++?

Update: for a small number of samples, I was able to resolve the reproducibility issue. What I tried is the following:

    std::vector<unsigned int> seed_values(10);
    std::generate(seed_values.begin(), seed_values.end(), std::ref(rd));
    std::seed_seq seq(seed_values.begin(), seed_values.end()); 

This generates a set of m seeds, which I then use to generate m sequences of seeds like this:

    std::seed_seq seq{seed};
    std::vector<std::uint64_t> seeds(m);
    seq.generate(seeds.begin(), seeds.end()); 

This works fine for a small number of samples, but when the number of samples is large, it seems that two samples are sometimes assigned similar random numbers. Could the width of the seed be the issue? In Python the seed is at least 128 bits, whereas in C++ it is at most 64 bits.


Solution

  • In C++ you typically initialise the pseudo-random number generators as follows:

    std::default_random_engine e((std::random_device())());
    

    which produces a random number generator with a more or less randomly chosen seed.

    If you replace this with a fixed seed, you will get the identical sequence of pseudo-random numbers no matter how often you run the compiled programme; try e.g.

    std::default_random_engine e(1012);
    std::uniform_int_distribution<int> uid(0, 1000);
    for(size_t i = 1; i <= 100; ++i)
    {
        std::cout << std::setw(3) << uid(e) << (i % 10 ? ' ' : '\n');
    }
    std::cout.flush();
    

    If you read the seed from a command line parameter, you can switch between sequences at will while still controlling when a run reproduces the sequence of an earlier run.
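
    A minimal sketch of that idea, using only the standard library. The helper names choose_seed and draw are made up for illustration, and std::mt19937 is swapped in for std::default_random_engine so that the engine itself is a fixed, known algorithm:

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstdlib>
#include <random>
#include <vector>

// Hypothetical helper: take the seed from argv[1] if one was given,
// otherwise fall back to a non-deterministic seed.
std::uint32_t choose_seed(int argc, const char* const argv[]) {
    if (argc > 1) {
        return static_cast<std::uint32_t>(std::strtoul(argv[1], nullptr, 10));
    }
    return std::random_device{}();
}

// Draw `count` integers in [0, 1000] from an engine seeded with `seed`.
std::vector<int> draw(std::uint32_t seed, std::size_t count) {
    std::mt19937 engine(seed);
    std::uniform_int_distribution<int> uid(0, 1000);
    std::vector<int> values;
    values.reserve(count);
    for (std::size_t i = 0; i < count; ++i) {
        values.push_back(uid(engine));
    }
    return values;
}

// Same seed, same build: the two sequences must agree value for value.
void check_same_seed_same_sequence(std::uint32_t seed) {
    assert(draw(seed, 100) == draw(seed, 100));
}
```

    Calling draw(choose_seed(argc, argv), 100) from main then repeats an earlier run whenever the same number is passed on the command line, and gives a fresh sequence when no argument is given.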

    C++ comes with quite a number of pre-implemented engines and distributions, see cppreference. If these include the same algorithm and distribution as the ones Python uses, there is a chance that you can even produce identical sequences in both Python and C++ for the same seed. There is no guarantee, though: differences in the details of the implementation might prevent it as well.
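
    To illustrate where the guarantee ends: the standard pins down the output of engines such as std::mt19937 exactly (it requires the 10000th output of a default-constructed std::mt19937 to be 4123659995), but leaves the algorithms behind the distributions to the implementation. NumPy's default_rng is documented as using PCG64, so matching it would also mean reproducing NumPy's seeding routine, which this sketch does not attempt. The function names below are made up, and the hand-written mapping is just one possible way to avoid the implementation-defined distributions:

```cpp
#include <cassert>
#include <cstdint>
#include <random>

// Return the 10000th output of a default-constructed std::mt19937.
std::uint_fast32_t ten_thousandth_output() {
    std::mt19937 engine;    // default seed, fixed by the standard
    engine.discard(9999);   // skip the first 9999 outputs
    return engine();
}

// Hand-written mapping from a raw 32-bit output to a double in [0, 1).
// Unlike std::uniform_real_distribution, it does not depend on which
// standard library the programme is built against.
double to_unit_interval(std::uint_fast32_t raw) {
    return static_cast<double>(raw & 0xFFFFFFFFu) / 4294967296.0;
}

// The engine output is the same on every conforming implementation.
void check_engine_is_portable() {
    assert(ten_thousandth_output() == 4123659995u);
}
```

    Anything built on the raw engine output plus your own arithmetic can be ported to another compiler or language; anything that goes through a std distribution (or gsl_ran_multinomial on one side and rng.multinomial on the other) should be expected to differ unless you verify otherwise.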

    The above uses the C++ standard random library (which I'd recommend over a third-party library), as I'm not familiar with the PCG library. If you prefer to continue with the latter: it uses a seed generator. I don't know how often it will call it, though it seems reasonable to assume just once. If you replace std::random_device in your code snippet with a custom class whose operator() always returns the same predefined seed (you might use a static variable of that class to control it via command line parameters), you might again reliably reproduce the same sequences (but you need to test that on your own...).
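
    Regarding the update in the question: std::seed_seq::generate produces 32-bit words, so each element of a std::vector<std::uint64_t> filled that way holds only a 32-bit value. If those values behave like uniform 32-bit numbers, repeated seeds become plausible once the number of samples reaches the tens of thousands. A sketch of one way around that with the standard library alone is to derive each engine from the master seed plus the indices of the sample, and to let std::seed_seq fill the whole engine state. The helper make_engine is made up for illustration; it only loosely mirrors SeedSequence.spawn and will not give the same numbers as NumPy:

```cpp
#include <cassert>
#include <cstdint>
#include <random>

// Hypothetical helper: build the engine for one (iteration, document) pair
// from a single master seed.
std::mt19937_64 make_engine(std::uint64_t master_seed,
                            std::uint32_t iteration,
                            std::uint32_t document) {
    // std::seed_seq consumes 32-bit words, so split the 64-bit master seed.
    std::seed_seq seq{
        static_cast<std::uint32_t>(master_seed),
        static_cast<std::uint32_t>(master_seed >> 32),
        iteration,
        document};
    return std::mt19937_64(seq);  // the full engine state is filled from seq
}

// Same inputs give the same engine state on every run.
void check_reproducible(std::uint64_t master_seed) {
    assert(make_engine(master_seed, 3, 7) == make_engine(master_seed, 3, 7));
}
```

    With this, make_engine(master, i, d) is the same in every run for a given master seed, while different (i, d) pairs are expected to give unrelated streams without having to store a table of per-sample seeds.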