Tags: python-3.x, random, numpy-random, complex-networks, bernoulli-numbers

How to efficiently perform billions of Bernoulli extractions using numpy?


I am working on a thesis about epidemiology, and I have to simulate an SI epidemic on a temporal network. At each time step, for every infected-susceptible pair, a Bernoulli(beta) extraction decides whether the infection is transmitted. I am using np.random.binomial(size=whatever, n=1, p=beta) to make the computer decide. Now, I have to simulate the epidemic on the same network, starting it once from each of the nodes. This should be repeated K times to get statistically relevant results for each node, and, since the temporal network is stochastic too, everything should be repeated NET_REALIZATION times.

So, in a network with N = 100 nodes, if K = 500 and NET_REALIZATION = 500, the epidemic should be simulated 25,000,000 times. If T = 100, that means 2,500,000,000 rounds of extractions, one per time step, each over the current set of S-I couples (which of course varies in time). If beta is small, which is often the case, this leads to a very time-consuming computation. Given that a single Bernoulli extraction takes about 3.63 µs on my computer, this means I have to wait hours to get any results, which is really limiting the development of my thesis. More than half of the running time is spent just on random extractions. I should use numpy, since the results of the extractions interact with other data structures. I tried numba, but it didn't seem to improve the speed of the extractions. Is there a faster way to get the same results? I was thinking about doing one very big extraction once and for all, something like 10^12 extractions of 0s and 1s, and just importing a part of them for each different simulation (this would have to be repeated for several values of beta), as in the sketch below, but I wonder if there's a smarter move.
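
Roughly, the batched idea I have in mind looks like this (beta, the names, and the sizes below are just placeholders for illustration):

    import numpy as np
    
    beta = 0.01        # placeholder transmission probability
    BATCH = 10**8      # how many extractions to pre-generate in one call
    
    # Pre-generate a big block of Bernoulli(beta) extractions at once
    pool = np.random.binomial(n=1, p=beta, size=BATCH)
    
    # Each simulation then consumes slices of the pool instead of calling
    # the generator again; `cursor` just tracks how much has been used.
    cursor = 0
    n_pairs = 1000     # S-I couples at the current time step
    draws = pool[cursor:cursor + n_pairs]
    cursor += n_pairs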

Thanks for the help.


Solution

  • If you can express your betas as multiples of 2^-N (for example, multiples of 1/256 if N is 8), then extract random N-bit chunks and check whether each chunk is less than beta * 2^N. This works better if 32 is evenly divisible by N.

    Note that numpy.random.uniform produces random floating-point numbers and can be expected to be slower than producing random integers or bits, especially because generating random floating-point numbers itself depends on generating random integers, not the other way around. (You can time this yourself; see the sketch after the first example below.)

    The following is an example of how this idea works.

    import numpy
    
    # Fixed seed for demonstration purposes
    rs = numpy.random.RandomState(777778)
    # Generate 10 integers in [0, 256)
    ri = rs.randint(0, 256, 10)
    # Now each integer x can be turned into, say, a Bernoulli(5/256)
    # variable which is 1 if x < 5, and 0 otherwise (x falls below the
    # threshold beta * 256 = 5 with probability 5/256).  I haven't
    # tested the following, which is similar to an example you gave
    # in a comment.
    rbern = (ri < 5) * 1
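
    If you want to check the floating-point vs. integer claim above on your own machine, a quick timing comparison along these lines (the array size and repeat count are arbitrary) should do:

    import timeit
    import numpy
    
    rs = numpy.random.RandomState(777778)
    n = 10**7
    
    # Compare generating n floats in [0, 1) with generating n small
    # integers in [0, 256)
    t_float = timeit.timeit(lambda: rs.uniform(size=n), number=10)
    t_int = timeit.timeit(lambda: rs.randint(0, 256, n), number=10)
    print(t_float, t_int)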
    

    If you can use NumPy 1.17 or later, the following alternative exists:

    import numpy
    
    # New-style Generator API (NumPy 1.17+); the default bit generator is PCG64
    rs = numpy.random.default_rng()
    # Generate 10 integers in [0, 256), as above
    ri = rs.integers(0, 256, 10)
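
    The threshold comparison carries over unchanged, and Generator.integers also accepts a tuple for size, so the decisions for a whole simulation can be drawn in one call. A small sketch with made-up sizes, still assuming beta = 5/256:

    T, n_pairs = 100, 1000   # hypothetical number of time steps and S-I couples
    # One Bernoulli(5/256) decision per time step and per S-I couple
    decisions = rs.integers(0, 256, size=(T, n_pairs)) < 5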
    

    Note also that NumPy 1.17 introduces a new random number generation system alongside the legacy one. It may well perform better when generating Bernoulli and binomial variables than the old one, especially because its default bit generator, PCG64, is lighter-weight than the legacy system's default, Mersenne Twister. The following is an example.

    import numpy
    
    beta = 5.0/256
    rs = numpy.random.default_rng()
    # One draw of a binomial(n=10, p=beta) variable; for Bernoulli draws
    # as in the question, use rs.binomial(1, beta, size=...) instead
    rbinom = rs.binomial(10, beta)
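
    If you want to check whether the new system is actually faster for your Bernoulli workload, a rough comparison along these lines (the array size and repeat count are arbitrary) would tell you:

    import timeit
    import numpy
    
    beta = 5.0/256
    n = 10**7
    
    legacy = numpy.random.RandomState(777778)
    newgen = numpy.random.default_rng(777778)
    
    # Time n Bernoulli(beta) draws under the legacy and the new system
    t_legacy = timeit.timeit(lambda: legacy.binomial(1, beta, n), number=10)
    t_new = timeit.timeit(lambda: newgen.binomial(1, beta, n), number=10)
    print(t_legacy, t_new)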