pythonrandomdistributionstochasticbeta-distribution

efficient sampling from beta-binomial distribution in python


for a stochastic simulation I need to draw a lot of random numbers which are beta binomial distributed.

At the moment I implemented it this way (using python):

import scipy as scp
from scipy.stats import rv_discrete

class beta_binomial(rv_discrete):
       """
       creating betabinomial distribution by defining its pmf
       """
       def _pmf(self, k, a, b, n):
          return scp.special.binom(n,k)*scp.special.beta(k+a,n-k+b)/scp.special.beta(a,b)

so sampling a random number x can be done by:

betabinomial = beta_binomial(name="betabinomial")
x = betabinomial.rvs(0.5,0.5,3) # with some parameter 

The problem is, that sampling one random number takes ca. 0.5ms, which is in my case dominating the whole simulation speed. The limiting element is the evaluation of the beta functions (or gamma functions within these).

Does anyone has a great idea how to speed up the sampling?


Solution

  • Well, here is working and lightly tested code which seems to be faster, using compound distribution property of Beta-Binomial.

    We sample p from beta and then using it as parameter for binomial. If you would sample large sized vectors, it would be even faster.

    import numpy as np
    
    def sample_Beta_Binomial(a, b, n, size=None):
        p = np.random.beta(a, b, size=size)
        r = np.random.binomial(n, p)
    
        return r
    
    np.random.seed(777777)
    q = sample_Beta_Binomial(0.5, 0.5, 3, size=10)
    print(q)
    

    Output is

    [3 1 3 2 0 0 0 3 0 3]
    

    Quick test

    np.random.seed(777777)
    
    n = 10
    a = 2.
    b = 2.
    N = 100000
    
    q = sample_Beta_Binomial(a, b, n, size=N)
    
    h = np.zeros(n+1, dtype=np.float64) # histogram
    for v in q: # fill it
        h[v] += 1.0
    
    h /= np.float64(N) # normalization
    print(h)
    

    prints histogram

    [0.03752 0.07096 0.09314 0.1114  0.12286 0.12569 0.12254 0.1127  0.09548 0.06967 0.03804]
    

    which is quite similar to green graph in the Wiki page on Beta-Binomial