pythonrscipybinomial-cdf

Binomial test in Python vs R


I am trying to re-implement a binomial test initialy developed in R with Python. However, I am not sure if I am using the right functionality.

In R, I get:

> binom.test (2, 8, 11/2364, alternative = "greater")
0.25

With Python & SciPy, I use

from scipy.stats import binom
binom.sf(2, 8, float(11)/float(2364))
5.5441613055814931e-06

In fact I have to do binom.sf(2, 8, float(11)/float(2364)) to make sure the third parameter is not 0 because of int division.

Why do the values differ? Do I have to specify the moments for Scipy / binom.sf? Should I use some other library?


Solution

  • Here's what I get in R:

    > binom.test(2, 8, 11/2364, alternative = "greater")
    
        Exact binomial test
    
    data:  2 and 8
    number of successes = 2, number of trials = 8, p-value = 0.0005951
    alternative hypothesis: true probability of success is greater than 0.00465313
    95 percent confidence interval:
     0.04638926 1.00000000
    sample estimates:
    probability of success 
                      0.25 
    
    >
    

    Note that the p-value is 0.0005951.

    Compare that to the result of scipy.stats.binom_test (which returns just the p-value):

    In [25]: from scipy.stats import binom_test
    
    In [26]: binom_test(2, 8, 11/2364, alternative='greater')
    Out[26]: 0.00059505960517880572
    

    So that agrees with R.

    To use the survival function of scipy.stats.binom, you have to adjust the first argument (as noted in a comment by Marius):

    In [27]: from scipy.stats import binom
    
    In [28]: binom.sf(1, 8, 11/2364)
    Out[28]: 0.00059505960517880572
    

    (I am using Python 3, so 11/2364 equals 0.004653130287648054. If you are using Python 2, be sure to write that fraction as 11.0/2364 or float(11)/2364.)