I am trying to re-implement a binomial test initialy developed in R
with Python
. However, I am not sure if I am using the right functionality.
In R
, I get:
> binom.test (2, 8, 11/2364, alternative = "greater")
0.25
With Python
& SciPy
, I use
from scipy.stats import binom
binom.sf(2, 8, float(11)/float(2364))
5.5441613055814931e-06
In fact I have to do binom.sf(2, 8, float(11)/float(2364))
to make sure the third parameter is not 0
because of int division.
Why do the values differ? Do I have to specify the moments for Scipy / binom.sf
?
Should I use some other library?
Here's what I get in R:
> binom.test(2, 8, 11/2364, alternative = "greater")
Exact binomial test
data: 2 and 8
number of successes = 2, number of trials = 8, p-value = 0.0005951
alternative hypothesis: true probability of success is greater than 0.00465313
95 percent confidence interval:
0.04638926 1.00000000
sample estimates:
probability of success
0.25
>
Note that the p-value is 0.0005951.
Compare that to the result of scipy.stats.binom_test
(which returns just the p-value):
In [25]: from scipy.stats import binom_test
In [26]: binom_test(2, 8, 11/2364, alternative='greater')
Out[26]: 0.00059505960517880572
So that agrees with R.
To use the survival function of scipy.stats.binom
, you have to adjust the first argument (as noted in a comment by Marius):
In [27]: from scipy.stats import binom
In [28]: binom.sf(1, 8, 11/2364)
Out[28]: 0.00059505960517880572
(I am using Python 3, so 11/2364
equals 0.004653130287648054
. If you are using Python 2, be sure to write that fraction as 11.0/2364
or float(11)/2364
.)