I'm trying to compare success rates in two samples. I'm treating each sample as a binomial distribution, then comparing the success rates with a two-tailed hypothesis test. (x1 is the number of successes in sample 1 and n1 is the total number of trials; similarly for sample 2.) I believe the code below closely follows the process described in this post
for performing a hypothesis test on the success rates of two groups. However, the pVal I'm getting is 1.82372486268966, which doesn't make sense as a probability. Is there something I'm misunderstanding about the theory behind this hypothesis test, or is there an error in my code below?
code:
import math
import scipy.stats as stats
x1=195
x2=5481
n1=135779
n2=81530
hatP = (n1*(x1/n1) + n2*(x2/n2))/(n1 + n2)
hatQ = 1 - hatP
hatP1 = x1/n1
hatP2 = x1/n2
Z = (hatP1 - hatP2)/(math.sqrt(hatP*hatQ*(1/n1 + 1/n2)))
pVal = 2*(1 - stats.norm.cdf(Z))
Z: -1.3523132192521408
pVal: 1.82372486268966
After correcting hatP2 = x1/n2
to hatP2 = x2/n2
, your statistic is approximately normally distributed under the null hypothesis that the two samples were drawn from the same binomial distribution.
import numpy as np
import scipy.stats as stats
import matplotlib.pyplot as plt
def statistic(x1, n1, x2, n2):
    hatP = (n1*(x1/n1) + n2*(x2/n2))/(n1 + n2)  # pooled proportion; equals (x1 + x2)/(n1 + n2)
    hatQ = 1 - hatP
    hatP1 = x1/n1
    hatP2 = x2/n2  # corrected from x1/n2
    Z = (hatP1 - hatP2)/np.sqrt(hatP*hatQ*(1/n1 + 1/n2))
    return Z
n1=135779
n2=81530
n_trials = 100000
proportion_h0 = 0.001 # for example
x1 = stats.binom(n=n1, p=proportion_h0).rvs(size=n_trials)
x2 = stats.binom(n=n2, p=proportion_h0).rvs(size=n_trials)
Z = statistic(x1, n1, x2, n2)
plt.hist(Z, density=True, bins=50, label='histogram')
plt.xlabel('statistic')
plt.ylabel('frequency density')
plt.title('Histogram of statistic under H0')
x = np.linspace(-5, 5, 300)
plt.plot(x, stats.norm.pdf(x), label='standard normal PDF')
plt.legend()
plt.show()
The calculation of the p-value depends on the alternative hypothesis. It looks like you are interested in the two-sided alternative hypothesis that the samples were not drawn from the same binomial distribution, in which case the p-value would be calculated as:
x1 = 195
x2 = 5481
Z = statistic(x1, n1, x2, n2)
# twice the smaller of the one-sided pvalues
pVal = 2*np.minimum(stats.norm.cdf(Z), stats.norm.sf(Z)) # ~0
In your case, the p-value underflows; it is too small to be represented in double precision.
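If you still want the magnitude of the p-value, you can compute its logarithm instead, which does not underflow; stats.norm.logsf gives the log of the survival function directly. A minimal sketch using the same counts:

```python
import numpy as np
import scipy.stats as stats

x1, n1 = 195, 135779
x2, n2 = 5481, 81530

# pooled proportion under H0 and the usual two-proportion z statistic
hatP = (x1 + x2) / (n1 + n2)
hatQ = 1 - hatP
Z = (x1/n1 - x2/n2) / np.sqrt(hatP*hatQ*(1/n1 + 1/n2))

# log of the two-sided p-value: log(2) + log(sf(|Z|)) stays finite
log_pVal = np.log(2) + stats.norm.logsf(abs(Z))
print(log_pVal)  # a large negative number; np.exp(log_pVal) would underflow to 0
```

This is useful for reporting, e.g., "p < 1e-1000" rather than just "p ≈ 0".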
Rather than writing a custom test, consider using scipy.stats.fisher_exact
, scipy.stats.barnard_exact
, or scipy.stats.boschloo_exact
.
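For example, Fisher's exact test takes the 2x2 contingency table of successes and failures per sample; a sketch with the same counts as above (note that barnard_exact and boschloo_exact can be very slow for samples this large, so fisher_exact is the practical choice here):

```python
import scipy.stats as stats

x1, n1 = 195, 135779
x2, n2 = 5481, 81530

# rows are samples, columns are successes / failures
table = [[x1, n1 - x1],
         [x2, n2 - x2]]
odds_ratio, p_value = stats.fisher_exact(table, alternative='two-sided')
print(odds_ratio, p_value)  # p_value is effectively 0 for these counts
```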