python-3.x, hypothesis-test, scipy.stats, binomial-cdf

binom_test results compared to hypothesis test with normally distributed test statistic


I have two samples with different numbers of trials and different numbers of successes. I'm trying to compare the success rate between the two samples to see if there is a significant difference. I get very different results depending on whether I use binom_test from scipy.stats or the function below, which assumes that the test statistic is normally distributed.

Can someone please tell me whether I'm applying binom_test incorrectly, or whether there's an error in the function below (or I'm using it incorrectly)?

I got the function from the post linked below; it seems like hatP might be incorrect.

I have sample data and the results from both binom_test and the function below. binom_test returns a p-value that is essentially 0, while the function returns a p-value of 1.82, which doesn't even make sense since a p-value cannot exceed 1.

The function is from this Cross Validated post: https://stats.stackexchange.com/questions/81091/is-it-possible-to-do-a-test-of-significance-for-a-string-occurrence-in-two-datas

# 2 sample binom


def fnDiffProp(x1, x2, n1, n2):
    '''
    inputs:
    x1: the number of successes in the first sample
    x2: the number of successes in the second sample
    n1: the total number of 'trials' in the first sample
    n2: the total number of 'trials' in the second sample
    output:
    the test statistic, and the p-value as a tuple
    '''
    
    import math
    import scipy.stats as stats
    
    hatP = (x1 + x2)/(n1 + n2)
    hatQ = 1 - hatP
    hatP1 = x1/n1
    hatP2 = x1/n2
    Z = (hatP1 - hatP2)/(math.sqrt(hatP*hatQ*(1/n1 + 1/n2)))
    pVal = 2*(1 - stats.norm.cdf(Z))
    return((Z, pVal))



Sample 1: 195 successes out of 135779 trials

Sample 2: 5481 successes out of 81530 trials


Results from binom_test (sample 2 tested against sample 1's observed success rate, 195/135779 ≈ 0.0014):

binom_test(x=5481, n=81530, p=0.0014, alternative='greater')

0.0

binom_test(x=5481, n=81530, p=0.0014, alternative='two-sided')

0.0


Results from fnDiffProp:

fnDiffProp(x1=195, x2=5481, n1=135779, n2=81530)

(-1.3523132192521408, 1.82372486268966)

Update:

I ran proportions_ztest from statsmodels and got the results below, which are similar to the results from binom_test. In the second test below I randomly drew equal-sized samples (80000 each) from both groups. In either case the p-value was so small it was rounded to 0.

from statsmodels.stats.proportion import proportions_ztest

number_of_successes = [5481, 195]
total_sample_sizes = [81530, 135779]
# Calculate z-test statistic and p-value
test_stat, p_value = proportions_ztest(number_of_successes, total_sample_sizes, alternative='larger')

print(str(test_stat))
print(str(p_value))

93.10329278601503
0.0


number_of_successes = [5389, 119]
total_sample_sizes = [80000, 80000]
# Calculate z-test statistic and p-value
test_stat, p_value = proportions_ztest(number_of_successes, total_sample_sizes, alternative='larger')

print(str(test_stat))
print(str(p_value))


72.26377467032772
0.0

Solution

  • Can someone please tell me whether I'm applying binom_test incorrectly, or whether there's an error in the function below (or I'm using it incorrectly)?

    scipy.stats.binom_test tests the null hypothesis that a given sample was drawn from a binomial distribution with a hypothesized probability of success. That is, it is a one-sample test that compares one sample against a fixed, hypothesized distribution.

    You said "I'm trying to compare the success rate between the two samples to see if there is a significant difference." So you want a two-sample test that assesses whether the two samples were drawn from the same binomial distribution (with unknown probability of success).

    The scenarios are quite different, so scipy.stats.binom_test cannot be used to solve your problem.
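
    To make the distinction concrete, here is a minimal sketch (my own illustration, not code from the question) using scipy.stats.binomtest, the current SciPy replacement for the deprecated binom_test, alongside proportions_ztest from statsmodels:

    from scipy import stats
    from statsmodels.stats.proportion import proportions_ztest

    # One-sample test (what binom_test/binomtest does): does sample 2's success
    # count look like a draw from a binomial with a fixed, hypothesized p?
    # Here p is sample 1's observed rate (~0.0014), but the test treats it as a
    # known constant rather than an estimate with its own sampling noise.
    res = stats.binomtest(k=5481, n=81530, p=195 / 135779, alternative='two-sided')
    print(res.pvalue)

    # Two-sample question (what you actually want): were both samples drawn from
    # binomial distributions with the same, unknown p? proportions_ztest pools
    # the two samples to estimate that common p.
    stat, pval = proportions_ztest([5481, 195], [81530, 135779], alternative='two-sided')
    print(stat, pval)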

    In your other question, I showed how your custom test fnDiffProp can be corrected to solve your problem. In fact, the corrected version produces a statistic and p-value identical to those of proportions_ztest. Other tests you could consider in this context are listed in that post:

    https://stackoverflow.com/a/77422932/6036253
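
    For completeness, here is a minimal sketch of a corrected fnDiffProp, written as the standard pooled two-proportion z-test (see the linked answer for the full discussion): hatP2 must use x2 rather than x1, and the two-sided p-value must be computed from |Z| so that it cannot exceed 1.

    import math
    import scipy.stats as stats

    def fnDiffProp(x1, x2, n1, n2):
        '''Pooled two-proportion z-test.
        x1, x2: successes in samples 1 and 2
        n1, n2: trials in samples 1 and 2
        Returns the test statistic Z and the two-sided p-value.
        '''
        hatP = (x1 + x2) / (n1 + n2)        # pooled success probability
        hatQ = 1 - hatP
        hatP1 = x1 / n1
        hatP2 = x2 / n2                     # was x1/n2 in the original
        Z = (hatP1 - hatP2) / math.sqrt(hatP * hatQ * (1/n1 + 1/n2))
        pVal = 2 * stats.norm.sf(abs(Z))    # two-sided; the original could exceed 1 for Z < 0
        return (Z, pVal)

    print(fnDiffProp(x1=195, x2=5481, n1=135779, n2=81530))

    With these fixes the statistic is about -93, which matches the 93.1 reported by proportions_ztest up to sign (the sign flips because the samples are passed in the opposite order), and the p-value underflows to 0.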