Tags: python, scipy, permutation, t-test, scipy.stats

Permutation-based alternative to scipy.stats.ttest_1samp


I would like to use a permutation-based alternative to scipy.stats.ttest_1samp to test whether the mean of my observations is significantly greater than zero. I stumbled upon scipy.stats.permutation_test, but I am not sure whether it can be used in my case. I also found mne.stats.permutation_t_test, which seems to do what I want, but I would like to stick to SciPy if I can.

Example:

import numpy as np
from scipy import stats

# create data
np.random.seed(42)
rvs = np.random.normal(loc=5, scale=5, size=100)

# one-sample t-test against a population mean of zero
t, p = stats.ttest_1samp(rvs, popmean=0, alternative='greater')

Solution

  • This test can be performed with permutation_test. With permutation_type='samples', it "permutes" the signs of the observations. Assuming data has been generated as above, the test can be performed as

    from scipy import stats

    def t_statistic(x, axis=-1):
        # vectorized one-sample t-statistic against popmean=0
        return stats.ttest_1samp(x, popmean=0, axis=axis).statistic

    # alternative='greater' matches the one-sided test in the question
    res = stats.permutation_test((rvs,), t_statistic,
                                 permutation_type='samples',
                                 alternative='greater')
    print(res.pvalue)
    

    If you only care about the p-value, you can get the same result with np.mean as the statistic instead of t_statistic.
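    To see why: sign flips leave each x_i² unchanged, so the sample standard deviation is determined by the mean, and the t-statistic is a monotone function of the mean; both statistics therefore rank the resamples identically. A minimal sketch (regenerating `rvs` so the snippet is self-contained, and fixing `random_state` so both calls draw the same sign flips):

    ```python
    import numpy as np
    from scipy import stats

    rng = np.random.default_rng(42)
    rvs = rng.normal(loc=5, scale=5, size=100)

    def t_statistic(x, axis=-1):
        # vectorized one-sample t-statistic against popmean=0
        return stats.ttest_1samp(x, popmean=0, axis=axis).statistic

    # same permutations in both calls thanks to the shared random_state
    res_t = stats.permutation_test((rvs,), t_statistic,
                                   permutation_type='samples',
                                   alternative='greater', random_state=0)
    res_mean = stats.permutation_test((rvs,), np.mean,
                                      permutation_type='samples',
                                      alternative='greater', random_state=0)
    print(res_t.pvalue, res_mean.pvalue)  # identical p-values
    ```

    The np.mean version is noticeably cheaper per resample, since it avoids recomputing the standard deviation.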

    Admittedly, this behavior for permutation_type='samples' with only one sample is a bit buried in the documentation.

    Accordingly, if data contains only one sample, then the null distribution is formed by independently changing the sign of each observation.
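    That sign-flipping scheme is simple enough to sketch by hand (a rough illustration, not the library's implementation; `rvs` regenerated here, `n_resamples` chosen to match permutation_test's default):

    ```python
    import numpy as np

    rng = np.random.default_rng(0)
    rvs = rng.normal(loc=5, scale=5, size=100)

    n_resamples = 9999
    # independently flip the sign of each observation in each resample
    signs = rng.choice([-1.0, 1.0], size=(n_resamples, rvs.size))
    null = np.mean(signs * rvs, axis=1)   # null distribution of the mean
    observed = np.mean(rvs)

    # one-sided p-value with the usual +1 correction
    pvalue = (np.sum(null >= observed) + 1) / (n_resamples + 1)
    print(pvalue)
    ```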

    But a test producing the same p-value could also be performed as a two-sample test in which the second sample is the negative of the data. To avoid special cases, that's actually what permutation_test does under the hood.
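    The equivalence can be checked directly: with permutation_type='samples' and two samples, permutation_test swaps paired observations between samples, and swapping x[i] with -x[i] is exactly a sign flip. A sketch under the assumption that the same random_state makes both calls draw the same permutations:

    ```python
    import numpy as np
    from scipy import stats

    rng = np.random.default_rng(42)
    rvs = rng.normal(loc=5, scale=5, size=100)

    def mean_of_first(x, y, axis=-1):
        # y is ignored; swapping x[i] with y[i] = -x[i] flips the sign of x[i]
        return np.mean(x, axis=axis)

    # one-sample formulation
    res_1 = stats.permutation_test((rvs,), np.mean,
                                   permutation_type='samples',
                                   alternative='greater', random_state=0)
    # paired two-sample formulation with the negated data
    res_2 = stats.permutation_test((rvs, -rvs), mean_of_first,
                                   permutation_type='samples',
                                   alternative='greater', random_state=0)
    print(res_1.pvalue, res_2.pvalue)
    ```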

    In this case, the example code above runs a lot faster than permutation_test currently does. I'll try to improve that for SciPy 1.10, though.