I would like to use a permutation-based alternative to `scipy.stats.ttest_1samp` to test whether the mean of my observations is significantly greater than zero. I stumbled upon `scipy.stats.permutation_test`, but I am not sure whether it can be used in my case. I also found `mne.stats.permutation_t_test`, which seems to do what I want, but I would like to stick to SciPy if I can.
Example:

```python
import numpy as np
from scipy import stats

# create data
np.random.seed(42)
rvs = np.random.normal(loc=5, scale=5, size=100)

# compute one-sample t-test
t, p = stats.ttest_1samp(rvs, popmean=0, alternative='greater')
```
This test can be performed with `permutation_test`. With `permutation_type='samples'`, it "permutes" the signs of the observations. Assuming the data have been generated as above, the test can be performed as:
```python
from scipy import stats

def t_statistic(x, axis=-1):
    return stats.ttest_1samp(x, popmean=0, axis=axis).statistic

# alternative='greater' matches the one-sided test in the question
res = stats.permutation_test((rvs,), t_statistic,
                             permutation_type='samples',
                             alternative='greater')
print(res.pvalue)
```
If you only care about the p-value, you can get the same result with `np.mean` as the statistic instead of `t_statistic`.
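For instance, a minimal sketch of that equivalence: sign flips leave each squared observation unchanged, so across resamples the t-statistic is a monotone function of the sample mean, and the two statistics rank the resamples identically.

```python
import numpy as np
from scipy import stats

# same data-generating setup as in the question
np.random.seed(42)
rvs = np.random.normal(loc=5, scale=5, size=100)

def t_statistic(x, axis=-1):
    return stats.ttest_1samp(x, popmean=0, axis=axis).statistic

res_t = stats.permutation_test((rvs,), t_statistic,
                               permutation_type='samples',
                               alternative='greater')

# np.mean accepts an `axis` argument, so permutation_test treats it
# as a vectorized statistic automatically
res_mean = stats.permutation_test((rvs,), np.mean,
                                  permutation_type='samples',
                                  alternative='greater')

print(res_t.pvalue, res_mean.pvalue)
```

With this data the effect is so large that both calls return the smallest attainable p-value, 1/(n_resamples + 1).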
Admittedly, this behavior of `permutation_type='samples'` with only one sample is a bit buried in the documentation:

> Accordingly, if `data` contains only one sample, then the null distribution is formed by independently changing the sign of each observation.
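To make that concrete, here is a hand-rolled sketch of the same null distribution: it draws random sign patterns (a Monte Carlo subset of the 2**100 possibilities, as `permutation_test` also does for samples this large) and uses the sample mean as the statistic.

```python
import numpy as np

rng = np.random.default_rng(0)
rvs = rng.normal(loc=5, scale=5, size=100)

observed = rvs.mean()

# independently flip the sign of each observation, 9999 times
signs = rng.choice([-1.0, 1.0], size=(9999, rvs.size))
null_dist = (signs * rvs).mean(axis=1)

# one-sided p-value, counting the observed statistic itself once
pvalue = (np.sum(null_dist >= observed) + 1) / (len(null_dist) + 1)
print(pvalue)
```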
But a test producing the same p-value could also be performed as a two-sample test in which the second sample is the negative of the data. To avoid special cases, that's actually what `permutation_test` does under the hood.
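A sketch of that two-sample formulation (the internal wrapping may differ in detail): with `permutation_type='samples'` and two paired samples, each permutation exchanges `x[i]` with `y[i] = -x[i]`, which is exactly a sign flip of `x[i]`.

```python
import numpy as np
from scipy import stats

np.random.seed(42)
rvs = np.random.normal(loc=5, scale=5, size=100)

def statistic(x, y, axis=-1):
    # swapping x[i] with y[i] == -x[i] flips the sign of x[i],
    # so the mean of x traces out the sign-flipping null distribution
    return np.mean(x, axis=axis)

res = stats.permutation_test((rvs, -rvs), statistic,
                             permutation_type='samples',
                             alternative='greater')
print(res.pvalue)
```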
In this case, the example code above is a lot faster than `permutation_test` right now. I'll try to improve that for SciPy 1.10, though.