I have two distributions with sizes of ~120 and ~86,000 elements. I would like to check whether the mean values of the two distributions are significantly different.
I found that I can use Welch's t-test for that, but this test still requires the distributions to be normal.
I used scipy.stats.normaltest() to check whether they are normally distributed, but the tests failed. However, I read that the test will almost always fail for large sample sizes and that the distributions don't have to be exactly normal.
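For concreteness, the normality check and Welch's t-test described above might look roughly like the sketch below. The arrays `small` and `large` are synthetic placeholders with the same sizes as my samples, not the real data, and the seed and distribution parameters are arbitrary assumptions.

```python
import numpy as np
from scipy import stats

# Placeholder data (assumption): stand-ins for the two real samples.
rng = np.random.default_rng(0)
small = rng.normal(loc=1.0, scale=2.0, size=120)
large = rng.normal(loc=0.0, scale=1.0, size=86_000)

# D'Agostino-Pearson normality test on each sample.
# A small p-value means normality is rejected.
for name, sample in (("small", small), ("large", large)):
    stat, p = stats.normaltest(sample)
    print(f"{name}: normaltest statistic={stat:.3f}, p={p:.3g}")

# Welch's t-test: equal_var=False drops the equal-variance assumption.
t_stat, p_value = stats.ttest_ind(small, large, equal_var=False)
print(f"Welch t={t_stat:.3f}, p={p_value:.3g}")
```

Passing `equal_var=False` is what turns `scipy.stats.ttest_ind` from Student's t-test into Welch's variant.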
How can I check whether my distributions are good enough for Welch's t-test, or are there any methods other than a t-test that I can use to determine whether the mean values of my two distributions are significantly different?
Here are the distributions in question:
You can try using the Mann–Whitney U test. It can be applied to non-normal distributions (it is a non-parametric test). It does make some assumptions, however, that you need to check:
Implementation: scipy.stats.mannwhitneyu.
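A minimal sketch of how the call might look, assuming two 1-D arrays of observations. The arrays `small` and `large` here are synthetic placeholders with sizes matching the question, not the asker's data. Keep in mind that the test works on ranks, so what it assesses is closer to whether values in one sample tend to be larger than in the other than to a difference in means as such.

```python
import numpy as np
from scipy import stats

# Placeholder data (assumption): skewed samples standing in for the real ones.
rng = np.random.default_rng(0)
small = rng.exponential(scale=1.5, size=120)
large = rng.exponential(scale=1.0, size=86_000)

# Two-sided Mann-Whitney U test; no normality assumption is needed.
u_stat, p_value = stats.mannwhitneyu(small, large, alternative="two-sided")
print(f"U={u_stat:.1f}, p={p_value:.3g}")
```

With samples this large, SciPy should use the normal approximation for the p-value rather than the exact distribution, so the call stays fast.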