python normal-distribution t-test statistical-test

How to check whether two mean values are significantly different?


I have two distributions with sample sizes of ~120 and ~86,000 elements. I would like to check whether the mean values of the two distributions are significantly different.

I found that I can use Welch’s t-test for that, but this test still requires that the distributions are normal.
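
For reference, a minimal sketch of how Welch’s t-test can be run in SciPy. The arrays `small` and `large` are synthetic stand-ins (only their sizes match the question), and the means and spreads are arbitrary assumptions chosen for illustration:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)

# Synthetic stand-ins for the two samples; only the sizes come from the question.
small = rng.normal(loc=1.0, scale=2.0, size=120)
large = rng.normal(loc=0.0, scale=1.0, size=86_000)

# equal_var=False selects Welch's t-test (unequal variances allowed)
# instead of the classic Student's t-test.
result = stats.ttest_ind(small, large, equal_var=False)
t_stat, p_value = result.statistic, result.pvalue

print(f"t = {t_stat:.3f}, p = {p_value:.4g}")
```

A small p-value here would suggest that the two means differ, subject to the test's assumptions.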

I used scipy.stats.normaltest() to check whether they are normally distributed, but the test rejected normality for both. However, I read that the test will almost always reject normality for large sample sizes and that the distributions don't have to be exactly normal.
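
The large-sample behaviour can be illustrated with a small sketch. The data below are hypothetical: 86,000 draws from a Student's t distribution with 20 degrees of freedom, which is close to normal but has slightly heavier tails. Sample skewness and excess kurtosis are printed alongside the p-value as a rough indication of how large the deviation actually is:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)

# Hypothetical, nearly normal data: Student's t with 20 degrees of freedom.
nearly_normal = rng.standard_t(df=20, size=86_000)

# D'Agostino-Pearson test; the null hypothesis is that the data are normal.
stat, p_value = stats.normaltest(nearly_normal)

# Descriptive measures of the deviation (both are 0 for an exact normal).
skewness = stats.skew(nearly_normal)
excess_kurtosis = stats.kurtosis(nearly_normal)

print(f"normaltest p = {p_value:.3g}")
print(f"skewness = {skewness:.3f}, excess kurtosis = {excess_kurtosis:.3f}")
```

With this many observations, even such a mild deviation tends to produce a very small p-value, so the magnitude of skewness and kurtosis (or a Q-Q plot) is usually more informative than the test result alone.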

How can I check if my distributions are close enough to normal for Welch’s t-test? Or are there methods other than a t-test that I can use to determine whether the mean values of my two distributions are significantly different?

Here are the distributions in question:

[image: plot of the two distributions]


Solution

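  • You can try the Mann–Whitney U test. It is a non-parametric test, so it can be applied to non-normal distributions. Note that it compares the two distributions via the ranks of the observations rather than testing the means directly. It also makes some assumptions that you need to check: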

    1. The observations are independent.
    2. The observations are at least ordinal (any two values can be compared and ranked).

    Implementation: scipy.stats.mannwhitneyu.
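
A minimal sketch of calling scipy.stats.mannwhitneyu, assuming two independent samples. The arrays `small` and `large` are synthetic, deliberately skewed stand-ins (only their sizes match the question):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(2)

# Synthetic, non-normal (exponential) stand-ins for the two samples.
small = rng.exponential(scale=2.0, size=120)
large = rng.exponential(scale=1.0, size=86_000)

# Two-sided test: the null hypothesis is that neither sample tends to
# produce larger values than the other.
result = stats.mannwhitneyu(small, large, alternative="two-sided")
u_stat, p_value = result.statistic, result.pvalue

print(f"U = {u_stat:.0f}, p = {p_value:.4g}")
```

A small p-value indicates that values from one sample tend to be larger than values from the other; it does not by itself say how much the means differ.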