python statistics scipy pearson

How do you compute the confidence interval for Pearson's r in Python?


In Python, I know how to calculate r and the associated p-value using scipy.stats.pearsonr, but I can't find a way to calculate the confidence interval of r. How is this done? Thanks for any help :)


Solution

  • According to [1], computing a confidence interval directly on Pearson's r is complicated because the sampling distribution of r is not normal. The following steps are needed:

    1. Convert r to z' using the Fisher transformation, z' = 0.5 * ln((1 + r) / (1 - r)).
    2. Calculate the confidence interval for z'. The sampling distribution of z' is approximately normal, with a standard error of 1/sqrt(n - 3), where n is the sample size.
    3. Convert the endpoints of that interval back to r.
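
    Written out as formulas, the three steps look roughly like this. The symbols alpha (the significance level, e.g. 0.05 for a 95% interval) and z_{1-alpha/2} (the standard normal quantile) are my notation, not taken from [1]:

```latex
\begin{align*}
z' &= \tfrac{1}{2}\ln\!\left(\frac{1+r}{1-r}\right) = \operatorname{artanh}(r),
  & \mathrm{SE}_{z'} &= \frac{1}{\sqrt{n-3}} \\
z'_{\mathrm{lo}} &= z' - z_{1-\alpha/2}\,\mathrm{SE}_{z'},
  & z'_{\mathrm{hi}} &= z' + z_{1-\alpha/2}\,\mathrm{SE}_{z'} \\
r_{\mathrm{lo}} &= \tanh\!\left(z'_{\mathrm{lo}}\right),
  & r_{\mathrm{hi}} &= \tanh\!\left(z'_{\mathrm{hi}}\right)
\end{align*}
```

    Because tanh is monotonic, the transformed endpoints stay ordered, and the resulting interval is generally not symmetric around r.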

    Here is some sample code:

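    import math

    from scipy import stats

    def r_to_z(r):
        # Fisher z-transformation (equivalent to math.atanh(r))
        return math.log((1 + r) / (1 - r)) / 2.0

    def z_to_r(z):
        # Inverse Fisher transformation (equivalent to math.tanh(z))
        e = math.exp(2 * z)
        return (e - 1) / (e + 1)

    def r_confidence_interval(r, alpha, n):
        # r: sample correlation; alpha: significance level (0.05 for a 95% CI);
        # n: number of paired observations (must be greater than 3)
        z = r_to_z(r)
        se = 1.0 / math.sqrt(n - 3)
        z_crit = stats.norm.ppf(1 - alpha / 2)  # 2-tailed z critical value

        lo = z - z_crit * se
        hi = z + z_crit * se

        # Return the (lower, upper) bounds, transformed back to the r scale
        return (z_to_r(lo), z_to_r(hi))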
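
    For an end-to-end picture, here is a minimal sketch that goes from raw data to r, the p-value, and the interval in one call. The helper name pearsonr_with_ci and the synthetic data are made up for illustration (they are not part of SciPy), and math.atanh / math.tanh stand in for the explicit log/exp formulas:

```python
import math

import numpy as np
from scipy import stats


def pearsonr_with_ci(x, y, alpha=0.05):
    """Return (r, p, lo, hi) using the Fisher z-transformation.

    Hypothetical helper for illustration; not a SciPy function.
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    n = len(x)
    if n <= 3:
        raise ValueError("need more than 3 paired observations")

    r, p = stats.pearsonr(x, y)

    z = math.atanh(r)                       # Fisher transform of r
    se = 1.0 / math.sqrt(n - 3)             # approximate standard error of z'
    z_crit = stats.norm.ppf(1 - alpha / 2)  # two-tailed normal critical value

    # Build the interval on the z' scale, then map it back to the r scale
    lo = math.tanh(z - z_crit * se)
    hi = math.tanh(z + z_crit * se)
    return r, p, lo, hi


# Synthetic, made-up data: y depends linearly on x plus noise
rng = np.random.default_rng(0)
x = rng.normal(size=50)
y = 0.6 * x + rng.normal(size=50)

r, p, lo, hi = pearsonr_with_ci(x, y)
print(f"r = {r:.3f}, p = {p:.3g}, 95% CI = ({lo:.3f}, {hi:.3f})")
```

    Note that this breaks down when r is exactly 1 or -1 (the transform is undefined there). As far as I know, recent SciPy releases (1.9 and later) also let you call confidence_interval() on the object returned by scipy.stats.pearsonr, which is worth checking against your installed version before rolling your own.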
    

    Reference:

    1. http://onlinestatbook.com/2/estimation/correlation_ci.html