python dataframe qqplot

Add log-log line in qq-plot


I have two datasets that I would like to compare in a Q-Q plot, together with a 45-degree reference line. I know how to do this for the untransformed data, but I cannot find a way to do it when I apply a log transformation to both datasets. Here is what I tried:

    import numpy as np
    import matplotlib.pyplot as plt

    # df is my existing pandas DataFrame
    x = df['column_1'].apply(np.log)
    y = df['column_2'].apply(np.log)
    
    plt.scatter(np.sort(x), np.sort(y), alpha=0.6, edgecolor='k', s=130)

    # Plot 45 degrees reference line
    dmin, dmax = np.min([x,y]), np.max([x,y])
    diag = np.linspace(dmin, dmax, 1000)
    plt.plot(diag, diag, color='red', linestyle='--')
    plt.gca().set_aspect('equal')
 
    plt.show()

Solution

  • You will probably need to set both the x and y axes to a log scale.

    Example:

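    import random

    import numpy as np
    import matplotlib.pyplot as plt

    n = 500

    # Synthetic, log-transformed samples standing in for the two columns
    x = np.log(np.linspace(10, 1000, num=n))
    y = np.log([random.randint(10, 1000) for _ in range(n)])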
    
    plt.scatter(np.sort(x), np.sort(y), alpha=0.6, edgecolor='k', s=130)
    
    
    # Plot 45 degrees reference line
    dmin, dmax = np.min([x,y]), np.max([x,y])
    diag = np.linspace(dmin, dmax, 1000)
    plt.plot(diag, diag, color='red', linestyle='--')
    plt.gca().set_aspect('equal')
    
    # Switch both axes to a log scale
    plt.yscale('log')
    plt.xscale('log')
    
    plt.show()
    

    Output:

    Figure
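
A possible variation on the example above: rather than log-transforming the data and then also log-scaling the axes, you could keep the raw, strictly positive values and let Matplotlib's log axes do the transformation, so the tick labels stay in the original units. The following is only a minimal sketch under that assumption. The helper name `qq_plot_log_axes` and the synthetic samples are hypothetical stand-ins for `df['column_1']` and `df['column_2']`, and it assumes both samples have the same length.

```python
import numpy as np
import matplotlib.pyplot as plt


def qq_plot_log_axes(a, b, ax=None):
    """Q-Q plot of two equal-length, strictly positive samples on log-log axes.

    Hypothetical helper for illustration only.
    """
    a = np.sort(np.asarray(a, dtype=float))
    b = np.sort(np.asarray(b, dtype=float))
    if a.shape != b.shape:
        raise ValueError("samples must have the same length")
    if a[0] <= 0 or b[0] <= 0:
        raise ValueError("log axes need strictly positive values")

    if ax is None:
        _, ax = plt.subplots()
    ax.scatter(a, b, alpha=0.6, edgecolor="k", s=130)

    # y = x reference line; geomspace spaces the points evenly in log space
    lo = min(a[0], b[0])
    hi = max(a[-1], b[-1])
    diag = np.geomspace(lo, hi, 1000)
    ax.plot(diag, diag, color="red", linestyle="--")

    # Let the axes apply the log transform instead of transforming the data
    ax.set_xscale("log")
    ax.set_yscale("log")
    ax.set_aspect("equal")
    return ax


# Synthetic stand-ins for the two DataFrame columns (assumed data)
rng = np.random.default_rng(0)
sample_1 = np.linspace(10, 1000, num=500)
sample_2 = rng.integers(10, 1001, size=500)
ax = qq_plot_log_axes(sample_1, sample_2)
```

Call `plt.show()` (or `ax.figure.savefig(...)`) afterwards to view the result. On log axes the line y = x still appears as a straight 45-degree diagonal, so it serves the same purpose as the reference line in the original snippet.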