r plot cumulative-frequency

Plotting cumulative distributions with y-axis scaled to normal distribution in R


This is the first time I have an R question that I couldn't already find on Stack Overflow. Forgive me if I found nothing only because there is a specific term for this type of plot that I'm not aware of (is there one?).

I'd like to display data as a cumulative frequency plot. Since my focus is on the tails of the distribution, it is helpful to scale the y-axis to a normal distribution. The result should look something like this: [image]

I've read about quantile-quantile plots, but honestly I can't figure out how to apply them if I want to keep the original data values on the x-axis.

I tried both base graphics and ggplot2, but couldn't get the y-axis scaled that way. What I have so far is the plain ECDF, for example

plot(ecdf(trees$Volume))

or

ggplot(data=trees, aes(Volume)) + stat_ecdf()

Solution

  • I think you are looking for the scales package and the probability_trans() function:

    Without transforming the y scales:

    library(ggplot2)
    
    ggplot(data = trees,
           aes(Volume)) + 
        stat_ecdf()
    

    With transformation of y axis:

    ggplot(data = trees,
           aes(Volume)) + 
        stat_ecdf() + 
        scale_y_continuous(trans = scales::probability_trans("norm"))
    

    You can read more about this in the documentation with ?probability_trans. The probability_trans() function takes the name of a standard R distribution (for example "norm", the suffix used in qnorm() and pnorm()) and uses it to scale your axis. You can also create a new transformation with trans_new() if you need something completely custom.
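As a rough sketch of the trans_new() route mentioned above, the normal probability scale can be written out by hand: qnorm() maps probabilities to z-scores and pnorm() maps them back. The names normal_prob_trans and tail_breaks, and the particular break values, are illustrative assumptions and not part of the original answer.

```r
library(ggplot2)
library(scales)

# Hand-written equivalent of probability_trans("norm"):
# transform = quantile function, inverse = distribution function.
# "normal_prob" is just a label chosen for this example.
normal_prob_trans <- trans_new(
  name      = "normal_prob",
  transform = qnorm,
  inverse   = pnorm
)

# Hypothetical break positions chosen to give more detail in the tails.
tail_breaks <- c(0.01, 0.05, 0.25, 0.5, 0.75, 0.95, 0.99)

p <- ggplot(data = trees, aes(Volume)) +
  stat_ecdf() +
  scale_y_continuous(trans = normal_prob_trans, breaks = tail_breaks)

# print(p) draws the plot. The ECDF reaches exactly 0 and 1 at its ends,
# which map to -Inf and Inf on this scale, so those end points cannot be drawn.
```

Passing breaks explicitly is optional; it only controls which probabilities are labelled on the y-axis, and the same argument should work with scales::probability_trans("norm").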