This is the first time I have an R question that I couldn't already find on Stack Overflow. Forgive me if I didn't find anything only because there is a specific term for what I'm looking for that I'm not aware of (is there one?).
I'd like to display data as a cumulative frequency. Since my focus is more on the tails of the distribution, it is helpful to scale the y-axis according to a normal distribution. The result should look something like this:
I've read about quantile-quantile plots, but honestly I can't figure out how to apply them if I want to preserve the X-axis.
I tried both base graphics and ggplot2, but can't figure it out. My current solution is therefore, for example:
plot(ecdf(trees$Volume))
or
ggplot(data=trees, aes(Volume)) + stat_ecdf()
I think you are looking for the scales package and its probability_trans() function:
Without transforming the y-axis:
require(ggplot2)
ggplot(data = trees,
aes(Volume)) +
stat_ecdf()
With transformation of y axis:
ggplot(data = trees,
aes(Volume)) +
stat_ecdf() +
scale_y_continuous(trans = scales::probability_trans("norm"))
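Since the question also mentions base graphics, here is a rough base R sketch of the same idea: plot the normal quantiles of the cumulative probabilities and relabel the axis in probability units. The plotting positions (i - 0.5) / n are my assumption, chosen so that no point lands exactly on 0 or 1 (where qnorm() is infinite); the tick probabilities are arbitrary as well.

```r
# Sorted data and cumulative probabilities
x <- sort(trees$Volume)
n <- length(x)
p <- (seq_len(n) - 0.5) / n  # assumed plotting positions, strictly inside (0, 1)

# Draw the step curve on the normal-quantile scale, without the default y axis
plot(x, qnorm(p), type = "s", yaxt = "n",
     xlab = "Volume", ylab = "Cumulative probability")

# Label the y axis in probabilities instead of z-scores
probs <- c(0.01, 0.05, 0.25, 0.5, 0.75, 0.95, 0.99)
axis(2, at = qnorm(probs), labels = probs)
```

The x-axis stays in the original units of Volume; only the y-axis is stretched towards the tails.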
You can read more about it in the documentation with ?scales::probability_trans.
The probability_trans() function takes the name of a standard R probability distribution (the part after the q/p prefix, e.g. "norm" for qnorm()/pnorm()) and scales your axis accordingly.
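To see what the transformation does to the values, you can call it directly. A small sketch, assuming the scales package is installed; the probabilities are arbitrary examples:

```r
tr <- scales::probability_trans("norm")

p <- c(0.025, 0.5, 0.975)
z <- tr$transform(p)  # forward direction: the quantile function, here qnorm()
round(z, 2)           # approximately -1.96, 0, 1.96

tr$inverse(z)         # back direction: the distribution function, here pnorm()
all.equal(z, qnorm(p))
```

So with "norm" the axis positions are simply the z-scores of the cumulative probabilities, which is why the tails get more room.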
You can also create a new transformation with trans_new() if you need something completely custom.
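As an illustration of a custom transformation, here is a sketch of a complementary log-log probability scale. The name cloglog_trans and the chosen breaks are made up for this example; trans_new() only needs a name, the transform and its inverse.

```r
library(ggplot2)

# Hypothetical custom scale: complementary log-log of the cumulative probability
cloglog_trans <- scales::trans_new(
  name      = "cloglog",
  transform = function(p) log(-log(1 - p)),
  inverse   = function(y) 1 - exp(-exp(y)),
  domain    = c(0, 1)
)

ggplot(data = trees, aes(Volume)) +
  stat_ecdf() +
  scale_y_continuous(trans = cloglog_trans,
                     breaks = c(0.01, 0.1, 0.5, 0.9, 0.99))  # example breaks
```

Setting breaks by hand is optional, but the default breaks are often sparse near 0 and 1 on a probability scale.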