rplotdistributionridgeline-plot

Using base R, how to create a "joy plot" (aka ridgeline plots), with many distributions on top of each other with vertical offset?


The type of plot I am trying to achieve in R seems to have been known as either as moving distribution, as joy plot or as ridgeline plot:

joy-plot

There is already a question in Stackoverflow whose recorded answer explains how to do it using ggplot: How to reproduce this moving distribution plot with R?

However, for learning purposes, I am trying to achieve the same using only base R plots (no lattice, no ggplot, no any plotting package).

In order to get started, I generated the following fake data to play with:

set.seed(2020)
shapes <- c(0.1, 0.5, 1, 2, 4, 5, 6)
dat <- lapply(shapes, function(x) rbeta(1000, x, x))
names(dat) <- letters[1:length(shapes)]

Then using mfrow I can achieve this:

par(mfrow=c(length(shapes), 1))
par(mar=c(1, 5, 1, 1))
for(i in 1:length(shapes))
{
    values <- density(dat[[names(dat)[i]]])
    plot(NA,
         xlim=c(min(values$x), max(values$x)),
         ylim=c(min(values$y), max(values$y)),
         axes=FALSE,
         main="",
         xlab="",
         ylab=letters[i])
    polygon(values, col="light blue")
}

The result I get is:

r-device-plot

Clearly, using mfrow (or even layout) here is not flexible enough and also does allow for the overlaps between the distributions.

Then, the question: how can I reproduce that type of plot using only base R plotting functions?


Solution

  • Here's a base R solution. First, we calculate all the density values and then manually offset off the y axis

    vals <- Map(function(x, g, i) {
      with(density(x), data.frame(x,y=y+(i-1), g))
    }, dat, names(dat), seq_along(dat))
    

    Then, to plot, we calculate the overall range, draw an empty plot, and the draw the densities (in reverse so they stack)

    xrange <- range(unlist(lapply(vals, function(d) range(d$x))))
    yrange <- range(unlist(lapply(vals, function(d) range(d$y))))
    plot(0,0, type="n", xlim=xrange, ylim=yrange, yaxt="n", ylab="", xlab="Value")
    for(d in rev(vals)) {
      with(d, polygon(x, y, col="light blue"))
    }
    axis(2, at=seq_along(dat)-1, names(dat))
    

    enter image description here