Tags: r, ggplot2, customization, multiple-databases, palette

Create a complex customized figure from multiple data frames in ggplot2


I am working on a project where I need to combine data from two different data frames into a single plot using ggplot2 in R. I was wondering how I can efficiently achieve this, assigning specific features to each dataset in the plot.

I have two data frames, each with different variables. I would like to overlay plots for df1 and df2 on a single graph while customizing plot features such as colors, shapes, and labels for each dataset.

I want my plot to feature two background ellipses, one in purple and the other in green, along with the centroid of each ellipse. These ellipses are generated using stat_ellipse and data from df2. In the foreground, I would like the points from df1 to be visible, but with a color gradient as specified in the following script:

# Create data frame df1
df1 <- data.frame(
  x = c(1, 2, 3, 4, 5),
  y = c(2, 4, 3, 6, 5),
  color = runif(5, 0, 1))


# Create data frame df2
df2 <- data.frame(
  x = c(1, 2, 3, 4, 5, 6, 7, 8, 9, 10),
  y = rnorm(10),
  group = rep(c('A', 'B'), each = 5))

To create the plot, I'm not sure whether the information should go directly into the 'ggplot()' code or, alternatively, into 'geom_point' and 'stat_ellipse'. Either way, I'm unsure how to adjust the colors separately for each component of the plot.

plotGG <- ggplot() +
  stat_ellipse(data = df2, aes(x = x, y = y, color = group)) + # I want these ellipses purple and green
  geom_point(data = df1, aes(x = x, y = y, color = color), size = 3) +
  scale_color_gradient2(midpoint = 0.5, low = "#ba1414ff", mid = "#f3f3b9ff", high = "#369121ff") # this is the color palette that I want for the geom_point layer

I have progressed to this point, but it generates errors one way or another. Thank you in advance!


Solution

  • In vanilla ggplot2 you can have only one scale per aesthetic (whereas you want two different color scales), and that scale is either discrete or continuous (whereas you want both a discrete and a continuous color scale).

    One option to achieve your desired result is the ggnewscale package, which allows multiple scales for the same aesthetic:

    set.seed(123)
    
    # Create data frame df1
    df1 <- data.frame(
      x = c(1, 2, 3, 4, 5),
      y = c(2, 4, 3, 6, 5),
      color = runif(5, 0, 1)
    )
    
    # Create data frame df2
    df2 <- data.frame(
      x = c(1, 2, 3, 4, 5, 6, 7, 8, 9, 10),
      y = rnorm(10),
      group = rep(c("A", "B"), each = 5)
    )
    
    library(ggplot2)
    library(ggnewscale)
    
    ggplot() +
      stat_ellipse(
        data = df2,
        aes(x = x, y = y, color = group)
      ) +
      scale_color_manual(
        values = c(A = "purple", B = "green")
      ) +
      ggnewscale::new_scale_color() +
      geom_point(
        data = df1,
        aes(x = x, y = y, color = color), size = 3
      ) +
      scale_color_gradient2(
        midpoint = 0.5,
        low = "#ba1414ff", mid = "#f3f3b9ff", high = "#369121ff"
      )
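
The question also asks for the centroid of each ellipse, which the plot above does not draw. A minimal sketch of one way to add it, assuming "centroid" means the per-group mean of x and y in df2 (the `centroids` data frame and the cross marker with `shape = 4` are illustrative choices, not part of the original answer). The centroid layer is placed before `new_scale_color()` so that it shares the purple/green manual scale with the ellipses:

```r
library(ggplot2)
library(ggnewscale)

set.seed(123)

df1 <- data.frame(
  x = c(1, 2, 3, 4, 5),
  y = c(2, 4, 3, 6, 5),
  color = runif(5, 0, 1)
)

df2 <- data.frame(
  x = c(1, 2, 3, 4, 5, 6, 7, 8, 9, 10),
  y = rnorm(10),
  group = rep(c("A", "B"), each = 5)
)

# Assumption: the centroid of a group is the mean of its x and y values
centroids <- aggregate(cbind(x, y) ~ group, data = df2, FUN = mean)

p <- ggplot() +
  # Background layers: ellipses and centroids share the discrete color scale
  stat_ellipse(data = df2, aes(x = x, y = y, color = group)) +
  geom_point(
    data = centroids,
    aes(x = x, y = y, color = group),
    shape = 4, size = 4, stroke = 1.5
  ) +
  scale_color_manual(values = c(A = "purple", B = "green")) +
  # Start a fresh color scale for the layers that follow
  new_scale_color() +
  # Foreground layer: df1 points on the continuous gradient
  geom_point(data = df1, aes(x = x, y = y, color = color), size = 3) +
  scale_color_gradient2(
    midpoint = 0.5,
    low = "#ba1414ff", mid = "#f3f3b9ff", high = "#369121ff"
  )

print(p)
```

Note that `stat_ellipse()` defaults to `type = "t"`, which centers the ellipse on a robust location estimate, so the plain mean may not sit exactly at the ellipse's center. If the two should coincide, `type = "norm"` is probably the closer match.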