Tags: r, ggplot2, data-visualization, parallel-coordinates

ggplot/GGally - Parallel Coordinates - y-axis labels


Does anyone know of a way to add text labels to a plot created with the ggparcoord function in GGally? I've tried several approaches with geom_text, but none of them produces any labels.

To be more explicit, I want to show row.names(mtcars) as text labels via geom_text. So far, the only way I can tell the cars apart is by passing row.names(mtcars) to the groupColumn argument, but I don't like how that looks.

Neither of these attempts works:

mtcars$carName <- row.names(mtcars) # This becomes column 12
library(GGally)
# Attempt 1
ggparcoord(mtcars, 
           columns = c(12, 1, 6), 
           groupColumn = 1) +
geom_text(aes(label = carName))

# Attempt 2
ggparcoord(mtcars, 
           columns = c(12, 1, 6),
           groupColumn = 1,
           mapping = aes(label = carName))

Any ideas would be appreciated!


Solution

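  • Solution 1: If you want to stick close to your original attempt, you can calculate the appropriate y coordinates for the car names and add them as a separate data source. ggparcoord() converts the character column carName to integer factor codes and, with its default scale = "std", standardizes each column, so scale(as.integer(factor(carName))) reproduces the y positions at which the lines start. Use inherit.aes = FALSE so that this geom_text layer doesn't inherit anything from the ggplot object created by ggparcoord():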

    # assumes GGally is loaded and mtcars$carName exists, as in the question
    library(dplyr)

    p1 <- ggparcoord(mtcars,
                     columns = c(12, 1, 6),
                     groupColumn = 1) +
      geom_text(data = mtcars %>%
                  select(carName) %>%
                  mutate(x = 1,
                         # as.numeric() drops the matrix attributes that scale() returns
                         y = as.numeric(scale(as.integer(factor(carName))))),
                aes(x = x, y = y, label = carName),
                hjust = 1.1,
                inherit.aes = FALSE) +
      # optional: remove "carName" from x-axis labels
      scale_x_discrete(labels = function(x) c("", x[-1])) +
      # also optional: hide legend, which doesn't really seem relevant here
      theme(legend.position = "none")
    p1
    

    [Figure: solution 1]
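
    To see why the rank-based y values in solution 1 come out evenly spaced, here is a minimal base R sketch. It takes the car names straight from row.names(mtcars); the variable names (car_names, ranks, y, step) are illustrative only.

    ```r
    # Car names as in the question
    car_names <- row.names(mtcars)

    # factor() orders the names alphabetically; as.integer() gives ranks 1..32
    ranks <- as.integer(factor(car_names))

    # scale() centers to mean 0 and divides by the standard deviation;
    # as.numeric() drops the matrix attributes it returns
    y <- as.numeric(scale(ranks))

    # Adjacent labels are separated by a constant step of 1 / sd(ranks)
    step <- unique(round(diff(sort(y)), 10))
    step
    ```

    Because every name is unique, the ranks are a permutation of 1 to 32, so the labels are spread uniformly along the first axis regardless of the cars' mpg or wt values.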

    Solution 2: This alternative uses carName as the group column and doesn't pass it as one of the parallel coordinate columns (which I think might be closer to the use cases this function was intended for). Specifying carName as the group column means the car names are captured in the data slot of the ggplot object created by ggparcoord(), so the text layer can inherit that data directly and even filter for only the rows where variable == "mpg" (or whatever the first parallel coordinate column is named in the actual use case). The y coordinates are not as evenly spread out as in solution 1, but geom_text_repel from the ggrepel package does a decent job of shifting overlapping text labels away from one another.

    library(dplyr)
    library(ggrepel)
    
    p2 <- ggparcoord(mtcars,
                     columns = c(1, 6),
                     groupColumn = "carName") +
      geom_text_repel(data = . %>%
                        filter(variable == "mpg"),
                      aes(x = variable, y = value, label = carName),
                      xlim = c(NA, 1)) + # limit repel region to the left of the 1st column
      # as before, hide legend since the labels are already in the plot
      theme(legend.position = "none")
    p2
    

    [Figure: solution 2]
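
    The data = . %>% filter(...) idiom in solution 2 works because a pipeline that starts with . is itself a function, and a ggplot2 layer accepts a function as its data argument and applies it to the plot's data. A minimal sketch with a made-up data frame (toy_long and keep_mpg are hypothetical names, not part of GGally):

    ```r
    library(dplyr)

    # Toy long-format data, shaped like the variable/value columns used above
    toy_long <- data.frame(
      carName  = c("A", "B", "A", "B"),
      variable = c("mpg", "mpg", "wt", "wt"),
      value    = c(0.5, -0.5, -1, 1)
    )

    # A pipeline starting with `.` builds a function rather than a data frame
    keep_mpg <- . %>% filter(variable == "mpg")

    is.function(keep_mpg)  # the pipeline is callable
    keep_mpg(toy_long)     # keeps only the rows where variable == "mpg"
    ```

    Inside a layer, ggplot2 makes the same call with the plot's own data, which is why the label layer never needs to reference the reshaped data frame by name.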

    Solutions 3 / 4: You can also draw the same plots with plain ggplot(), without relying on ggparcoord(), which may do unexpected things behind the scenes:

    library(dplyr)
    library(tidyr)
    library(ggrepel)

    # similar output to solution 1

    p3 <- mtcars %>%
      select(carName, mpg, wt) %>%
      # alphabetical rank of each name, used as the label axis
      mutate(carName.column = as.integer(factor(carName))) %>%
      gather(variable, value, -carName) %>%
      group_by(variable) %>%
      # standardize each variable separately; as.numeric() drops
      # the matrix attributes that scale() returns
      mutate(value = as.numeric(scale(value))) %>%
      ungroup() %>%

      ggplot(aes(x = variable, y = value, label = carName, group = carName)) +
      geom_line() +
      geom_text(data = . %>% filter(variable == "carName.column"),
                hjust = 1.1) +
      scale_x_discrete(labels = function(x) c("", x[-1]))
    p3

    # similar output to solution 2

    p4 <- mtcars %>%
      select(carName, mpg, wt) %>%
      gather(variable, value, -carName) %>%
      group_by(variable) %>%
      mutate(value = as.numeric(scale(value))) %>%
      ungroup() %>%

      ggplot(aes(x = variable, y = value, label = carName, group = carName)) +
      geom_line() +
      geom_text_repel(data = . %>% filter(variable == "mpg"),
                      xlim = c(NA, 1))
    p4
    

    [Figure: solutions 3 / 4]
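
    gather() still works but is marked as superseded in current tidyr releases. As a sketch, assuming tidyr 1.0.0 or later, the reshaping step in solutions 3 and 4 could be written with pivot_longer() instead. The names cars and cars_long are hypothetical, and the rest of each pipeline would stay the same:

    ```r
    library(dplyr)
    library(tidyr)

    # Work on a copy so the built-in dataset is left untouched
    cars <- mtcars
    cars$carName <- row.names(cars)

    cars_long <- cars %>%
      select(carName, mpg, wt) %>%
      # one row per car and variable, like gather(variable, value, -carName)
      pivot_longer(-carName, names_to = "variable", values_to = "value") %>%
      group_by(variable) %>%
      # standardize each variable separately
      mutate(value = as.numeric(scale(value))) %>%
      ungroup()

    head(cars_long)
    ```

    The row order differs from what gather() returns, but that has no effect on the plot, since lines are drawn per carName group.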

    Edit

    You can add text labels on the right as well, for each of the above. Note that these labels may not be evenly spaced, since they are positioned according to wt's scaled values:

    p1 +
      geom_text(data = mtcars %>%
                  select(carName, wt) %>%
                  mutate(x = 3,
                         # as.numeric() drops the matrix attributes from scale()
                         y = as.numeric(scale(wt))),
                aes(x = x, y = y, label = carName),
                hjust = -0.1,
                inherit.aes = FALSE)
    
    p2 +
      geom_text_repel(data = . %>%
                        filter(variable == "wt"),
                      aes(x = variable, y = value, label = carName),
                      xlim = c(2, NA))
    
    p3 +
      geom_text(data = . %>% filter(variable == "wt"),
                hjust = -0.1)
    
    p4 +
      geom_text_repel(data = . %>% filter(variable == "wt"),
                      xlim = c(2, NA))
    

    [Figure: combined plots]
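
    The uneven spacing on the right-hand side can be checked directly. This base R sketch (y_right and gaps are illustrative names) looks at the gaps between neighboring scaled wt values; a gap of zero means two labels are anchored at exactly the same height, which geom_text cannot separate but geom_text_repel can nudge apart:

    ```r
    # Scaled wt values, as used for the right-hand labels
    y_right <- sort(as.numeric(scale(mtcars$wt)))

    # Vertical distance between each label and the next one up
    gaps <- diff(y_right)

    summary(gaps)    # spread of the gaps between neighboring labels
    sum(gaps == 0)   # number of exact ties among neighboring labels
    ```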