
Can ggplot format points with '<' values on-the-fly or do I need a separate data column?


How can I handle LOQ (limit of quantification) values in ggplot? I would like to use the original data:

Parameter Concentration
Fe <2
Fe 12
Fe 7

and just mark the <-values in some way, either with a "<" or a different colour.

I realize that having a "<" turns my numeric data into string data. Is there a way to convert and format the plotted point on the fly, or must I create a supplementary column for the <-values? I guess formatting an individual <-point in the plot will require a separate column anyway? And perhaps ggplot cannot evaluate each individual value to check for "<"?


Solution

  • Easier to post answer here rather than comment. I've added two columns using base R to give you a couple of options:

    library(ggplot2)
    
    # Sample data
    df <- data.frame(Parameter = "Fe",
                     Concentration = c("<2", "7", "12", "5", "10", "11", "6", "4", "8", "<3", "13"))
    
    # Create integer copy of Concentration
    df$LOQ <- as.integer(gsub("<", "", df$Concentration))
    # Create column to identify < integers
    df$less_than <- ifelse(grepl("^<", df$Concentration), "<", NA)
    
    # Use new columns to plot data and identify < values
    ggplot() +
      geom_point(data = df, aes(x = Parameter, y = LOQ)) +
      # colour is a fixed setting, so it goes outside aes(); no legend is created
      geom_text(data = df, aes(x = Parameter, y = LOQ, label = less_than),
                colour = "red", hjust = -1)
    

    Note that ggplot2 will emit this warning when the plot is drawn:

    Warning message: Removed 9 rows containing missing values (geom_text()).

    You can safely ignore this. It just means that the 9 rows with NA in the "less_than" column were not labelled by geom_text(); passing na.rm = TRUE to geom_text() silences the warning. And the result:

    [Image: result]
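    As an alternative to ignoring the warning, the labels can be drawn from a subset that contains only the flagged rows. This is a minimal sketch that reuses the columns created above; the name `censored` is just an illustrative choice, not part of the original answer.

    ```r
    library(ggplot2)

    # Same sample data as above (values below the LOQ are stored as "<n" strings)
    df <- data.frame(Parameter = "Fe",
                     Concentration = c("<2", "7", "12", "5", "10", "11", "6", "4", "8", "<3", "13"))
    df$LOQ <- as.integer(gsub("<", "", df$Concentration))
    df$less_than <- ifelse(grepl("^<", df$Concentration), "<", NA)

    # Keep only the rows that carry a "<" flag, so geom_text() never sees NA labels
    # ("censored" is a hypothetical name used for this sketch)
    censored <- df[!is.na(df$less_than), ]

    p <- ggplot(df, aes(x = Parameter, y = LOQ)) +
      geom_point() +                                   # all points
      geom_text(data = censored, aes(label = less_than),
                colour = "red", hjust = -1)            # labels for "<" rows only
    p
    ```

    Because the text layer gets its own data, no rows need to be dropped when the plot is drawn.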

    If you want to plot the < values using a different colour, change the values in the "less_than" column to something like the example below:

    # Create column to identify < integers
    df$less_than <- ifelse(grepl("^<", df$Concentration), "<", "=")
    
    ggplot() +
      geom_point(data = df, aes(x = Parameter, y = LOQ, colour = less_than)) +
      # Name the values so each colour is tied to its category explicitly
      scale_colour_manual(values = c("<" = "red", "=" = "blue")) +
      theme(legend.position = "none")
    

    Result:

    [Image: result2]
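    On the "on-the-fly" part of the question: aes() accepts expressions, not just column names, so the conversion and the "<" check can be written directly in the mapping without adding columns. The following is only a sketch under that assumption; the data frame name `raw` and the legend title are made up for illustration.

    ```r
    library(ggplot2)

    # Raw data only: Concentration stays a character column with "<" prefixes
    # ("raw" is a hypothetical name used for this sketch)
    raw <- data.frame(Parameter = "Fe",
                      Concentration = c("<2", "7", "12", "5", "10", "11", "6", "4", "8", "<3", "13"))

    # The expressions inside aes() are evaluated per row when the plot is built:
    #   y      strips a leading "<" and converts the rest to a number
    #   colour is TRUE for values that were reported as "<n"
    p <- ggplot(raw, aes(x = Parameter,
                         y = as.numeric(sub("^<", "", Concentration)),
                         colour = grepl("^<", Concentration))) +
      geom_point() +
      scale_colour_manual(values = c("TRUE" = "red", "FALSE" = "blue")) +
      labs(y = "Concentration", colour = "Below LOQ")
    p
    ```

    This keeps the original data untouched, at the cost of a busier aes() call. If the same conversion is needed in several layers or plots, the extra columns shown above are probably easier to maintain.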