How can I handle LOQ values in ggplot? I would like to use the original data:
Parameter | Concentration |
---|---|
Fe | <2 |
Fe | 12 |
Fe | 7 |
and just mark the <-values in some way, either with a "<" or a different colour.
I realize that having a "<" turns my numeric data into string data. Is there a way to convert and format the plotted point on-the-fly, or must I create a supplementary column with the <-value? I guess formatting the individual <-point in the plot will require a separate column anyways? And perhaps ggplot cannot evaluate each individual value to check for "<"?
Easier to post answer here rather than comment. I've added two columns using base R to give you a couple of options:
library(ggplot2)
# Sample data
df <- data.frame(Parameter = "Fe",
Concentration = c("<2", "7", "12", "5", "10", "11", "6", "4", "8", "<3", "13"))
# Create integer copy of Concentration
df$LOQ <- as.integer(gsub("<", "", df$Concentration))
# Create column to identify < integers
df$less_than <- ifelse(grepl("^<", df$Concentration), "<", NA)
# Use new columns to plot data and identify < values
ggplot() +
geom_point(data = df, aes(x = Parameter, y = LOQ)) +
# colour is a fixed setting here, so it goes outside aes()
geom_text(data = df, aes(x = Parameter, y = LOQ,
label = less_than),
colour = "red",
hjust = -1) +
theme(legend.position = "none")
Note that ggplot2 will throw this warning when the plot is drawn:
Warning message: Removed 9 rows containing missing values (geom_text()).
You can safely ignore this; it just means the rows with NA in the "less_than" column were dropped by ggplot2. And the result:
If you want to plot the < values using a different colour, change the values in the "less_than" column to something like the example below:
# Create column to identify < integers
df$less_than <- ifelse(grepl("^<", df$Concentration), "<", "=")
ggplot() +
geom_point(data = df, aes(x = Parameter, y = LOQ, colour = less_than)) +
# Name the values so each colour is tied to its level regardless of sort order
scale_colour_manual(values = c("<" = "red", "=" = "blue")) +
theme(legend.position = "none")
Result:
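On the "on-the-fly" part of the question: expressions inside aes() are evaluated against the data, so the helper columns are optional. Below is a minimal sketch of that idea, assuming a shortened version of the sample data with every value stored as a string. It uses as.numeric() rather than as.integer() so that a limit such as "<0.5" would not be truncated; that is my assumption about typical LOQ data, not something stated in the question.

```r
library(ggplot2)

# Shortened sample data; Concentration stays a character column
df <- data.frame(Parameter = "Fe",
                 Concentration = c("<2", "7", "12", "5", "<3", "13"))

# Strip "<" for the y position and flag the "<" values for colour,
# both computed inside aes() instead of in extra columns
p <- ggplot(df, aes(x = Parameter,
                    y = as.numeric(sub("<", "", Concentration, fixed = TRUE)),
                    colour = grepl("^<", Concentration))) +
  geom_point() +
  # grepl() returns TRUE/FALSE, so the colours are named by those levels
  scale_colour_manual(values = c("TRUE" = "red", "FALSE" = "blue")) +
  labs(y = "Concentration", colour = "Below LOQ")

p
```

Separate columns are still easier to read and reuse once more than one layer needs the same flag, so this is mostly useful for quick plots.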