I have some data with a bunch of x values paired with different y values. I want to plot it and highlight the max y value for each x with a different color or line or something, as in the image.
library(tidyverse)
x <- c(1,1,1,1,1, 2,2,2, 3,3,3,3,3, 4,4, 5, 6,6,6,6,6,6)
y <- c(10,11,12,13,14, 22,20,21, 8,12,14,15,18, 9,10, 5, 22,21,20,9,7,5)
df <- data.frame(x,y)
positions <- unique(x)
maxes <- c(14,22,18,10,5,22)
specials <- data.frame(positions,maxes)
ggplot(df,aes(x,y))+geom_point()+geom_line(data=specials,aes(x=positions,y=maxes))
My actual dataset is huge and it's not practical to manually hunt for these maxima. How can I achieve the desired result programmatically?
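This may work for you: compute the maximum y for each x with group_by() and summarise(), then draw those points as a separate layer.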
library(dplyr)
library(ggplot2)

# Maximum y for each x
df2 <- df |>
  group_by(x) |>
  summarise(y = max(y))

ggplot(df, aes(x, y)) +
  # Connect the computed maxima (df2) instead of the hand-typed `specials`
  geom_line(data = df2) +
  geom_point(size = 5) +
  # Highlight the maxima in red
  geom_point(data = df2,
             pch = 21, fill = "red", color = "black", size = 5) +
  theme_minimal() +
  # x is numeric, so use a continuous scale with one break per x value
  scale_x_continuous(breaks = 1:6) +
  scale_y_continuous(limits = c(0, 25)) +
  labs(x = "Variable Name X", y = "Variable Name Y")
Output: [image: scatter plot with the maximum y for each x highlighted in red and joined by a line]
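If you would rather keep everything in one data frame, a variant is to flag the maxima with a grouped mutate() and map colour to that flag. This is only a sketch built on the sample data from the question; the names `df_flagged`, `is_max`, `maxima` and `p` are made up for illustration, and it needs R 4.1 or later for the native pipe.

```r
library(dplyr)
library(ggplot2)

# Sample data from the question
x <- c(1,1,1,1,1, 2,2,2, 3,3,3,3,3, 4,4, 5, 6,6,6,6,6,6)
y <- c(10,11,12,13,14, 22,20,21, 8,12,14,15,18, 9,10, 5, 22,21,20,9,7,5)
df <- data.frame(x, y)

# Flag the row(s) holding the largest y within each x group
# (if a group has tied maxima, every tied row is flagged)
df_flagged <- df |>
  group_by(x) |>
  mutate(is_max = y == max(y)) |>
  ungroup()

# Only the flagged rows, used to draw the connecting line
maxima <- filter(df_flagged, is_max)

p <- ggplot(df_flagged, aes(x, y)) +
  geom_line(data = maxima) +
  geom_point(aes(colour = is_max), size = 3) +
  scale_colour_manual(values = c("FALSE" = "black", "TRUE" = "red"),
                      name = "Group maximum")
p
```

Compared with overlaying a second data frame, this keeps a legend entry for the highlighted points. The trade-off is that tied maxima within a group all get flagged, so the line layer may need a summarise() step if that matters for your data.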