In the sample data frame df, pred_value and real_value are the monthly predicted and actual values of a variable, and acc_level is the accuracy level of each prediction relative to the actual value for the corresponding month. The smaller the acc_level, the more accurate the prediction:
df <- structure(list(date = structure(c(1L, 1L, 1L, 1L, 1L, 2L, 2L,
2L, 2L, 3L, 3L, 3L, 3L), .Label = c("2022/3/31", "2022/4/30",
"2022/5/31"), class = "factor"), pred_value = c(2721.8, 2721.8,
2705.5, 2500, 2900.05, 2795.66, 2694.45, 2855.36, 2300, 2799.82,
2307.36, 2810.71, 3032.91), real_value = c(2736.2, 2736.2, 2736.2,
2736.2, 2736.2, 2759.98, 2759.98, 2759.98, 2759.98, 3000, 3000,
3000, 3000), acc_level = c(1L, 1L, 2L, 3L, 3L, 1L, 2L, 2L, 3L,
2L, 3L, 2L, 1L)), class = "data.frame", row.names = c(NA, -13L
))
Out:
date pred_value real_value acc_level
1 2022/3/31 2721.80 2736.20 1
2 2022/3/31 2721.80 2736.20 1
3 2022/3/31 2705.50 2736.20 2
4 2022/3/31 2500.00 2736.20 3
5 2022/3/31 2900.05 2736.20 3
6 2022/4/30 2795.66 2759.98 1
7 2022/4/30 2694.45 2759.98 2
8 2022/4/30 2855.36 2759.98 2
9 2022/4/30 2300.00 2759.98 3
10 2022/5/31 2799.82 3000.00 2
11 2022/5/31 2307.36 3000.00 3
12 2022/5/31 2810.71 3000.00 2
13 2022/5/31 3032.91 3000.00 1
I've plotted the predicted values with the code below:
library(ggplot2)
ggplot(df, aes(x=date, y=pred_value, color=acc_level)) +
geom_point(size=2, alpha=0.7, position=position_jitter(w=0.1, h=0)) +
theme_bw()
Beyond the plot above, how can I also show the actual values for each month as a red line with red points? Thanks.
Reference:
How to add 4 groups to make Categorical scatter plot with mean segments?
We can add the actuals with additional layers. For the line to show up, we need to tell ggplot that the points belong to the same series, which is what group = 1 does.
Because the x axis is discrete, ggplot by default treats each date as its own group, so geom_line has nothing to connect. Alternatively, we could convert the date variable to a Date type, e.g. with aes(x=as.Date(date)...
library(ggplot2)
ggplot(df, aes(x=date, y=pred_value, color=as.factor(acc_level))) +
geom_point(size=2, alpha=0.7, position=position_jitter(w=0.1, h=0)) +
geom_point(aes(y = real_value), size=2, color = "red") +
geom_line(aes(y = real_value, group = 1), color = "red") +
scale_color_manual(values = c("yellow", "magenta", "cyan"),
name = "Acc Level") +
theme_bw()
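The alternative mentioned above, converting date to a Date type, could look roughly like the sketch below. It is only one way to do it and makes a few assumptions that are not in the original post: the data is rebuilt compactly with data.frame() instead of the dput() output, a helper data frame named actuals (a hypothetical name) holds one row per month so each red point is drawn once, and the jitter width is set to 2 because on a Date axis the width is measured in days.

```r
library(ggplot2)

# Same values as the question's df, rebuilt compactly (assumption: equivalent data).
df <- data.frame(
  date = rep(c("2022/3/31", "2022/4/30", "2022/5/31"), times = c(5, 4, 4)),
  pred_value = c(2721.8, 2721.8, 2705.5, 2500, 2900.05, 2795.66, 2694.45,
                 2855.36, 2300, 2799.82, 2307.36, 2810.71, 3032.91),
  real_value = rep(c(2736.2, 2759.98, 3000), times = c(5, 4, 4)),
  acc_level = c(1L, 1L, 2L, 3L, 3L, 1L, 2L, 2L, 3L, 2L, 3L, 2L, 1L)
)

# Parse the "YYYY/M/D" strings into Date so the x axis becomes continuous.
df$date <- as.Date(as.character(df$date), format = "%Y/%m/%d")

# One row per month for the actual values ("actuals" is a hypothetical helper).
actuals <- unique(df[, c("date", "real_value")])

p <- ggplot(df, aes(x = date, y = pred_value)) +
  # Predictions: jittered horizontally (width is in days here), colored by level.
  geom_point(aes(color = as.factor(acc_level)), size = 2, alpha = 0.7,
             position = position_jitter(width = 2, height = 0)) +
  # Actuals: no group = 1 needed, since a continuous x axis forms one series.
  geom_line(data = actuals, aes(y = real_value), color = "red") +
  geom_point(data = actuals, aes(y = real_value), color = "red", size = 2) +
  # Put axis breaks on the three month-end dates only.
  scale_x_date(breaks = actuals$date, date_labels = "%Y/%m/%d") +
  scale_color_manual(values = c("yellow", "magenta", "cyan"),
                     name = "Acc Level") +
  theme_bw()
p
```

With a Date axis the spacing between months reflects the real calendar distance, which may or may not be what you want; the discrete version with group = 1 keeps the months evenly spaced.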