I want to plot multiple time-series variables in the same plot so that I can see how their lags line up. I will have one plot per group of data, and the variables have the same names in every group.
I could do it individually, one variable at a time. But I have like a hundred variables.
Here is an example of my dataset, where lag is the lag point (from ccf()) and each var is the ACF of a different variable at that lag point.
sampledf <- data.frame(
lag = -10:10,
var1 = rnorm(21),
var2 = rnorm(21),
var3 = rnorm(21)
)
Now I could plot them fairly easily like this:
plot(sampledf$lag,
sampledf$var1,
type = "l",
col = 1,
xlab = "Lag",
ylab = "ACF")
lines(sampledf$lag,
sampledf$var2,
type = "l",
col = 2)
lines(sampledf$lag,
sampledf$var3,
type = "l",
col = 3)
legend("topright",
c("Var1", "Var2", "Var3"),
lty = 1,
col = 1:3)
But then I am doing each variable manually. And if I wanted to view the correlations in a different way, for example one plot with var1 from each of sampledf1, sampledf2, through sampledf20, I would have to start all over.
Is there a way to automate this in fewer lines of code? This is just beyond my level of R programming but I realise this probably has something to do with functions and things (R is mainly a "stats" tool in my work).
I'm also open to different functions to view cross-correlations in different ways if there's a completely different (yet easier) way to achieve this.
You will likely very quickly run out of colors, but you can do this:
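vars <- names(sampledf)[-1]  # every column except lag
cols <- seq_along(vars)
# Empty plot; ylim spans all variables so that no line is clipped
plot(sampledf$lag,
sampledf$var1,
type = "n",
ylim = range(unlist(sampledf[vars])),
xlab = "Lag",
ylab = "ACF")
# Draw one line per variable; invisible() suppresses the printed list of NULLs
invisible(Map(function(y, col) lines(sampledf$lag, y, col = col),
sampledf[vars], cols))
legend("topright",
vars,
lty = 1,
col = cols)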
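If you would rather stay in base R, matplot() can draw every column in one call and picks the y-axis range from all of them. A minimal sketch, assuming every column other than lag should be plotted (the helper names vars and cols are mine, not from the question):

```r
set.seed(1)  # only so the made-up example data is reproducible
sampledf <- data.frame(
  lag = -10:10,
  var1 = rnorm(21),
  var2 = rnorm(21),
  var3 = rnorm(21)
)

vars <- names(sampledf)[-1]  # every column except lag
cols <- seq_along(vars)      # one colour index per variable

# One line per column of the matrix; the y-axis covers all variables
matplot(sampledf$lag, as.matrix(sampledf[, vars]),
        type = "l", lty = 1, col = cols,
        xlab = "Lag", ylab = "ACF")
legend("topright", legend = vars, lty = 1, col = cols)
```

With around a hundred variables the legend and the default palette will stop being readable, so this is probably most useful on a subset of columns.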
With ggplot2, this works much better with the data in "long" format (one row per lag and variable):
library(ggplot2)
tidyr::pivot_longer(sampledf, -lag) |>
ggplot(aes(lag, value, color = name, group = name)) +
geom_line()
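The same long-format idea may also cover the sampledf1 through sampledf20 case. This is only a sketch under assumptions: I am guessing the data frames can be collected in a named list (dfs and make_df below are hypothetical stand-ins with made-up data), and it additionally needs dplyr for bind_rows().

```r
library(ggplot2)

set.seed(1)  # only so the made-up example data is reproducible

# Hypothetical stand-in for sampledf1, sampledf2, ...: same columns in each
make_df <- function() {
  data.frame(lag = -10:10, var1 = rnorm(21), var2 = rnorm(21), var3 = rnorm(21))
}
dfs <- list(group1 = make_df(), group2 = make_df(), group3 = make_df())

# Stack the data frames; the list names end up in a "group" column
combined <- dplyr::bind_rows(dfs, .id = "group")
long <- tidyr::pivot_longer(combined, -c(group, lag))

# One panel per group, one line per variable
p_by_group <- ggplot(long, aes(lag, value, color = name, group = name)) +
  geom_line() +
  facet_wrap(~ group)

# One panel per variable, one line per group (e.g. var1 across all groups)
p_by_var <- ggplot(long, aes(lag, value, color = group, group = group)) +
  geom_line() +
  facet_wrap(~ name)

print(p_by_group)
print(p_by_var)
```

Because both plots are built from the same long data frame, switching between "variables within a group" and "one variable across groups" is just a matter of swapping which column goes to color and which goes to facet_wrap().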