I am trying to forecast by each group within a data frame (in this case LSOA), for the next 5 years. I have a data set of three columns: LSOA, Date and Value. Similar to this:
LSOA | Date | Value |
---|---|---|
E01026449 | 31/03/2021 | 401 |
E01026449 | 31/03/2022 | 415 |
E01026449 | 31/03/2023 | 441 |
E01026450 | 31/03/2021 | 413 |
E01026450 | 31/03/2022 | 428 |
E01026450 | 31/03/2023 | 440 |
E01026451 | 31/03/2021 | 607 |
E01026451 | 31/03/2022 | 625 |
E01026451 | 31/03/2023 | 633 |
I have tried several nested-list solutions, none of which work: they just fit the existing year values, and I am not sure where to put the predict and h= to get the next x results.
My completely broken code below:
datamodel<-split(data[, -1], data$LSOA)
ld <- lapply(datamodel, function(x) {ts(c(t(x[,-2])),start = c(2010,3,31), frequency = 1)})
lest<-lapply(ld, function(x){holt(x)})
lts<- lapply(lest, function(x){predict(x, newdata=1)})
lts <- lapply(ld, holt, model = "nZZ")
I know I need to:
1.) Group by LSOA
2.) Develop a model for each group
3.) Apply model to prediction for the group
So ideally I would be able to predict and append the 31/03/2024 number for each LSOA, or set h to some number of future predictions. But I am missing something silly here.
How can I achieve this all in a dplyr pipe?
You can use the map() function from purrr on the levels of LSOA. It creates a list holding one sub data frame per LSOA. Then, using map() again, you can fit any prediction model to each of those data frames, which returns a list of your models.
library(tidyverse)
data <- tribble(
  ~LSOA,       ~Date,        ~Value,
  "E01026449", "31/03/2021", 401,
  "E01026449", "31/03/2022", 415,
  "E01026449", "31/03/2023", 441,
  "E01026450", "31/03/2021", 413,
  "E01026450", "31/03/2022", 428,
  "E01026450", "31/03/2023", 440,
  "E01026451", "31/03/2021", 607,
  "E01026451", "31/03/2022", 625,
  "E01026451", "31/03/2023", 633
)
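levels(as.factor(data$LSOA)) %>%
  # Build one data frame per LSOA, parsing the dd/mm/yyyy strings into Dates
  map(~ data %>%
        filter(LSOA == .x) %>%
        mutate(Date = as.Date(trimws(Date), format = "%d/%m/%Y"))) %>%
  # Insert the prediction here, for example a linear trend per LSOA.
  # Date must be a Date (not character), otherwise lm() treats it as a factor.
  map(~ lm(Value ~ Date, data = .x))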
I didn't get what you wanted to predict, sorry.
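If the goal is the next h yearly values per LSOA appended to the original rows, the three steps from the question (group, fit, predict) could be sketched in a single dplyr pipe as below. This is only a sketch under some assumptions: it swaps in a plain linear trend with lm() rather than holt(), the helper forecast_group() and the Year column are names made up for illustration, and every forecast date is assumed to fall on 31 March. A different model could go inside the helper instead (for example holt() from the forecast package, which takes an h argument), but that variant is not tested here.

```r
library(dplyr)

data <- tibble::tribble(
  ~LSOA,       ~Date,        ~Value,
  "E01026449", "31/03/2021", 401,
  "E01026449", "31/03/2022", 415,
  "E01026449", "31/03/2023", 441,
  "E01026450", "31/03/2021", 413,
  "E01026450", "31/03/2022", 428,
  "E01026450", "31/03/2023", 440,
  "E01026451", "31/03/2021", 607,
  "E01026451", "31/03/2022", 625,
  "E01026451", "31/03/2023", 633
)

h <- 5  # assumed horizon: number of future years to predict

# Hypothetical helper: fit a linear trend on one group, predict h years ahead
forecast_group <- function(df, h) {
  fit <- lm(Value ~ Year, data = df)
  future <- data.frame(Year = max(df$Year) + seq_len(h))
  tibble::tibble(
    Date  = as.Date(sprintf("%d-03-31", future$Year)),  # assumes 31 March year ends
    Value = as.numeric(predict(fit, newdata = future))
  )
}

# Parse the dd/mm/yyyy strings once and derive a numeric year to regress on
observed <- data %>%
  mutate(
    Date = as.Date(Date, format = "%d/%m/%Y"),
    Year = as.integer(format(Date, "%Y"))
  )

# 1) group by LSOA, 2) fit a model per group, 3) predict for each group
forecasts <- observed %>%
  group_by(LSOA) %>%
  group_modify(~ forecast_group(.x, h)) %>%
  ungroup()

# Append the forecasts to the observed rows
result <- observed %>%
  select(LSOA, Date, Value) %>%
  bind_rows(forecasts) %>%
  arrange(LSOA, Date)
```

With only three observations per LSOA any model is fitted on very little data, so the forecasts should be read as a rough extrapolation. Changing h changes how many future rows each LSOA gets.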