I have a basic "regression with ARIMA errors" model. It has only two coefficients: one for the covariate and one for an AR1 term.
My client wants me to code the model in Excel (groan).
I can't, however, seem to reproduce the fitted values from the coefficients. Please see this reprex. What am I doing wrong? See how fitted
is different from fitted2
. Thank you.
library(tidyverse)
library(forecast)
#> Registered S3 method overwritten by 'quantmod':
#> method from
#> as.zoo.data.frame zoo
dat <- structure(list(y = c(1.78991103434018, 1.49879933964446, 0.971918281919186,
0.937808354585886, 1.04805962874144, 0.951319584533088, 0.816977148299322,
0.919107705606102, 0.904677832125957, 0.993996660003134, 1.09903781760707,
1.13452389281912, 1.20954064355947, 1.21259769951234, 1.35795597184542
), x = c(109.523396102789, 82.4545796349974, 22.0326643499854,
57.0307677821589, 68.9460376594862, 52.5101835657671, 41.7426519049455,
54.9933617116061, 42.9692305014876, 55.5927443079887, 62.0082700546652,
56.9579188098397, 58.7357687813304, 61.9314235502785, 78.0286524773229
)), row.names = c(NA, -15L), class = c("tbl_df", "tbl", "data.frame"
))
model <- auto.arima(
ts(dat$y, frequency = 12),
xreg = as.matrix(dat[, "x"])
)
summary(model)
#> Series: ts(dat$y, frequency = 12)
#> Regression with ARIMA(1,0,0) errors
#>
#> Coefficients:
#> ar1 intercept x
#> 0.8399 0.7652 0.0073
#> s.e. 0.1485 0.1675 0.0012
#>
#> sigma^2 = 0.0101: log likelihood = 14.24
#> AIC=-20.49 AICc=-16.49 BIC=-17.66
#>
#> Training set error measures:
#> ME RMSE MAE MPE MAPE MASE
#> Training set -0.01916312 0.08988014 0.06145025 -2.555815 5.787479 0.1471733
#> ACF1
#> Training set 0.1275706
dat$fitted <- model |>
fitted() |>
as.numeric()
dat$fitted2 <- NA_real_
for (i in 2:nrow(dat)) {
dat$fitted2[i] <-
coef(model)["intercept"] +
coef(model)["ar1"] * dat$y[i - 1] +
coef(model)["x"] * dat$x[i]
}
dat
#> # A tibble: 15 × 4
#> y x fitted fitted2
#> <dbl> <dbl> <dbl> <dbl>
#> 1 1.79 110. 1.67 NA
#> 2 1.50 82.5 1.56 2.87
#> 3 0.972 22.0 1.04 2.19
#> 4 0.938 57.0 1.22 2.00
#> 5 1.05 68.9 1.06 2.06
#> 6 0.951 52.5 0.963 2.03
#> 7 0.817 41.7 0.904 1.87
#> 8 0.919 55.0 0.955 1.85
#> 9 0.905 43.0 0.871 1.85
#> 10 0.994 55.6 1.03 1.93
#> 11 1.10 62.0 1.07 2.05
#> 12 1.13 57.0 1.08 2.11
#> 13 1.21 58.7 1.16 2.15
#> 14 1.21 61.9 1.23 2.24
#> 15 1.36 78.0 1.33 2.36
Created on 2025-07-07 with reprex v2.1.1
You've misunderstood the model. It is a regression with AR(1) errors, not an ARX model. See https://robjhyndman.com/hyndsight/arimax/ for the details.
The model you have fitted can be written as:
𝑦𝑡=𝛽0+𝛽1𝑥+𝜂𝑡
𝜂𝑡=𝜙𝜂𝑡−1+𝜀𝑡
So the fitted values are 𝛽0 + 𝛽1𝑥 + 𝜙𝜂𝑡-1
The following code shows the calculation.
dat |>
mutate(
fitted = fitted(model),
res = residuals(model, type = "regression"),
fitted2 = coef(model)["intercept"] +
coef(model)["x"] * x +
coef(model)["ar1"] * c(NA, res[-15])
)
#> # A tibble: 15 × 5
#> y x fitted res fitted2
#> <dbl> <dbl> <dbl> <dbl> <dbl>
#> 1 1.79 110. 1.67 0.222 NA
#> 2 1.50 82.5 1.56 0.129 1.56
#> 3 0.972 22.0 1.04 0.0452 1.04
#> 4 0.938 57.0 1.22 -0.246 1.22
#> 5 1.05 68.9 1.06 -0.223 1.06
#> 6 0.951 52.5 0.963 -0.199 0.963
#> 7 0.817 41.7 0.904 -0.254 0.904
#> 8 0.919 55.0 0.955 -0.249 0.955
#> 9 0.905 43.0 0.871 -0.176 0.871
#> 10 0.994 55.6 1.03 -0.179 1.03
#> 11 1.10 62.0 1.07 -0.121 1.07
#> 12 1.13 57.0 1.08 -0.0483 1.08
#> 13 1.21 58.7 1.16 0.0137 1.16
#> 14 1.21 61.9 1.23 -0.00670 1.23
#> 15 1.36 78.0 1.33 0.0206 1.33
R
actually uses a state space model with a diffuse prior so it is possible to compute the fitted value for the first time point as well, but the calculation is too complicated to easily reproduce in a few lines of code. So I've left it missing.