r

Inserting outputs from a list into a data frame?


This is a problem I am trying to work on.

Here is what I tried.

First, simulate the data:

library(forecast)
library(ggplot2)
    set.seed(123)


n <- 1000  
random_data <- rnorm(n, mean = 0, sd = 1)
random_ts <- ts(random_data, frequency = 12)  

Next, I fit the model

    fit <- auto.arima(random_ts)

Series: random_ts 
ARIMA(0,0,0)(0,0,1)[12] with zero mean 

Coefficients:
         sma1
      -0.0792
s.e.   0.0305

sigma^2 = 0.9771:  log likelihood = -1406.89
AIC=2817.79   AICc=2817.8   BIC=2827.6

Here is where I am stuck: How do I simulate 100 trajectories from this model and record the average? Do I need arima.sim() for this?

I thought this could be done the same way as forecasting?

enter image description here

Thanks!


My own attempt:

library(forecast)
library(ggplot2)

set.seed(123)


fit <- auto.arima(random_ts)

simulate_forecast <- function(model, h) {
    sim <- simulate(model, nsim = h)
    return(as.numeric(sim))
}

n_sims <- 100
h <- 100
simulations <- replicate(n_sims, simulate_forecast(fit, h))

avg_prediction <- rowMeans(simulations)

time <- seq_len(n + h)
original_data <- c(as.numeric(random_ts), rep(NA, h))
sim_data <- rbind(matrix(NA, nrow = n, ncol = n_sims), simulations)
avg_data <- c(rep(NA, n), avg_prediction)

plot_data <- data.frame(
    time = rep(time, n_sims + 2),
    value = c(original_data, as.vector(sim_data), avg_data),
    type = factor(rep(c("Original", rep("Simulation", n_sims), "Average"), each = n + h))
)

 ggplot(plot_data, aes(x = time, y = value, group = interaction(type, rep(1:(n_sims+2), each = n + h)), color = type)) +
    geom_line(aes(alpha = type)) +
    scale_color_manual(values = c("Original" = "blue", "Simulation" = "red", "Average" = "black")) +
    scale_alpha_manual(values = c("Original" = 1, "Simulation" = 0.02, "Average" = 1)) +  #
    theme_minimal() +
    labs(title = "Original Data, 100 Simulations, and Average Prediction",
         x = "Time",
         y = "Value",
         color = "Type") +
    theme(legend.position = "bottom")

Solution

  • You can use forecast::simulate to simulate the trajectories you want. Each trajectory is a time series. Then, replicate 100 times, compute the row means and coerce to class "ts".

    library(forecast)
    #> Registered S3 method overwritten by 'quantmod':
    #>   method            from
    #>   as.zoo.data.frame zoo
    set.seed(123)
    
    n <- 1000L
    random_data <- rnorm(n, mean = 0, sd = 1)
    random_ts <- ts(random_data, frequency = 12)  
    fit <- auto.arima(random_ts)
    fit
    #> Series: random_ts 
    #> ARIMA(0,0,0)(0,0,1)[12] with zero mean 
    #> 
    #> Coefficients:
    #>          sma1
    #>       -0.0792
    #> s.e.   0.0305
    #> 
    #> sigma^2 = 0.9771:  log likelihood = -1406.89
    #> AIC=2817.79   AICc=2817.8   BIC=2827.6
    
    R <- 100L
    traj <- replicate(R, simulate(fit, nsim = n))
    str(traj)
    #>  num [1:1000, 1:100] -0.8955 -1.1597 -0.0684 -0.0456 -2.5704 ...
    
    mean_traj <- rowMeans(traj) |> ts(frequency = 12L)
    plot(mean_traj)
    

    Created on 2024-09-06 with reprex v2.1.0