
R imputation of missing values using autocorrelation


I'm trying to fill in 2 missing values. My lecturer suggests choosing the values that give the largest autocorrelation, in the following stages:

  1. Find the minimum and maximum of the dataset. Suppose I use the AirPassengers data with two observations removed. The minimum is 104 and the maximum is 622.
  2. For each combination of candidate values for the two missing observations, compute the autocorrelation (ACF) at lag 1. The candidates are the numbers between the minimum and the maximum (104 <= x <= 622).
  3. The imputed values are the pair that gives the largest autocorrelation.
  4. The expected output is a matrix of the lag-1 autocorrelations of the imputed time series.

I'm trying to calculate this in R, but the code I wrote throws an error and I'm not sure how to continue. Here is the code:

AirPassengers[43]<-NA
AirPassengers[100]<-NA
Fun_mv = function(g,h){
  g=104:622
  n=length(g)
  empty_matrix=matrix(nrow = n, ncol = n, dimnames = list(g,g))
  for (i in g){
    for (j in g){
      AirPassengers[43]=i
      AirPassengers[100]=j
      empty_matrix[i,j]=acf(AirPassengers)$acf[2]
    }
  }
}
h=outer(g,g,FUN = Fun_mv);h

Any help is greatly appreciated!



Solution

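  • There is no need for a call to outer; the function's double loop already covers every combination.
    In the question's loop, i and j are the candidate values themselves (104 to 622), so empty_matrix[i, j] runs past the 519 rows and columns of the matrix. Looping over seq_along(g) indexes by position instead.
    Note that the assignment to AirPassengers[43] is now outside the inner loop. Also, acf(., plot = FALSE) saves a lot of time because no plot is drawn.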

    Fun_mv <- function(g, h){
      # rows: candidates for AirPassengers[43]; columns: candidates for AirPassengers[100]
      empty_matrix <- matrix(nrow = length(g), ncol = length(h), dimnames = list(g, h))
      for (i in seq_along(g)){
        AirPassengers[43] <- g[i]
        for (j in seq_along(h)){
          AirPassengers[100] <- h[j]
          empty_matrix[i, j] <- acf(AirPassengers, plot = FALSE)$acf[2]
        }
      }
      empty_matrix
    }
    
    AirPassengers[43] <- NA
    AirPassengers[100] <- NA
    g <- 104:622
    h <- Fun_mv(g, g)
    
    str(h)
    #>  num [1:519, 1:519] 0.871 0.871 0.871 0.871 0.871 ...
    #>  - attr(*, "dimnames")=List of 2
    #>   ..$ : chr [1:519] "104" "105" "106" "107" ...
    #>   ..$ : chr [1:519] "104" "105" "106" "107" ...
    

    Created on 2023-05-13 with reprex v2.0.2
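
  • To get from the matrix to the imputed values (step 3 in the question), one option is which(., arr.ind = TRUE), which returns the row and column of the maximum. The following is only a minimal sketch: the names x, acf_grid, best and imputed are mine, and the coarse grid (step 50) is an assumption to keep the run short. Use 104:622 for the full search.

    ```r
    # Sketch: pick the candidate pair with the largest lag-1 autocorrelation.
    x <- datasets::AirPassengers
    g <- seq(104, 622, by = 50)   # candidates for x[43]  (coarse grid, an assumption)
    h <- seq(104, 622, by = 50)   # candidates for x[100] (coarse grid, an assumption)

    acf_grid <- matrix(NA_real_, nrow = length(g), ncol = length(h),
                       dimnames = list(g, h))
    for (i in seq_along(g)) {
      x[43] <- g[i]
      for (j in seq_along(h)) {
        x[100] <- h[j]
        # $acf[1] is lag 0 (always 1), so $acf[2] is the lag-1 autocorrelation
        acf_grid[i, j] <- acf(x, plot = FALSE)$acf[2]
      }
    }

    # (row, col) position of the maximum; keep the first one if there are ties
    best <- which(acf_grid == max(acf_grid), arr.ind = TRUE)[1, ]
    imputed <- c(x43 = g[best[["row"]]], x100 = h[best[["col"]]])
    imputed
    ```

    The same two lines work on the full 519 x 519 matrix h returned by Fun_mv(g, g) above, with acf_grid replaced by h and both candidate vectors replaced by 104:622.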