r time-series missing-data imputation imputets

Strange behavior of the na_kalman function from the R imputeTS package


I am experimenting with functions from the imputeTS package, which provides several functions for imputing missing values in univariate time series. I tested them and they all work great, except na_kalman: this function modifies the original numeric vector that is passed to it. Below is an example.

# Load packages
library(imputeTS)

# Set the seed
set.seed(123)

# Generate 10 random numbers
dat <- rnorm(10)

# Replace the first 5 numbers with NA
dat[1:5] <- NA

# Check the numbers in dat
dat
 [1]         NA         NA         NA         NA         NA  1.7150650  0.4609162 -1.2650612 -0.6868529
[10] -0.4456620

As you can see, dat is a vector of 10 numbers in which the first 5 are NA.

# Apply the na_kalman function
dat2 <- na_kalman(dat)

# Check the numbers in dat2
dat2
 [1]  1.7150650  1.7150650  1.7150650  1.7150650  1.7150650  1.7150650  0.4609162 -1.2650612 -0.6868529
[10] -0.4456620

# Check the numbers in dat again
dat
 [1]  1.7150650  1.7150650  1.7150650  1.7150650  1.7150650  1.7150650  0.4609162 -1.2650612 -0.6868529
[10] -0.4456620

dat2 shows that na_kalman successfully imputed the NA values. However, the original vector, dat, was changed as well. This is behavior I want to avoid. Is there a way to make na_kalman leave the original vector unchanged?

Note

  1. When I increased the vector length, for example with rnorm(1000), I noticed that all the missing values in dat were replaced by the first non-missing value in the original data. So dat does not simply become a copy of dat2 after calling na_kalman.

  2. I also tested other functions from the imputeTS package, such as na_interpolation, na_locf, and na_mean. They do not show this behavior: dat remains unchanged after running those functions.
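
The observations above can be checked mechanically. The following is a minimal sketch in base R; `true_copy`, `modifies_input`, and `fill_first` are hypothetical helper names for illustration and are not part of imputeTS. It assumes that a serialize/unserialize round trip is an acceptable way to obtain a vector that shares no memory with the original.

```r
# Hypothetical helpers, base R only (these names are not part of imputeTS).

# Return a copy of x that shares no memory with x.
# A plain `y <- x` is not enough: R defers the copy until y is modified from R,
# so compiled code writing in place would change both.
true_copy <- function(x) unserialize(serialize(x, NULL))

# TRUE if calling f(x) changes the caller's x in place.
# Note: if f does mutate its input, x stays mutated after this check.
modifies_input <- function(f, x, ...) {
  snapshot <- true_copy(x)
  invisible(f(x, ...))
  !identical(snapshot, x)
}

# An ordinary R function follows copy-on-modify and leaves its input alone.
fill_first <- function(v) { v[is.na(v)] <- 0; v }
dat <- c(NA, NA, 1.5, 2.5)
modifies_input(fill_first, dat)  # FALSE
```

On an affected imputeTS version, `modifies_input(na_kalman, dat)` would be expected to return TRUE, and calling `na_kalman(true_copy(dat))` should leave `dat` intact, because only the temporary copy can be overwritten. Neither call has been verified against the package here.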


Solution

  • Author of imputeTS package here. Thanks for your e-mail.

    This is indeed not a feature; it is a small bug, which I have now fixed.

    Update: A new version with the fix is now on CRAN as well. The bug is fixed as of version 3.0. If you encounter it, just update the imputeTS package.

    Unfortunately, I uploaded a new package version to CRAN just hours before you wrote to me. Otherwise the fix would already have been included in the 2.1 update. I will release an update that includes the bugfix by the end of the week.

    If you need a fixed version in the meantime, you can install the new version directly from GitHub:

    library(devtools)
    install_github("SteffenMoritz/imputeTS")
    

    For those interested in what the problem was:

    It was a problem in the C++ code I call via Rcpp: I forgot to make a deep copy of an object.
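
To illustrate the kind of bug described in the answer, here is a minimal sketch of how an Rcpp function can overwrite its argument when no deep copy is made. This is not the actual imputeTS code; `fill_na_shared` and `fill_na_cloned` are hypothetical functions written only for this example. It assumes the Rcpp package and a working C++ compiler are available.

```r
library(Rcpp)

# Hypothetical example: x wraps the same memory as the R vector passed in,
# so writing to x changes the caller's vector.
cppFunction('
NumericVector fill_na_shared(NumericVector x) {
  for (int i = 0; i < x.size(); ++i) {
    if (NumericVector::is_na(x[i])) x[i] = 0.0;
  }
  return x;
}')

# Same logic, but Rcpp::clone() makes a deep copy before anything is written.
cppFunction('
NumericVector fill_na_cloned(NumericVector x) {
  NumericVector out = clone(x);
  for (int i = 0; i < out.size(); ++i) {
    if (NumericVector::is_na(out[i])) out[i] = 0.0;
  }
  return out;
}')

a <- c(NA, 1, 2)
res_a <- fill_na_shared(a)
anyNA(a)  # FALSE: the original vector was overwritten in place

b <- c(NA, 1, 2)
res_b <- fill_na_cloned(b)
anyNA(b)  # TRUE: the original vector is untouched
```

The in-place effect only appears when the input is already a double vector. For an integer vector, Rcpp has to convert to `NumericVector` first, which creates a copy as a side effect and hides the problem.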