Tags: r, datetime, time-series, interpolation

Generating a time series at 10 Hz


I would like to interpolate a time series so that the timestamps are spaced exactly 0.1 s apart (10 Hz). A first attempt might look something like this:

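library(tidyverse)
library(lubridate)  # mdy_hms(), round_date(); attached automatically by tidyverse >= 2.0.0
library(zoo)        # na.approx()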

options(digits.secs = 3, pillar.sigfig = 6)

data <- tribble(
  ~timestamp, ~value, 
  "09/12/2024 00:05:35.677", 139.664,
  "09/12/2024 00:05:35.776", 138.706,
  "09/12/2024 00:05:35.876", 143.348,
  "09/12/2024 00:05:35.975", 141.516,
  "09/12/2024 00:05:36.074", 136.731,
  "09/12/2024 00:05:36.174", 138.275,
  "09/12/2024 00:05:36.273", 143.015) %>%
  mutate(timestamp = mdy_hms(timestamp))

start <- min(data$timestamp) %>% round_date("0.1 sec")
end   <- max(data$timestamp) %>% round_date("0.1 sec")

data_10Hz <- data %>%
  complete(timestamp = seq.POSIXt(start, end, by = .100)) %>%
  arrange(timestamp) 

data_10Hz

#   timestamp                 value
#   <dttm>                    <dbl>
# 1 2024-09-12 00:05:35.677 139.664
# 2 2024-09-12 00:05:35.700  NA    
# 3 2024-09-12 00:05:35.776 138.706
# 4 2024-09-12 00:05:35.799  NA    
# 5 2024-09-12 00:05:35.875 143.348
# 6 2024-09-12 00:05:35.900  NA    
# 7 2024-09-12 00:05:35.974 141.516
# 8 2024-09-12 00:05:36.000  NA    
# 9 2024-09-12 00:05:36.073 136.731
#10 2024-09-12 00:05:36.100  NA    
#11 2024-09-12 00:05:36.174 138.275
#12 2024-09-12 00:05:36.200  NA    
#13 2024-09-12 00:05:36.273 143.015

data_10Hz <- data_10Hz  %>%
  mutate(value = na.approx(value)) %>%
  filter(timestamp == round_date(timestamp, "0.1 sec"))

data_10Hz

#   timestamp                 value
#   <dttm>                    <dbl>
# 1 2024-09-12 00:05:35.700 139.185
# 2 2024-09-12 00:05:35.799 141.027
# 3 2024-09-12 00:05:35.900 142.432
# 4 2024-09-12 00:05:36.000 139.123
# 5 2024-09-12 00:05:36.200 140.645

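But the result is not exactly 10 Hz, because of problems with the internal floating-point representation of the timestamps: for example, the 00:05:36.100 row is missing from the final result. The approach is also probably slow for large data sets.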

Is there a more efficient or cleaner way to do this kind of interpolation?

Best wishes Christof


Solution

  • Linear interpolation can be done with the approx function from the stats package, which also works with datetimes.

    One possible issue I see with your code is that you call zoo::na.approx without an x argument. I believe it then infers the spacing according to this passage from the function documentation:

    By default the index associated with object is used for interpolation.
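
As a small illustration of what that default means, here is a sketch using made-up toy values (not the data from the question), assuming the zoo package is installed. For a plain numeric vector the index is simply the element position, so na.approx(y) treats the points as evenly spaced, whereas passing the times as x interpolates along the time axis.

```r
library(zoo)

# Toy data (made up for illustration): three samples at uneven times,
# with a missing value to fill at t = 1.
t <- c(0, 1, 4)
y <- c(10, NA, 18)

# Without x, the NA is filled by row position: halfway between 10 and 18.
by_index <- na.approx(y)

# With x, the NA is filled by time: a quarter of the way from 10 to 18.
by_time <- na.approx(y, x = t)

# stats::approx() on the non-missing points gives the same time-based values.
ok <- !is.na(y)
by_approx <- approx(t[ok], y[ok], xout = t)$y
```

Here by_index is 10, 14, 18, while by_time and by_approx are both 10, 12, 18.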

    It seems likely that my solution would give the same values as zoo::na.approx if you passed the timestamps as the x argument in that call, but it should run faster because it doesn't need the complete() and filter() steps.

    library(tidyverse)
    library(lubridate)  # mdy_hms(), round_date(); attached automatically by tidyverse >= 2.0.0
    
    options(digits.secs = 3, pillar.sigfig = 6)
    
    data <- tribble(
      ~timestamp, ~value, 
      "09/12/2024 00:05:35.677", 139.664,
      "09/12/2024 00:05:35.776", 138.706,
      "09/12/2024 00:05:35.876", 143.348,
      "09/12/2024 00:05:35.975", 141.516,
      "09/12/2024 00:05:36.074", 136.731,
      "09/12/2024 00:05:36.174", 138.275,
      "09/12/2024 00:05:36.273", 143.015) %>%
      mutate(timestamp = mdy_hms(timestamp))
    
    start <- min(data$timestamp) %>% round_date("0.1 sec")
    end   <- max(data$timestamp) %>% round_date("0.1 sec")
    
    data_10_hz <- tibble(
      timestamp = seq(start, end, by = 0.1),
      value = approx(data$timestamp, data$value, xout = timestamp)$y # interpolation with `approx`
    )
    
    data_10_hz
    #> # A tibble: 6 × 2
    #>   timestamp                 value
    #>   <dttm>                    <dbl>
    #> 1 2024-09-12 00:05:35.700 139.441
    #> 2 2024-09-12 00:05:35.799 139.820
    #> 3 2024-09-12 00:05:35.900 142.904
    #> 4 2024-09-12 00:05:36.000 140.308
    #> 5 2024-09-12 00:05:36.100 137.132
    #> 6 2024-09-12 00:05:36.200 139.520
    

    Created on 2024-09-20 with reprex v2.1.0
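
The same idea can be wrapped in a small base R function. This is only a sketch: resample_linear and its hz parameter are hypothetical names, and it assumes the timestamps are in UTC (as mdy_hms returns them) and that the data span at least one grid point. It differs from the code above in two ways. It counts grid ticks as whole numbers, so the number of grid points does not depend on floating-point rounding of a 0.1 s step. It also uses ceiling and floor rather than rounding, since rounding the last timestamp (36.273) would give 36.3, which lies past the data, and approx returns NA outside the observed range by default.

```r
# Hypothetical helper (not from the answer above): resample irregular samples
# onto a regular grid of `hz` points per second by linear interpolation.
resample_linear <- function(time, value, hz = 10) {
  t_num <- as.numeric(time)  # seconds since 1970-01-01 UTC

  # Count grid ticks as whole numbers and divide by hz afterwards.
  # ceiling()/floor() keep every tick inside the observed range, so
  # approx() never has to extrapolate.
  ticks <- seq(ceiling(min(t_num) * hz), floor(max(t_num) * hz))
  grid  <- ticks / hz

  data.frame(
    # Assumption: timestamps are in UTC, as returned by lubridate::mdy_hms().
    timestamp = as.POSIXct(grid, origin = "1970-01-01", tz = "UTC"),
    value     = approx(t_num, value, xout = grid)$y
  )
}

# Usage with the values from the question.
time <- as.POSIXct("2024-09-12 00:05:35", tz = "UTC") +
  c(0.677, 0.776, 0.876, 0.975, 1.074, 1.174, 1.273)
value <- c(139.664, 138.706, 143.348, 141.516, 136.731, 138.275, 143.015)

data_10_hz <- resample_linear(time, value)
```

Even on such a grid, a timestamp may still print as 00:05:35.799, because R truncates rather than rounds fractional seconds when formatting. That is a display effect, so comparisons against the grid are safer with a small tolerance than with ==.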