I would like to interpolate a time series so that the timestamps fall exactly on a 10 Hz grid (one sample every 0.1 s). A first step might be something like this:
library(tidyverse)
library(zoo)
options(digits.secs = 3, pillar.sigfig = 6)
data <- tribble(
~timestamp, ~value,
"09/12/2024 00:05:35.677", 139.664,
"09/12/2024 00:05:35.776", 138.706,
"09/12/2024 00:05:35.876", 143.348,
"09/12/2024 00:05:35.975", 141.516,
"09/12/2024 00:05:36.074", 136.731,
"09/12/2024 00:05:36.174", 138.275,
"09/12/2024 00:05:36.273", 143.015) %>%
mutate(timestamp = mdy_hms(timestamp))
start <- min(data$timestamp) %>% round_date("0.1 sec")
end <- max(data$timestamp) %>% round_date("0.1 sec")
data_10Hz <- data %>%
complete(timestamp = seq.POSIXt(start, end, by = .100)) %>%
arrange(timestamp)
data_10Hz
# timestamp value
# <dttm> <dbl>
# 1 2024-09-12 00:05:35.677 139.664
# 2 2024-09-12 00:05:35.700 NA
# 3 2024-09-12 00:05:35.776 138.706
# 4 2024-09-12 00:05:35.799 NA
# 5 2024-09-12 00:05:35.875 143.348
# 6 2024-09-12 00:05:35.900 NA
# 7 2024-09-12 00:05:35.974 141.516
# 8 2024-09-12 00:05:36.000 NA
# 9 2024-09-12 00:05:36.073 136.731
#10 2024-09-12 00:05:36.100 NA
#11 2024-09-12 00:05:36.174 138.275
#12 2024-09-12 00:05:36.200 NA
#13 2024-09-12 00:05:36.273 143.015
data_10Hz <- data_10Hz %>%
mutate(value = na.approx(value)) %>%
filter(timestamp == round_date(timestamp, "0.1 sec"))
data_10Hz
# timestamp value
# <dttm> <dbl>
# 1 2024-09-12 00:05:35.700 139.185
# 2 2024-09-12 00:05:35.799 141.027
# 3 2024-09-12 00:05:35.900 142.432
# 4 2024-09-12 00:05:36.000 139.123
# 5 2024-09-12 00:05:36.200 140.645
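But the result is not exactly 10 Hz: because of the internal floating-point representation of the timestamps, some grid points fail the equality test in filter() and are dropped (here 00:05:36.100). The approach is also probably slow for large data sets.
Do you know a cleaner, more efficient way to do such interpolations?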
Best wishes Christof
Linear interpolation can be done with the approx() function from the stats package, which is compatible with datetimes.
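A possible issue I see with your code is that you are calling zoo::na.approx() without an x argument. I believe it then infers the spacing by this logic from the function documentation:
"By default the index associated with object is used for interpolation."
For a plain column that index is just the row position, so the points are treated as equally spaced instead of being interpolated on the actual timestamps.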
It seems likely that my solution would give the same values as zoo::na.approx() if you passed an x variable in that function call, but it should run faster, as it doesn't require a complete() and a filter().
library(tidyverse)
options(digits.secs = 3, pillar.sigfig = 6)
data <- tribble(
~timestamp, ~value,
"09/12/2024 00:05:35.677", 139.664,
"09/12/2024 00:05:35.776", 138.706,
"09/12/2024 00:05:35.876", 143.348,
"09/12/2024 00:05:35.975", 141.516,
"09/12/2024 00:05:36.074", 136.731,
"09/12/2024 00:05:36.174", 138.275,
"09/12/2024 00:05:36.273", 143.015) %>%
mutate(timestamp = mdy_hms(timestamp))
start <- min(data$timestamp) %>% round_date("0.1 sec")
end <- max(data$timestamp) %>% round_date("0.1 sec")
data_10_hz <- tibble(
timestamp = seq(start, end, by = 0.1),
value = approx(data$timestamp, data$value, timestamp)$y # interpolation with `approx`
)
data_10_hz
#> # A tibble: 6 × 2
#> timestamp value
#> <dttm> <dbl>
#> 1 2024-09-12 00:05:35.700 139.441
#> 2 2024-09-12 00:05:35.799 139.820
#> 3 2024-09-12 00:05:35.900 142.904
#> 4 2024-09-12 00:05:36.000 140.308
#> 5 2024-09-12 00:05:36.100 137.132
#> 6 2024-09-12 00:05:36.200 139.520
Created on 2024-09-20 with reprex v2.1.0
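As a rough check of the two points above, here is a minimal base R sketch (not part of the original reprex). It builds the grid from whole tenths of a second, so that no grid point depends on repeatedly adding 0.1, and then, if zoo happens to be installed, compares approx() with na.approx() called with an explicit x. The names ts_chr, t_num, tenths, grid, via_approx and via_zoo are made up for illustration, and the UTC time zone is an assumption.

```r
# Sample data from the question, parsed with base R so the sketch is self-contained
ts_chr <- c("09/12/2024 00:05:35.677", "09/12/2024 00:05:35.776",
            "09/12/2024 00:05:35.876", "09/12/2024 00:05:35.975",
            "09/12/2024 00:05:36.074", "09/12/2024 00:05:36.174",
            "09/12/2024 00:05:36.273")
value <- c(139.664, 138.706, 143.348, 141.516, 136.731, 138.275, 143.015)

# Time zone is an assumption; %OS reads the fractional seconds
timestamp <- as.POSIXct(ts_chr, format = "%m/%d/%Y %H:%M:%OS", tz = "UTC")
t_num <- as.numeric(timestamp)  # seconds since the epoch

# Count in whole tenths of a second inside the observed range,
# then convert back to seconds in a single division
tenths <- seq(ceiling(min(t_num) * 10), floor(max(t_num) * 10))
grid <- tenths / 10

# Linear interpolation on the actual time axis
via_approx <- approx(t_num, value, xout = grid)$y

data_10_hz <- data.frame(
  timestamp = as.POSIXct(grid, origin = "1970-01-01", tz = "UTC"),
  value = via_approx
)
print(data_10_hz)

# Optional comparison with zoo::na.approx() using an explicit x
if (requireNamespace("zoo", quietly = TRUE)) {
  all_t <- c(t_num, grid)                          # observed times plus grid times
  all_v <- c(value, rep(NA_real_, length(grid)))   # NA placeholders at grid times
  ord <- order(all_t)
  filled <- zoo::na.approx(all_v[ord], x = all_t[ord])
  via_zoo <- filled[is.na(all_v[ord])]             # keep only the grid positions
  print(all.equal(via_zoo, via_approx))
}
```

Restricting the grid with ceiling() and floor() keeps every grid point inside the observed range, so approx() does not return NA at the ends. Keeping the integer tenths column around also gives an exact key for later joins or filters, instead of comparing fractional seconds with ==.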