Assuming I have the following xts object with duplicated time information:
library(xts)
x <- xts(1:5,
         c("2024-04-19", "2024-04-19", "2024-04-20", "2024-04-21", "2024-04-21") |> as.Date())
x
#> [,1]
#> 2024-04-19 1
#> 2024-04-19 2
#> 2024-04-20 3
#> 2024-04-21 4
#> 2024-04-21 5
Currently I'm simply discarding duplicate entries to "clean" the object in a rather naive way for further use/analysis:
ind <- zoo::index(x) |> duplicated()
x[!ind, ]
#> [,1]
#> 2024-04-19 1
#> 2024-04-20 3
#> 2024-04-21 4
I would like to extend this to a more sophisticated approach (at least from my point of view), where I can choose a common aggregation function to be applied to the duplicated indices, returning an object of class xts, e.g.
xts_aggr_duplicates(x, "mean")
#> [,1]
#> 2024-04-19 1.5
#> 2024-04-20 3
#> 2024-04-21 4.5
xts_aggr_duplicates(x, "sum")
#> [,1]
#> 2024-04-19 3
#> 2024-04-20 3
#> 2024-04-21 9
My idea was to disassemble the complete object, aggregate where necessary, and rbind again... But I suspect this would be pretty inefficient for large objects. Any ideas?
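For reference, the disassemble/aggregate/rebuild idea would look roughly like this (a sketch for the single-column case; the tapply grouping is just one way to do it):

```r
library(xts)

# Naive sketch of the split/aggregate/rebuild idea (single-column case):
# group the values by index, aggregate each group, then rebuild the xts.
agg <- tapply(zoo::coredata(x)[, 1], zoo::index(x), mean)
xts(matrix(agg), as.Date(names(agg)))
```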
Use aggregate.zoo. In the code below, replace mean with whatever function you prefer.
library(xts)
aggregate(x, c, mean) |> as.xts()
## [,1]
## 2024-04-19 1.5
## 2024-04-20 3.0
## 2024-04-21 4.5
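If you want this packaged as the xts_aggr_duplicates() asked about in the question, a minimal sketch built on aggregate.zoo could be (the helper name comes from the question, not from xts):

```r
library(xts)

# Sketch of the xts_aggr_duplicates() helper from the question,
# built on aggregate.zoo; FUN may be a function or its name as a string.
xts_aggr_duplicates <- function(x, FUN) {
  FUN <- match.fun(FUN)
  # group by the (duplicated) index values, aggregate, convert back to xts
  as.xts(aggregate(x, zoo::index(x), FUN))
}

xts_aggr_duplicates(x, "mean")  # same result as aggregate(x, c, mean)
xts_aggr_duplicates(x, sum)
```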
If the way you got x in the first place was reading it in from a file, then use read.zoo.
write.zoo(x, "myfile.dat") # create test file
read.zoo("myfile.dat", aggregate = mean) |> as.xts()
## [,1]
## 2024-04-19 1.5
## 2024-04-20 3.0
## 2024-04-21 4.5
Input in reproducible form.
library(xts)
x <- xts(1:5,
         as.Date(c("2024-04-19", "2024-04-19", "2024-04-20", "2024-04-21", "2024-04-21")))