I'm struggling to regularize a time series using the tsibble package. The documentation indicates that this can be done using index_by() + summarise(), but I'm clearly missing some details. Here's what I've tried:
library(tidyverse)
library(lubridate)
library(tsibble)
# example data set
date <- ymd(c("1976-05-18", "1976-05-19", "1976-05-24", "1976-06-01"))
fish <- c(203, 282, 301, 89)
volume <- c(210749, 287555, 378965, 308935)
n <- c(5, 7, 10, 8)
tbl <- tibble(date, fish, volume, n)
tsbl <- tsibble(tbl, index = date, regular = FALSE)
# regularize the tsibble (i.e. the time series)
tsbl %>%
  index_by(date, unit = "day") %>% # unit value "day" is intuitive but incorrect?
  mutate(week = isoweek(date)) %>% # add (numeric) week column
  summarise(date = date,
            fish = sum(fish),
            volume = sum(volume),
            n = sum(n),
            cpue = fish / volume) # calculate catch per unit effort
TIA!
With so little information provided about what you are actually trying to do, I will have to guess.
Perhaps you want daily data with each day explicitly included. In that case, do this:
library(tidyverse)
library(lubridate)
library(tsibble)
# example data set
date <- ymd(c("1976-05-18", "1976-05-19", "1976-05-24", "1976-06-01"))
fish <- c(203, 282, 301, 89)
volume <- c(210749, 287555, 378965, 308935)
n <- c(5, 7, 10, 8)
tbl <- tibble(date, fish, volume, n)
tsbl <- tsibble(tbl, index = date, regular = TRUE) %>%
  fill_gaps()
tsbl
#> # A tsibble: 15 x 4 [1D]
#>    date        fish volume     n
#>    <date>     <dbl>  <dbl> <dbl>
#>  1 1976-05-18   203 210749     5
#>  2 1976-05-19   282 287555     7
#>  3 1976-05-20    NA     NA    NA
#>  4 1976-05-21    NA     NA    NA
#>  5 1976-05-22    NA     NA    NA
#>  6 1976-05-23    NA     NA    NA
#>  7 1976-05-24   301 378965    10
#>  8 1976-05-25    NA     NA    NA
#>  9 1976-05-26    NA     NA    NA
#> 10 1976-05-27    NA     NA    NA
#> 11 1976-05-28    NA     NA    NA
#> 12 1976-05-29    NA     NA    NA
#> 13 1976-05-30    NA     NA    NA
#> 14 1976-05-31    NA     NA    NA
#> 15 1976-06-01    89 308935     8
Created on 2022-05-20 by the reprex package (v2.0.1)
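If it helps to see where the implicit gaps are before filling them, tsibble's has_gaps() and count_gaps() report them. The following is only a sketch on the same example data; the object names daily, gaps and filled are mine, not part of the question.

```r
library(dplyr)
library(lubridate)
library(tsibble)

# Same example data as above
tbl <- tibble(
  date = ymd(c("1976-05-18", "1976-05-19", "1976-05-24", "1976-06-01")),
  fish = c(203, 282, 301, 89),
  volume = c(210749, 287555, 378965, 308935),
  n = c(5, 7, 10, 8)
)

# regular = TRUE is the default, so the daily interval is detected
# from the spacing of the dates
daily <- as_tsibble(tbl, index = date)

has_gaps(daily)            # is any date missing between the first and last observation?
gaps <- count_gaps(daily)  # one row per run of missing dates, with its length in .n
gaps

filled <- fill_gaps(daily) # make the missing dates explicit as NA rows
```

By default fill_gaps() inserts NA rows. It also accepts name-value pairs such as fill_gaps(daily, fish = 0), but whether a zero is a sensible stand-in for a day without sampling depends on your data.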
I'm not sure what you are trying to achieve with the summarize, but perhaps you want to create weekly data from these daily data. In that case, do this:
tsbl %>%
  mutate(week = isoweek(date)) %>% # add (numeric) week column
  index_by(week) %>%
  summarise(fish = sum(fish, na.rm = TRUE),
            volume = sum(volume, na.rm = TRUE),
            n = sum(n, na.rm = TRUE),
            cpue = fish / volume) # calculate catch per unit effort
#> # A tsibble: 3 x 5 [1]
#>    week  fish volume     n     cpue
#>   <dbl> <dbl>  <dbl> <dbl>    <dbl>
#> 1    21   485 498304    12 0.000973
#> 2    22   301 378965    10 0.000794
#> 3    23    89 308935     8 0.000288
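One caveat with isoweek(): it returns a bare week number, so the index loses the year and weeks from different years would be pooled if the data spanned more than one. A possible alternative, sketched here on the same example data, is to index by tsibble's yearweek(), which keeps the year and gives a proper weekly interval. The names daily and weekly are mine, and I am assuming ISO weeks (Monday start, the yearweek() default) are what you want.

```r
library(dplyr)
library(lubridate)
library(tsibble)

# Same example data as above
tbl <- tibble(
  date = ymd(c("1976-05-18", "1976-05-19", "1976-05-24", "1976-06-01")),
  fish = c(203, 282, 301, 89),
  volume = c(210749, 287555, 378965, 308935),
  n = c(5, 7, 10, 8)
)
daily <- as_tsibble(tbl, index = date)

weekly <- daily %>%
  index_by(week = ~ yearweek(.)) %>%   # group the date index into year-weeks
  summarise(
    fish = sum(fish, na.rm = TRUE),
    volume = sum(volume, na.rm = TRUE),
    n = sum(n, na.rm = TRUE)
  ) %>%
  mutate(cpue = fish / volume)         # catch per unit effort

weekly
```

The weekly totals should match the isoweek() version, since both use ISO weeks; only the type of the index differs. There is no need to call fill_gaps() first here, because the sums are taken over whatever observations fall in each week.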