raggregationsummarizetidyquant

Convert Daily Data into Weekly Data and summarize multiple columns in R


I want to change the following data set :

date          A   B
01/01/2018  391 585
02/01/2018  420 595
03/01/2018  455 642
04/01/2018  469 654
05/01/2018  611 900
06/01/2018  449 640
07/01/2018  335 522
08/01/2018  726 955
09/01/2018  676 938
10/01/2018  508 740
11/01/2018  562 778
12/01/2018  561 761
13/01/2018  426 609
14/01/2018  334 508

My desired output is as follows:

date           A       B
07/01/2018  3130    4538
14/01/2018  3793    5289

In which, the numbers of Columns A and B are summed over seven days of weeks. Indeed, I want to convert daily data to weekly data. I found two solutions on the Stackoverflow website. One solution would be using library(tidyquant), and following code

library(tidyquant)
newfd<-df %>%
  tq_transmute(select     = A,
               mutate_fun = apply.weekly,
               FUN        = sum)

The code produces weekly data for Column A while I need all columns. (I have many columns). I also used the following code. But, I don’t know how I can develop the code for all columns.

library(slider)   
slide_period_dfr(.x = califo, .i=as.Date(califo$date), 
                 .period = "week", 
                 .f = ~data.frame(week_ending = tail(.x$ date,1),
                                  week_freq = sum(.x$A)),
                 .origin = as.Date("2018-01-01"))

Solution

  • You can compute an index id = 0:(nrow(df) - 1), once the series is arranged by date, and use that to define to which period (week) belongs each date week = id %/% 7. Date linked to each week is chosen in date = max(date) to be the last date of the week. Other options are possible.

    library(dplyr)
    library(lubridate)
    
    df <- tribble(~date, ~A, ~B,
    "01/01/2018", 391, 585,
    "02/01/2018", 420, 595,
    "03/01/2018", 455, 642,
    "04/01/2018", 469, 654,
    "05/01/2018", 611, 900,
    "06/01/2018", 449, 640,
    "07/01/2018", 335, 522,
    "08/01/2018", 726, 955,
    "09/01/2018", 676, 938,
    "10/01/2018", 508, 740,
    "11/01/2018", 562, 778,
    "12/01/2018", 561, 761,
    "13/01/2018", 426, 609,
    "14/01/2018", 334, 508)
    
    df %>%
      mutate(date = dmy(date)) %>% 
      arrange(date) %>% 
      mutate(id = 0:(nrow(df) - 1), week = id %/% 7) %>%
      group_by(week) %>% 
      summarize(date = max(date), across(A:B, sum))
    
    #> # A tibble: 2 x 4
    #>    week date           A     B
    #>   <dbl> <date>     <dbl> <dbl>
    #> 1     0 2018-01-07  3130  4538
    #> 2     1 2018-01-14  3793  5289
    

    Created on 2021-06-05 by the reprex package (v0.3.0)