Tags: r, loops, dataframe, padr

How can I use the pad() function (from the padr package) on multiple data frames?


I have 24 files (one for each hour of the day; HR_NBR = hour number) and I need to pad the missing dates in each of them.

AS-IS data:

CLNDR_DT    HR_NBR  QTY
01/07/2016  1       6
03/07/2016  1       10

TO-BE data:

CLNDR_DT    HR_NBR  QTY
01/07/2016  1       6
02/07/2016  NA      NA
03/07/2016  1       10

I can use the pad function for each file, like this:

chil_bev1_1 = pad (chil_bev1_1, interval= "day") # Hour1
chil_bev1_2 = pad (chil_bev1_2, interval= "day") # Hour2

and so on.

This works, but I would like to use a loop or lapply() instead of repeating the call 24 times.

I tried several variations of the following two snippets, but none of them worked:

df1 = data.frame (chil_bev1_1)
df2 = data.frame (chil_bev1_2)
dflist = c("df1","df2")

CODE1:

x = function(df) {df %>% pad}
allpad = lapply(dflist,x)

CODE2:

x = function(df) {pad (df)}

allpad = lapply(dflist,x)

The error is

"x must be a data frame".

I'm new to R. Any help would be greatly appreciated.

Thank you.


Solution

  • I managed to figure it out, using complete() from the tidyr package instead of pad(). Here's the answer:

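    library(dplyr)  # provides %>%
    library(tidyr)  # provides complete()
    hour_list = list(chil_bev1_1, chil_bev1_2)  # a list of the data frames themselves, not their names
    chil_bev1n = lapply(hour_list, function(x) {x %>% complete(CLNDR_DT = seq.Date(min(CLNDR_DT), max(CLNDR_DT), by = "day"), fill = list(QTY = 0))})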
    

    Notes:

    The fill = list(QTY = 0) argument replaces the NAs in QTY with 0s. Columns not listed in fill (such as HR_NBR) stay NA on the added rows.

    CLNDR_DT is the name of the column that contains the dates. It has to be of class Date for seq.Date() to work.
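
As for why the attempts in the question failed: dflist = c("df1","df2") is a character vector of names, so lapply() hands pad() the string "df1" rather than a data frame, hence the "must be a data frame" error. Below is a minimal sketch of the original pad() approach applied to a real list. It assumes the padr package is installed, the sample values are made up to mirror the AS-IS table, and the dates are read as day/month/year, which is an assumption based on the TO-BE output.

```r
library(padr)  # assumed to be installed; provides pad()

# Hypothetical sample data mirroring the AS-IS table (dates read as day/month/year).
chil_bev1_1 <- data.frame(
  CLNDR_DT = as.Date(c("01/07/2016", "03/07/2016"), format = "%d/%m/%Y"),
  HR_NBR = 1,
  QTY = c(6, 10)
)
chil_bev1_2 <- data.frame(
  CLNDR_DT = as.Date(c("01/07/2016", "04/07/2016"), format = "%d/%m/%Y"),
  HR_NBR = 2,
  QTY = c(3, 7)
)

# A character vector holds names only; its elements are strings, not data frames.
df_names <- c("chil_bev1_1", "chil_bev1_2")

# mget() looks the names up and returns a named list of the actual data frames.
hour_list <- mget(df_names)

# Extra arguments given to lapply() are passed on to pad() for every element.
all_padded <- lapply(hour_list, pad, interval = "day")
```

With 24 objects following the same naming pattern, the names could be generated with paste0("chil_bev1_", 1:24) instead of typing them out. Each element of all_padded is then the padded data frame for one hour, with NA in HR_NBR and QTY on the inserted dates.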