
How to convert hourly data to daily averages for two variables?


I am analysing daily average PM10 data over 3 years. I have already processed my air quality data; however, my temperature and wind speed data are hourly. How can I calculate their daily averages and combine both datasets?

Eg:

date temperature Wind_Speed
1/01/2018 1:00 18.8 1.9
1/01/2018 2:00 18.7 1.5
1/01/2018 3:00 18.6 1.6
1/01/2018 4:00 18.5 1.0
1/01/2018 5:00 18.1 1.2

... all the way until 1/1/2021 00:00 (26304 obs).

I tried

StartDateTime <- dmy_hm("01-01-2018 01:00")
EndDateTime <- dmy_hm("01-01-2021 00:00")
DF <- data.frame(DateTime = c(seq.POSIXt(StartDateTime, EndDateTime, by = "hour"),
                              seq.POSIXt(StartDateTime, EndDateTime, by = "hour")),
                 check.rows = TRUE, fix.empty.names = TRUE)
head(DF)
head(DF)

My PM10 data is a time series (an atomic vector, [1:1096]). Any suggestions on how I can convert the hourly data into daily averages for both temperature and wind speed so that I can combine them with the PM10 data?

data: dput(head(sample_data_1))


Solution

  • I'm assuming your data looks like DF2 (and you use lubridate)

    library(tidyverse)
    library(lubridate)
    StartDateTime <- dmy_hm("01-01-2018 01:00")
    EndDateTime <- dmy_hm("01-01-2021 00:00")
    date_hour_seq <- seq.POSIXt(StartDateTime, EndDateTime, by = "hour")
    DF2 <- data.frame(DateTime = date_hour_seq,
                      temperatur = rnorm(length(date_hour_seq), 10),
                      wind = rnorm(length(date_hour_seq), 1, .1))
    

    Then you can create a new date variable in group_by and calculate the average using summarise:

    DF2 %>% group_by(date = date(DateTime)) %>%
      summarise(temperatur = mean(temperatur), 
                wind = mean(wind))
    #> # A tibble: 1,097 × 3
    #>    date       temperatur  wind
    #>    <date>          <dbl> <dbl>
    #>  1 2018-01-01       9.98 1.03 
    #>  2 2018-01-02       9.85 1.00 
    #>  3 2018-01-03       9.88 1.02 
    #>  4 2018-01-04      10.3  1.02 
    #>  5 2018-01-05       9.83 0.979
    #>  6 2018-01-06      10.1  1.00 
    #>  7 2018-01-07       9.74 0.984
    #>  8 2018-01-08       9.96 0.997
    #>  9 2018-01-09      10.3  1.01 
    #> 10 2018-01-10      10.1  0.987
    #> # … with 1,087 more rows
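Real measurements often contain missing hours, and there may be more than two columns to average. As a rough sketch of the same group_by/summarise idea, the snippet below uses dplyr's across() with na.rm = TRUE. The `hourly` data frame and its values are made up purely for illustration; only the column names follow DF2.

```r
library(dplyr)
library(lubridate)

# Made-up example: two days of hourly data with one missing wind value
hourly <- data.frame(
  DateTime   = seq(ymd_hm("2018-01-01 00:00"), by = "hour", length.out = 48),
  temperatur = c(rep(10, 24), rep(20, 24)),
  wind       = c(rep(1, 24), rep(2, 24))
)
hourly$wind[5] <- NA

# Average every listed column per calendar day, ignoring missing values
daily_avg <- hourly %>%
  group_by(date = as_date(DateTime)) %>%
  summarise(across(c(temperatur, wind), function(x) mean(x, na.rm = TRUE)))

daily_avg
```

Without na.rm = TRUE, a single missing hour would turn that day's average into NA, which may or may not be what you want.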
    

    Created on 2022-06-10 by the reprex package (v2.0.1)
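To combine the daily averages with the PM10 vector, note that grouping by date(DateTime) gives 1,097 days, while the PM10 series in the question has 1096 values. The extra day is 2021-01-01, which contains only the single 00:00 observation. One possible way to line things up, assuming each timestamp marks the end of its hour (so 00:00 belongs to the previous day), is to shift the timestamps back by one hour before grouping. This is only a sketch: `pm10_ts` is a simulated stand-in for the real PM10 series, and the end-of-hour convention is an assumption you should check against your data source.

```r
library(dplyr)
library(lubridate)

# Simulated stand-ins for the real data (assumptions, as with DF2 above)
set.seed(1)
date_hour_seq <- seq(dmy_hm("01-01-2018 01:00"), dmy_hm("01-01-2021 00:00"),
                     by = "hour")
DF2 <- data.frame(DateTime   = date_hour_seq,
                  temperatur = rnorm(length(date_hour_seq), 10),
                  wind       = rnorm(length(date_hour_seq), 1, .1))
pm10_ts <- ts(rnorm(1096, 20, 5))  # hypothetical daily PM10 time series

# Shift by one hour so that the 00:00 reading counts towards the previous day
daily <- DF2 %>%
  group_by(date = as_date(DateTime - hours(1))) %>%
  summarise(temperatur = mean(temperatur),
            wind       = mean(wind),
            n_obs      = n())

# Only attach PM10 if the number of days matches
stopifnot(nrow(daily) == length(pm10_ts))
combined <- daily %>% mutate(pm10 = as.numeric(pm10_ts))

head(combined)
```

The n_obs column is there as a sanity check: with the shift, every day should have 24 observations. Attaching the vector by position only makes sense if the PM10 series covers the same days in the same order; if it comes with its own dates, joining on the date column would be safer.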