I want to create a loop that extracts data from a time period, then moves the time period forward by one day and repeats the analysis. The main problem I'm having is how to do this with a time period rather than just one day. I have introduced a Julian day column to try to make it easier (i.e. now it's just a sequence of numbers rather than dates), but I still can't quite figure it out.
Here is some example data:
Date Nor_MM Julianday
6441 2090-06-01 22.58582 152
6442 2090-06-02 20.43654 153
6443 2090-06-03 17.37954 154
6444 2090-06-04 18.12772 155
6445 2090-06-05 19.53053 156
6446 2090-06-06 23.25154 157
6447 2090-06-07 24.52292 158
6448 2090-06-08 24.83597 159
6449 2090-06-09 24.67915 160
6450 2090-06-10 24.22688 161
structure(list(Date = structure(c(2090-01, 43982, 43983, 43984,
43985, 43986, 43987, 43988, 43989, 43990), class = "Date"), Nor_MM = c(22.58582103,
20.43654256, 17.37954095, 18.12772066, 19.53053131, 23.25153522,
24.52291687, 24.83597434, 24.67915157, 24.22688304), Julianday = c(152,
153, 154, 155, 156, 157, 158, 159, 160, 161)), row.names = 6441:6450, class = "data.frame")
I want the total number of days within a 16-day period which exceed 20 degrees, i.e. the number of days between 2090-06-01 (Julian day 152) and 2090-06-16 (Julian day 167) which exceed 20 degrees. I have calculated this using the code below.
df1<-filter_time(df, time_formula = '2090-06-01' ~ '2090-06-16')
sum(df1$Nor_MM>=20)
The problem is that I want this number for all possible 16-day periods within the data set (of which there are 75). I cannot figure out how to loop something through a time period whereby the period moves by 1 day with each iteration. I need code which completes the above, then moves the time period along by 1 day and completes it for 2090-06-02 to 2090-06-17, repeating this for all periods in the data frame.
Any help with this would be amazing, thank you!
rollapplyr (note the r on the end, which means right-aligned) performs a rolling operation (here sum) using the ith width, i.e. the number of positions to sum over, for the ith data component. The arguments are the data, the widths to sum over and the function, i.e. sum.
findInterval(Date - 16, Date) finds the position of the date 16 days back, or of the highest date less than that if no such date exists. Subtracting that from the current position gives the number of rows to consider.
If you are using tidyverse, optionally replace transform with mutate.
library(zoo)

# width[i] = number of rows whose Date falls within the 16 days ending at row i
transform(df,
  ndays = rollapplyr(data = Nor_MM >= 20,
                     width = seq_along(Date) - findInterval(Date - 16, Date),
                     FUN = sum))
giving
Date Nor_MM Julianday ndays
6441 1975-09-21 22.58582 152 1
6442 2090-06-02 20.43654 153 1
6443 2090-06-03 17.37954 154 1
6444 2090-06-04 18.12772 155 1
6445 2090-06-05 19.53053 156 1
6446 2090-06-06 23.25154 157 2
6447 2090-06-07 24.52292 158 3
6448 2090-06-08 24.83597 159 4
6449 2090-06-09 24.67915 160 5
6450 2090-06-10 24.22688 161 6
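To see where the widths come from, here is a minimal base R sketch of just the findInterval step. It rebuilds the Date column by hand (one early date followed by nine consecutive days, matching the output above); the names pos_back and width are only for illustration.

```r
# Dates matching the output above: one early date, then nine consecutive days
Date <- c(as.Date("1975-09-21"),
          seq(as.Date("2090-06-02"), by = "day", length.out = 9))

# Position of the last date that is 16 or more days before each date
# (0 if there is no such date)
pos_back <- findInterval(as.numeric(Date) - 16, as.numeric(Date))

# Number of rows falling inside the 16 days ending at each row
width <- seq_along(Date) - pos_back

print(data.frame(Date, pos_back, width))
```

Because of the gap after the first date, the second row gets a width of 1 rather than 2, so the early date is not counted in any 2090 window.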
In the sample data there is a gap between the first date and the second date. If your real data has no gaps then it is even easier: the width can be specified as 16, and partial = TRUE can be used to specify that it should use whatever number of elements are available when fewer than 16 are available.
# if no gaps in dates
transform(df, ndays = rollapplyr(Nor_MM >= 20, 16, sum, partial = TRUE))
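If you would rather see the loop written out as described in the question (a window that starts on each day and moves forward by one day), here is a rough base R sketch. It is not the zoo approach above, just an explicit version for cross-checking. The data frame is made up for illustration (20 consecutive days with invented temperatures), and window_days, starts and result are hypothetical names.

```r
# Hypothetical data: 20 consecutive days with invented temperatures
df <- data.frame(
  Date   = seq(as.Date("2090-06-01"), by = "day", length.out = 20),
  Nor_MM = c(22, 21, 17, 18, 19, 23, 24, 25, 24, 24,
             19, 18, 21, 22, 20, 16, 15, 23, 24, 19)
)

window_days <- 16

# Start dates for which a complete 16-day window fits inside the data
starts <- df$Date[df$Date + (window_days - 1) <= max(df$Date)]

# For each start date, count the days in [start, start + 15] at or above 20
counts <- vapply(seq_along(starts), function(i) {
  in_window <- df$Date >= starts[i] & df$Date <= starts[i] + (window_days - 1)
  sum(df$Nor_MM[in_window] >= 20)
}, integer(1))

result <- data.frame(start = starts,
                     end   = starts + (window_days - 1),
                     ndays = counts)
print(result)
```

Since the window is selected by date rather than by row position, this also copes with missing days. For gap-free data the counts should agree with the complete (non-partial) windows from the rolling version, only labelled by the start date of each window instead of its end date.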