I have camera trap data from which I want to remove potentially duplicated animal detections. I am setting the interval to 60 minutes (1 hour), so that a detection of a species at a camera is treated as the same individual if that species was detected at the same camera less than 60 minutes earlier. My data were collected in multiple blocks, with multiple sites within each block.
# Data example:
Block <- c("a", "a", "a", "a", "a", "b", "b", "b", "b", "b")  # two blocks
Site <- c("p1", "p1", "p2", "p2", "p2", "p1", "p1", "p1", "p2", "p2")  # two sites per block
Species <- c("c", "c", "c", "c", "e", "d", "d", "c", "c", "c")
datetime <- c("2021-03-29 05:45:00", "2021-03-29 06:40:00", "2021-03-30 05:45:00",
              "2021-03-30 07:45:00", "2021-03-29 09:45:00", "2021-03-29 05:45:00",
              "2021-03-29 05:55:00", "2021-03-29 08:45:00", "2021-03-29 10:45:00",
              "2021-03-30 10:59:00")
df <- data.frame(Block, Site, Species, datetime)
# What I want for the output:
  Block Site Species            datetime
1     a   p1       c 2021-03-29 05:45:00
2     a   p2       c 2021-03-30 05:45:00
3     a   p2       c 2021-03-30 07:45:00
4     a   p2       e 2021-03-29 09:45:00
5     b   p1       d 2021-03-29 05:45:00
6     b   p1       c 2021-03-29 08:45:00
7     b   p2       c 2021-03-29 10:45:00
8     b   p2       c 2021-03-30 10:59:00
The tricky part, at least for me, is that I need to remove duplicates of Block/Site/Species that appear within 1 hour of one another (I can't just keep the first detection within each hour of the day).
Thank you for your help.
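You can use diff to get the time since the previous detection within each Block/Site/Species group, and then filter by a set interval, intv (in minutes):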
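library(dplyr)  # the .by argument needs dplyr >= 1.1.0

intv <- 60  # minimum gap between independent detections, in minutes

df %>%
  mutate(datetime = as.POSIXct(datetime)) %>%    # parse the strings as date-times
  arrange(Block, Site, Species, datetime) %>%    # sort so diff() compares consecutive detections
  # keep the first row of each group (intv + 1 always passes), plus any row
  # more than intv minutes after the previous detection in that group
  filter(c(intv + 1, diff(datetime, units = "mins")) > intv,
         .by = c(Block, Site, Species))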
  Block Site Species            datetime
1     a   p1       c 2021-03-29 05:45:00
2     a   p2       c 2021-03-30 05:45:00
3     a   p2       c 2021-03-30 07:45:00
4     a   p2       e 2021-03-29 09:45:00
5     b   p1       c 2021-03-29 08:45:00
6     b   p1       d 2021-03-29 05:45:00
7     b   p2       c 2021-03-29 10:45:00
8     b   p2       c 2021-03-30 10:59:00
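
One thing to be aware of: diff compares each detection with the one immediately before it, so a run of detections that are each less than intv apart collapses to the first one, even if the whole run lasts several hours. If you would rather compare each detection with the last one you kept, something along these lines might work. This is only a sketch and not part of the code above; keep_independent and the chain data are made-up names for illustration, and the times are parsed as UTC purely to keep the example reproducible.

```r
library(dplyr)  # the .by argument needs dplyr >= 1.1.0

intv <- 60  # minutes

# Hypothetical helper: TRUE for detections more than `intv` minutes after the
# last *kept* detection. `times` must be POSIXct, sorted ascending.
keep_independent <- function(times, intv = 60) {
  mins <- as.numeric(times) / 60   # minutes since the epoch
  keep <- logical(length(mins))
  last_kept <- -Inf                # so the first detection is always kept
  for (i in seq_along(mins)) {
    if (mins[i] - last_kept > intv) {
      keep[i] <- TRUE
      last_kept <- mins[i]
    }
  }
  keep
}

# Made-up chain: each detection is 40 minutes after the previous one.
chain <- data.frame(
  Block = "a", Site = "p1", Species = "c",
  datetime = as.POSIXct(
    c("2021-03-29 05:00:00", "2021-03-29 05:40:00",
      "2021-03-29 06:20:00", "2021-03-29 07:00:00"),
    tz = "UTC"
  )
)

kept <- chain %>%
  arrange(Block, Site, Species, datetime) %>%
  filter(keep_independent(datetime, intv), .by = c(Block, Site, Species))

kept  # keeps the 05:00 and 06:20 detections
```

With the diff version, the same chain would keep only the 05:00 detection, because every consecutive gap is 40 minutes. On the example data in the question both approaches give the same eight rows, so which one you want depends on how you define an independent detection.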