I have two .csv files called 'tag.csv' and 'wind.csv' that I would like to merge based on the 'Date' column. However, as you can see from the data below the times do not match exactly.
tag.csv
Date
13/12/2014 05:11
13/12/2014 05:43
13/12/2014 06:34
wind.csv
Date
13/12/2014 05:00
13/12/2014 06:00
13/12/2014 07:00
I am using the following simple script.
tag<- read.csv("tag.csv")
wind<- read.csv("wind.csv")
myfulldata = merge(tag, wind)
Is there a command I could add so that the data will merge with the closest time? For example, 13/12/2014 05:11 will merge with 13/12/2014 05:00
data.table
package has its own merging method and an option called roll
that allows closest matches. e.g.
library(data.table)
dt1 <- data.table(a = c(1,2,3,4), b=1:4)
dt2 <- data.table(a = c(1.6,2.1,3.2,4.9), c=1:4)
setkeyv(dt1, "a")
setkeyv(dt2, "a")
dt1[dt2, roll="nearest"]
See Join R data.tables where key values are not exactly equal--combine rows with closest times for another example.
To be robust though, I would probably instead do some datetime manipulation to force them to both be to the nearest hour, and then do a nearest join if all else fails
edit: this method won't work on data.frame
objects, if you haven't used data.table
before maybe stay simple and round the hours