I've got a GPS dataset with about 5600 rows of coordinates from 5 GPS devices ('nodes') over several days and I want to reduce the number of GPS points to just one point per hour. Because the number of points per hour fluctuates, a simple for-loop is not possible. A simplified structure of the table would be this:
ID node easting northing year month day hour minute time
The column 'time' is of class "POSIXlt" "POSIXt".
Trying my first approach, a multiple nested for-loop, I learned about the Second circle of Inferno.
Does anyone have an idea how to reduce multiple rows per hour to a single row per hour, separately for each device, in R?
Assuming that the year, month, day, and hour columns hold the components of the time column, you can group by node and by those columns and average the coordinates within each group:
# Generate example data: 5 nodes, 2 hours each, 2 points per hour (20 rows)
md <- data.frame(
  node = rep(1:5, each = 4)
  , easting = sample(1:10, size = 20, replace = TRUE)
  , northing = sample(1:10, size = 20, replace = TRUE)
  , year = 2017
  , month = "June"
  , day = 6
  , hour = rep(1:2, each = 2, times = 5)
  , minute = NA
  , time = NA
)
# Solution: one row per node and hour, holding the mean position
library(dplyr)
md %>%
  group_by(node, year, month, day, hour) %>%
  summarize(
    easting = mean(easting),
    northing = mean(northing)
  )
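If the separate year/month/day/hour columns are not reliable, the hour can also be derived from the time column itself. The following is a minimal base-R sketch, not a definitive implementation. It assumes that time can be converted to POSIXct with as.POSIXct() (POSIXlt columns are awkward to group on), and the data frame gps, the column hour_start, and the sample values are made up for illustration. It shows two possible reductions: the mean position per hour, and simply keeping the first fix of each hour.

```r
# Hypothetical sample data: two nodes with irregular fixes
gps <- data.frame(
  node     = c(1, 1, 1, 2, 2),
  easting  = c(10, 20, 30, 40, 60),
  northing = c(1, 3, 5, 7, 9),
  time     = as.POSIXct(
    c("2017-06-06 01:05:00", "2017-06-06 01:40:00", "2017-06-06 02:10:00",
      "2017-06-06 01:15:00", "2017-06-06 01:55:00"),
    tz = "UTC"
  )
)

# If 'time' is POSIXlt (as in the question), convert it first;
# this is a no-op for a column that is already POSIXct
gps$time <- as.POSIXct(gps$time)

# Truncate each timestamp to the start of its hour.
# trunc() returns POSIXlt, so convert back to POSIXct for grouping.
gps$hour_start <- as.POSIXct(trunc(gps$time, units = "hours"))

# Option 1: mean position per node and hour
hourly_mean <- aggregate(
  cbind(easting, northing) ~ node + hour_start,
  data = gps,
  FUN = mean
)

# Option 2: keep only the first fix per node and hour
gps <- gps[order(gps$node, gps$time), ]
key <- paste(gps$node, format(gps$hour_start, "%Y-%m-%d %H"))
hourly_first <- gps[!duplicated(key), ]
```

Which option fits depends on what "one point per hour" should mean: the mean smooths out GPS jitter but yields positions that were never actually recorded, while the first fix keeps real observations.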