I would like to merge two data sets that contain vectors in columns in R. Here I provide example data for my problem:
df1<- data.frame(place = c("a", "b", "c"))
df1$stops <- list(c(1,2,3), c(2,3,4), c(3,4,5))
df2 <- data.frame(stops = c(1,2,3,4,5))
df2$trips <- list(c(343,353), c(492, 4332), c(375, 346), c(110, 109), c(267, 268))
The trips should be matched to a place if their stop is on the 'stops' column on list1. My desired result would be a table with 3 rows and 3 columns (place, stops, trips). Each row represents a place and all the stops and trips associated to that place. Row 1 would look like:
place stops trips
1 "a" 1, 2, 3 343, 353, 492, 4332, 375, 346
Unfortunately, merging/joining doesn't work directly on nested lists like this, so we need to unnest and join, optionally re-nesting.
library(dplyr)
library(tidyr) # unnest
df1 %>%
unnest(stops) %>%
left_join(df2, by = "stops") %>%
summarize(stops = list(stops), trips = list(unlist(trips)), .by = "place") %>%
as.data.frame()
# place stops trips
# 1 a 1, 2, 3 343, 353, 492, 4332, 375, 346
# 2 b 2, 3, 4 492, 4332, 375, 346, 110, 109
# 3 c 3, 4, 5 375, 346, 110, 109, 267, 268
The trailing as.data.frame()
is not required, I included it here merely to show the contents a little more transparently, vice dplyr's tendency to "pretty-ify" things (and hide the nested contents).