index/match equivalent in R for merging data frames in a tidy way

I have two data frames in R that share SOME columns and SOME rows in an arbitrary order.

I now want to create a third data frame that has the columns and rows of data frame 1 merged with the corresponding values from data frame 2:

df1 <- data.frame(name = c("a", "b", "c"),
                  col1 = c(NA, NA, NA),
                  col2 = c(NA, NA, NA),
                  col3 = c(NA, NA, NA))

df2 <- data.frame(name = c("d", "b", "a"),
                  col2 = c(1, 2, 3),
                  col1 = c(4, 5, 6),
                  col4 = c(NA, NA, NA))

My expected output is:

  name col1 col2 col3
1    a    6    3   NA
2    b    5    2   NA
3    c   NA   NA   NA

I know how I could achieve this step by step by first creating a df with common cols and rows from both data frames, sort accordingly etc.

There's also the (my currently) preferred option of first pivot_longer both data frames, then do a left_join, then pivot_wider.

However, I'm wondering if there is a more direct (and tidy) way similar to Excel's combination of index & match where I would match columns and rows and write the df2 values directly into the respective df1 cell?

Solution

That's pretty much what dplyr::rows_patch() does, with few safeguards (or limitations) :
- all columns in y(df2) must exist in x(df1)
- types of matching x & y columns should be compatible when casting from y to x.

For the former there's nothing that select(df2, any_of(names(df1)) can't fix. But regarding types, in the included example all-NA columns of df1 are logical (as class(NA) is "logical") while matching df2 columns are numerical and that would trigger a casting error:

Can't convert from y$col1 <double> to x$col1 <logical> due to loss of precision.

Which may or may not be relevant with your real data.

library(dplyr)

# change types for all-NA columns
glimpse(df1)
#> Rows: 3
#> Columns: 4
#> $ name <chr> "a", "b", "c"
#> $ col1 <lgl> NA, NA, NA
#> $ col2 <lgl> NA, NA, NA
#> $ col3 <lgl> NA, NA, NA
df1 <- mutate(df1, across(where( \(x) all(is.na(x) )), as.numeric)) 
glimpse(df1)
#> Rows: 3
#> Columns: 4
#> $ name <chr> "a", "b", "c"
#> $ col1 <dbl> NA, NA, NA
#> $ col2 <dbl> NA, NA, NA
#> $ col3 <dbl> NA, NA, NA

rows_patch(
  x = df1, 
  y = select(df2, any_of(names(df1))), 
  by = "name",
  unmatched = "ignore"
)
#>   name col1 col2 col3
#> 1    a    6    3   NA
#> 2    b    5    2   NA
#> 3    c   NA   NA   NA

If you do construct your target frames yourself, you could perhaps start with something like this instead:

df1 <- data.frame(name = c("a", "b", "c"), 
                  col1 = NA_real_, col2 = NA_real_, col3 = NA_real_)