r dplyr tidyverse purrr

Rename variables via lookup table in R


I have a dataframe in a certain order:

df <- 
  data.frame(
    foo = 1:3,
    bar = LETTERS[1:3],
    baz = rnorm(3)
  )

df

  foo bar         baz
1   1   A  0.41474174
2   2   B -0.08416768
3   3   C -0.27931232

In another dataframe, I have the old variable names matched to some new names, but in a different order:

variable_match <- 
  data.frame(
    old = names(df)[c(2, 3, 1)], 
    new = LETTERS[1:3]
  )

variable_match
  old new
1 bar   A
2 baz   B
3 foo   C

My question is: how do I rename the variables in the original dataframe by looking up the corresponding value in the second dataframe? I'm ideally looking for a tidyverse solution. I have tried variations of:

library(tidyverse)

df %>% rename_at(variable_match$old, funs(variable_match$new))

assuming that rename_at would be the right approach, but this doesn't work. I'm wondering if purrr::map_* would be the right approach, but can't see how. Many thanks for your suggestions.


Solution

  • Here is a one-line base solution:

    names(df) = variable_match$new[match(names(df), variable_match$old)]
    

    It may not be "ideal" for you (it doesn't need the tidyverse to work), but it is simple and doesn't require loading any extra packages, instead relying on common built-in functions.
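
    One caveat with the match() approach: any column of df that has no entry in variable_match$old would get NA as its new name. A minimal sketch of a guarded variant follows, assuming a hypothetical extra column qux (not in the original question) that is absent from the lookup table; it keeps the existing name wherever no match is found:

    ```r
    # Example data from the question, plus a hypothetical column `qux`
    # that does not appear in the lookup table.
    df <- data.frame(foo = 1:3, bar = LETTERS[1:3], baz = rnorm(3), qux = 4:6)

    variable_match <- data.frame(
      old = c("bar", "baz", "foo"),
      new = LETTERS[1:3],
      stringsAsFactors = FALSE  # keep names as character, also on R < 4.0
    )

    # Position of each current name in the lookup table (NA if absent).
    idx <- match(names(df), variable_match$old)

    # Use the new name where a match exists, otherwise keep the old one.
    names(df) <- ifelse(is.na(idx), names(df), variable_match$new[idx])

    names(df)
    # [1] "C"   "A"   "B"   "qux"
    ```

    If every column is guaranteed to appear in the lookup table, the one-liner above is enough and the guard can be dropped.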


    As noted in comments, if you prefer a nested statement with pipes (aren't pipes intended to improve readability and prevent nesting?) the simple line above is equivalent to

    library(purrr)
    library(dplyr)
    library(magrittr)
    df = df %>%
        set_names(
            # look up the new names, reordered to match the columns of df
            variable_match %>%
            pull(new) %>%
            extract(
                names(df) %>% 
                match(variable_match$old)
            )
        )
    

    I'm a big fan of pipes and dplyr - I use them consistently when they make things simpler and more readable. In this case they take a simple line and turn it into a programming puzzle, both in how to write it and how to read it.
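
    For comparison, a flatter tidyverse version may be possible. This is only a sketch, assuming dplyr 1.0.0 or later, where rename() accepts a named character vector (names are the new names, values are the old names) wrapped in all_of(). The names lookup and renamed are introduced here for illustration:

    ```r
    library(dplyr)

    df <- data.frame(foo = 1:3, bar = LETTERS[1:3], baz = rnorm(3))

    variable_match <- data.frame(
      old = c("bar", "baz", "foo"),
      new = LETTERS[1:3],
      stringsAsFactors = FALSE  # keep names as character, also on R < 4.0
    )

    # Named character vector: names are the new names, values the old ones.
    lookup <- setNames(variable_match$old, variable_match$new)

    # all_of() errors if an old name is missing from df;
    # any_of() would skip missing names instead.
    renamed <- df %>% rename(all_of(lookup))

    names(renamed)
    # [1] "C" "A" "B"
    ```

    rename() leaves the column order untouched, so only the names change.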

    A nicer interface overall is the data.table::setnames function. If you convert to a data.table, then the code is setnames(df, old = variable_match$old, new = variable_match$new). This is robust in case not all names are changed (see comments below).
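
    A minimal sketch of the setnames approach, using the data from the question. The conversion via as.data.table() and the name dt are choices made here for illustration. Note that setnames renames the columns by reference, so no assignment is needed:

    ```r
    library(data.table)

    df <- data.frame(foo = 1:3, bar = LETTERS[1:3], baz = rnorm(3))

    variable_match <- data.frame(
      old = c("bar", "baz", "foo"),
      new = LETTERS[1:3],
      stringsAsFactors = FALSE  # setnames expects character names
    )

    # as.data.table() makes a copy, so the original df is left untouched.
    dt <- as.data.table(df)

    # Rename the columns of dt in place, pairing old[i] with new[i].
    setnames(dt, old = variable_match$old, new = variable_match$new)

    names(dt)
    # [1] "C" "A" "B"
    ```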