r, dplyr, tolower

Using set_names vs. mutate(colnames) when changing data frame column names to lower case


A quick question about something I would like to understand better.

Data:

df1 <- data.frame(COLUMN_1 = letters[1:3], COLUMN_2 = 1:3)

> df1
  COLUMN_1 COLUMN_2
1        a        1
2        b        2
3        c        3

Why does this work in setting data frame names to lower case:

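library(dplyr)  # provides %>% and mutate()
library(purrr)  # provides set_names(); dplyr does not export it
df2 <- df1 %>%
  set_names(., tolower(names(.)))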

> df2
  column_1 column_2
1        a        1
2        b        2
3        c        3

But this does not?

df2 <- df1 %>%
  mutate( colnames(.) <-  tolower(colnames(.)) )

Error: Column `colnames(.) <- tolower(colnames(.))` must be length 3 (the number of rows) or one, not 2

Solution

  • The solution is rename_all. The two calls below are equivalent; the second writes the arguments out explicitly:

    df1 %>% rename_all(tolower)              # piped form
    rename_all(.tbl = df1, .funs = tolower)  # same call, arguments named

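    mutate operates on the data itself, not on the column names. It treated your assignment as an expression defining a new column; the value of that expression is the lowercased names, a character vector of length 2, while a column must have length 3 (the number of rows) or 1. Hence the error, and that's why we use rename instead. We use rename_all so that you don't have to rename every column by hand, as in rename(column_1 = COLUMN_1, column_2 = COLUMN_2, ...).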

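    What you suggested, df2 <- df1 %>% rename_all(tolower(.)), doesn't work because tolower(.) is a call rather than a function: it feeds the whole of df1 into tolower and hands the result to rename_all, which is not what you want. Pass the function itself, as in rename_all(tolower).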
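A minimal sketch of the points above, reusing df1 from the question. It assumes dplyr is installed; the rename_with call additionally assumes dplyr 1.0.0 or later, where rename_with supersedes rename_all. The names lower_all, lower_with and lower_base are only illustrative.

```r
library(dplyr)

df1 <- data.frame(COLUMN_1 = letters[1:3], COLUMN_2 = 1:3)

# rename_all() applies the function to the column names, not to the data.
lower_all <- df1 %>% rename_all(tolower)

# rename_with() is the newer equivalent (assumes dplyr >= 1.0.0).
lower_with <- df1 %>% rename_with(tolower)

# Base R alternative: assign to names() directly, outside the pipeline.
lower_base <- df1
names(lower_base) <- tolower(names(lower_base))

# tolower(df1) coerces the data frame to a character vector, so
# rename_all(tolower(.)) passes data where a function is expected.
is.function(tolower(df1))  # FALSE
is.function(tolower)       # TRUE

names(lower_all)
names(lower_with)
names(lower_base)
```

All three approaches only touch the names attribute, so the data in the columns is left unchanged.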