Tags: r, dplyr, label, mutation, hmisc

Variables lose their labels after mutate() is applied in R


Assume some of the columns in my data frame have labels:

library(Hmisc)
df1 <- data.frame(a = c(0,1,2), b = c(0,1,2), d = c(0,1,2), e = c(0,1,2),
                  f = c("m","f","o"), output = c(0,1,2))
var_labs <- c(a = "aaa",
              b = "bbb",
              #d = "ddd",
              #e = "eee",
              f = "fff",
              output = "ooo")
label(df1) <- as.list(var_labs[match(names(df1), names(var_labs))])

When I apply a mutate() function to any column, that column loses its label:

library(dplyr)
df2 <- df1 %>%
   mutate_if(is.character,
             .funs = as.factor)

Is there a way to keep the labels after mutating?


Solution

  • You could first save the labels of df1 and then, after mutating, reassign them to the matching columns of df2:

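    library(Hmisc)
    library(dplyr)

    # Save the labels before mutating
    labs <- Hmisc::label(df1)

    # Convert character columns to factors (the mutated columns lose their labels)
    df2 <- df1 %>%
      mutate_if(is.character,
                .funs = as.factor)

    # Reassign the saved labels to df2, matching by column name
    label(df2) <- as.list(labs[match(names(df2), names(labs))])
    str(df2)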
    

    Output:

    'data.frame':   3 obs. of  6 variables:
     $ a     : 'labelled' num  0 1 2
      ..- attr(*, "label")= chr "aaa"
     $ b     : 'labelled' num  0 1 2
      ..- attr(*, "label")= chr "bbb"
     $ d     : 'labelled' num  0 1 2
      ..- attr(*, "label")= chr NA
     $ e     : 'labelled' num  0 1 2
      ..- attr(*, "label")= chr NA
     $ f     : Factor w/ 3 levels "f","m","o": 2 1 3
      ..- attr(*, "label")= chr "fff"
     $ output: 'labelled' num  0 1 2
      ..- attr(*, "label")= chr "ooo"
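
If you need this in more than one place, the same save-and-restore idea can be wrapped in a small helper. This is only a sketch in base R: it assumes the label is stored in each column's "label" attribute (as the str() output above suggests), copy_labels() is a hypothetical name rather than an Hmisc or dplyr function, and it does not re-add Hmisc's 'labelled' class.

```r
# Hypothetical helper (not part of Hmisc or dplyr): copy the "label" attribute
# from each column of `from` to the column with the same name in `to`.
copy_labels <- function(from, to) {
  shared <- intersect(names(from), names(to))
  for (col in shared) {
    lab <- attr(from[[col]], "label", exact = TRUE)
    if (!is.null(lab)) {
      attr(to[[col]], "label") <- lab
    }
  }
  to
}

# Small example data; labels are set directly as attributes (assumption:
# this mirrors where Hmisc keeps them).
df1 <- data.frame(a = c(0, 1, 2), f = c("m", "f", "o"),
                  stringsAsFactors = FALSE)
attr(df1$a, "label") <- "aaa"
attr(df1$f, "label") <- "fff"

df2 <- df1
df2$f <- as.factor(df2$f)     # as.factor() returns a new vector without the label
df2 <- copy_labels(df1, df2)  # restore labels by column name

attr(df2$f, "label")          # "fff"
```

Columns that exist only in one of the two data frames are skipped, so the helper should also be safe after adding or dropping columns.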
    
