r data-manipulation

Transform variables and add to dataframe


I would like to log-transform several variables in a data frame and then add the transformed variables to the data frame as new variables named 'logoldname' (for example, 'logdisp' for 'disp'). What is an efficient way to do this in R?

data("mtcars")
head(mtcars)

#Log transform - manually
mtcars$logdisp <- log(mtcars$disp)
mtcars$loghp <- log(mtcars$hp)
mtcars$logwt <- log(mtcars$wt)
mtcars$logqsec <- log(mtcars$qsec)

Solution

  • Here is a tidyverse solution:

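    library(dplyr)

    # These are the columns with entries you'd like to log-transform
    ss <- c("disp", "hp", "wt", "qsec")
    
    mtcars %>%
        # list(log = log) replaces the deprecated funs(); new columns are named "<col>_log"
        mutate_at(vars(one_of(ss)), list(log = log)) %>%
        # Move the suffix to the front: "disp_log" becomes "log_disp"
        rename_at(vars(contains("_log")), ~ paste0("log_", sub("_log$", "", .))) %>%
        select(contains("log_")) %>%
        head()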
    #   log_disp   log_hp    log_wt log_qsec
    #1  5.075174 4.700480 0.9631743 2.800933
    #2  5.075174 4.700480 1.0560527 2.834389
    #3  4.682131 4.532599 0.8415672 2.923699
    #4  5.552960 4.700480 1.1678274 2.967333
    #5  5.886104 5.164786 1.2354715 2.834389
    #6  5.416100 4.653960 1.2412686 3.006672
      
    

    Explanation: mutate_at selects the columns listed in ss and applies log to each of them, which generates new columns named "disp_log", "hp_log", and so on. rename_at then renames those columns to log_disp, log_hp, etc., and the select step keeps only the log-transformed columns. To keep the original columns alongside the new ones, as the question asks, omit the select step.
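
The same idea can be sketched more compactly with across(), assuming dplyr 1.0.0 or later is installed. This version keeps all original columns and uses the 'logoldname' naming from the question. The object names mtcars_log and mtcars_base are only illustrative, and the base R lines at the end are an alternative that needs no packages.

```r
library(dplyr)

data("mtcars")

# Columns to log-transform (same vector as in the answer above)
ss <- c("disp", "hp", "wt", "qsec")

# across() applies log() to each selected column; .names builds the new
# column name from the old one, so "disp" becomes "logdisp"
mtcars_log <- mtcars %>%
    mutate(across(all_of(ss), log, .names = "log{.col}"))

# Base R alternative: lapply() returns one transformed vector per column,
# and the assignment adds them under the new names
mtcars_base <- mtcars
mtcars_base[paste0("log", ss)] <- lapply(mtcars_base[ss], log)
```

Both results contain the 11 original mtcars columns plus logdisp, loghp, logwt and logqsec. Changing the .names pattern to "log_{.col}" would give the log_disp style used in the answer above.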