r, dataframe, label-encoding

Label Encoding on multiple columns in R


I have a data frame whose columns contain categorical responses. I'd like to label-encode the observations in all of the columns in one go.

Gender <- c("Male", "Female", "Female", "Male","Male")
School <- c("Primary", "Secondary", "Tertiary", "Primary","Secondary")
Town <- c("HA", "CA", "DD", "HA", "CA")

DF <- data.frame(Gender, School, Town)

So far, I can only do this one column at a time, e.g.:

DF$gender_num <- as.numeric(factor(DF$Gender))
DF$sch_num <- as.numeric(factor(DF$School))
DF$town_num <- as.numeric(factor(DF$Town))

However, I'd like R code that loops over all the columns and label-encodes each one, because my actual data frame has 33 columns that need this.

How do I go about this?


Solution

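  • mutate_if() from dplyr is one option. Note that it replaces the original columns with their integer codes rather than adding new columns.

    library(dplyr)  # provides mutate_if(); the native pipe |> needs R >= 4.1

    DF <- DF |>
      mutate_if(is.character, as.factor) |>  # character columns -> factors
      mutate_if(is.factor, as.numeric)       # factors -> numeric level codes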
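
  • If you'd rather keep the original columns and add encoded copies, as in the question, a base R loop with lapply() may also work. This is only a minimal sketch: it assumes every character or factor column should be encoded, and the "_num" suffix and the helper names cat_cols and encoded are made up for illustration.

```r
# Rebuild the example data from the question
DF <- data.frame(
  Gender = c("Male", "Female", "Female", "Male", "Male"),
  School = c("Primary", "Secondary", "Tertiary", "Primary", "Secondary"),
  Town   = c("HA", "CA", "DD", "HA", "CA"),
  stringsAsFactors = FALSE
)

# Select every character or factor column
# (assumption: these are exactly the columns that need encoding)
cat_cols <- names(DF)[sapply(DF, function(x) is.character(x) || is.factor(x))]

# Encode each selected column as integer level codes.
# factor() orders its levels by sorting the unique values,
# so here "Female" becomes 1 and "Male" becomes 2.
encoded <- lapply(DF[cat_cols], function(x) as.integer(factor(x)))

# Store the codes in new "<column>_num" columns (hypothetical naming),
# leaving the original columns untouched
DF[paste0(cat_cols, "_num")] <- encoded
```

    Because the codes follow the sorted order of the values, they carry no real ordering. If a column such as School has a natural order, you would probably want to pass explicit levels to factor() instead.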