rdplyrnumericchr

Batch Convert Columns from chr to num with either read_excel or dplyr


I have a database saved in excel, and when I bring it into R there are many columns that should be numeric, but they get listed as characters. I know that in read_excel I can specify each column format using the col_types = "numeric", but I have > 500 columns, so this gets a bit tedious.

Any suggestions on how to do this either when importing with read_excel, or after with dplyr or something similar?

I can do this 1 by 1 using a function that I wrote but it still requires writing out each column name

convert_column <- function(data, col_name) {
  new_col_name <- paste0(col_name)
  data %>% mutate(!!new_col_name := as.numeric(!!sym(col_name)))
}

  convert_column("gFat_OVX") %>%
  convert_column("gLean_OVX")%>%
  convert_column("pFat_OVX") %>%
  convert_column("pLean_OVX")

I would ideally like to say "if a column contains the text "Fat" or "Lean" in the header, then convert to numeric", but I'm open to suggestions.

select(df, contains("Fat" | "Lean"))

I'm not sure how to make an example that allows people to test this out, given that we're starting with an excel sheet here.


Solution

  • dplyr::mutate and across may be a solution after reading in the data.

    Something like this, where df1 is your data frame from read_excel:

    library(dplyr)
    
    df1 <- df1 %>% 
      mutate(across(contains(c("Fat", "Lean")), ~as.numeric(.x)))