I have a database saved in excel, and when I bring it into R there are many columns that should be numeric, but they get listed as characters. I know that in read_excel I can specify each column format using the col_types = "numeric", but I have > 500 columns, so this gets a bit tedious.
Any suggestions on how to do this either when importing with read_excel, or after with dplyr or something similar?
I can do this 1 by 1 using a function that I wrote but it still requires writing out each column name
convert_column <- function(data, col_name) {
new_col_name <- paste0(col_name)
data %>% mutate(!!new_col_name := as.numeric(!!sym(col_name)))
}
convert_column("gFat_OVX") %>%
convert_column("gLean_OVX")%>%
convert_column("pFat_OVX") %>%
convert_column("pLean_OVX")
I would ideally like to say "if a column contains the text "Fat" or "Lean" in the header, then convert to numeric", but I'm open to suggestions.
select(df, contains("Fat" | "Lean"))
I'm not sure how to make an example that allows people to test this out, given that we're starting with an excel sheet here.
dplyr::mutate
and across
may be a solution after reading in the data.
Something like this, where df1
is your data frame from read_excel
:
library(dplyr)
df1 <- df1 %>%
mutate(across(contains(c("Fat", "Lean")), ~as.numeric(.x)))