My goal is to create multiple dataframe columns as in the following example.
# Goal
library(dplyr)
library(tibble) # provides rownames_to_column()
mtcars %>%
  rownames_to_column("model") %>%
  mutate(MAZDA = case_when(grepl('^Mazda', model) ~ 1, TRUE ~ 0),
         MERC = case_when(grepl('^Merc', model) ~ 1, TRUE ~ 0),
         VOLVO = case_when(grepl('^Volvo', model) ~ 1, TRUE ~ 0))
However, in my real-world application, I cannot be sure how many columns need to be created, because the number of case_when conditions may vary. This is why I want to build either a named list or a character string that contains the conditions, based on the input data. One piece is missing, however: how can I pass a named list or a character string to dplyr's mutate verb? Neither of the following examples works.
# Named list
condition1 <- list("case_when(grepl('^Mazda', model) ~ 1, TRUE ~ 0)",
"case_when(grepl('^Merc', model) ~ 1, TRUE ~ 0)",
"case_when(grepl('^Volvo', model) ~ 1, TRUE ~ 0)")
names(condition1) <- c("MAZDA", "MERC", "VOLVO")
result1 <- mtcars %>% mutate(!!!condition1)
# Character string
condition2 <-
"MAZDA = case_when(grepl('^Mazda', model) ~ 1, TRUE ~ 0),
MERC = case_when(grepl('^Merc', model) ~ 1, TRUE ~ 0),
VOLVO = case_when(grepl('^Volvo', model) ~ 1, TRUE ~ 0)"
result2 <- mtcars %>% mutate(eval(parse(text = condition2)))
How can I make them work? Is there an alternative base R approach to solve the problem?
1) Assuming that the conditions all have the form shown, we can simplify the input to just the Names, create the columns using map_dfr, and bind them to the original data frame.
library(dplyr)
library(purrr)
Names <- c("Mazda", "Merc", "Volvo") # input names
mtcars %>%
  bind_cols(
    rownames(.) %>%
      { map_dfr(set_names(Names, toupper), \(x) +startsWith(., x)) }
  )
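Since the question also asks about base R, here is a minimal sketch of the same idea without dplyr or purrr. It assumes, as in (1), that every condition is a prefix match on the row names. The name `res` is only illustrative.

```r
Names <- c("Mazda", "Merc", "Volvo") # input names, as in (1)

# One logical column per name; sapply() takes the column names from the
# names of its input, and unary + converts TRUE/FALSE to 1/0
flags <- +sapply(setNames(Names, toupper(Names)),
                 \(x) startsWith(rownames(mtcars), x))

# cbind() on a data frame appends the matrix columns as new variables
res <- cbind(mtcars, flags)
```

Like the code already used in this answer, the `\(x)` lambda shorthand needs R 4.1 or later; on older versions `function(x)` can be used instead.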
2) If the forms can differ, create a list of functions as shown and then apply them:
L2 <- list(MAZDA = \(x) +startsWith(x, "Mazda"),
MERC = \(x) +startsWith(x, "Merc"),
VOLVO = \(x) +startsWith(x, "Volvo"))
mtcars %>%
bind_cols(rownames(.) %>% { map_dfr(L2, \(f) f(.)) })
3) Or, if for some reason character strings are needed, try the following (although to me (1) or (2) is preferable):
L3 <- list(MAZDA = '\\(x) +startsWith(x, "Mazda")',
MERC = '\\(x) +startsWith(x, "Merc")',
VOLVO = '\\(x) +startsWith(x, "Volvo")')
mtcars %>%
bind_cols(rownames(.) %>% { map_dfr(L3, \(f) eval(str2lang(f))(.)) })
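The same parse-then-evaluate idea can probably also be applied to the named list from the question. The sketch below assumes the conditions arrive as strings holding valid R code that refers to a `model` column. It parses each string with base R's `str2lang` and splices the resulting calls into `mutate` with `!!!`. The name `exprs` is only illustrative.

```r
library(dplyr)
library(tibble)

# Conditions stored as strings, as in the question's named list
condition1 <- list(
  MAZDA = "case_when(grepl('^Mazda', model) ~ 1, TRUE ~ 0)",
  MERC  = "case_when(grepl('^Merc', model) ~ 1, TRUE ~ 0)",
  VOLVO = "case_when(grepl('^Volvo', model) ~ 1, TRUE ~ 0)"
)

# Parse each string into an unevaluated call; lapply() keeps the list names
exprs <- lapply(condition1, str2lang)

# !!! splices the named calls into mutate() as if they had been typed by hand
result1 <- mtcars %>%
  rownames_to_column("model") %>%
  mutate(!!!exprs)
```

The original attempt most likely failed because `!!!` spliced the strings themselves, so each new column would just hold the text of the condition. As with any `eval`/`parse` approach, this should only be used on strings from a trusted source.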