I have a data frame df and a lookup data frame selection, both shown below, followed by the desired output.
df <- data.frame(
col1 = c("abc_1_102", "abc_1_103", "xyz_1_104")
)
selection <- data.frame(col1 = c("abc", "xyz"), col2 = c("102", "106"))
Desired output
col1 col9
1 abc_1_102 SELECT
2 abc_1_103 NOTSELECT
3 xyz_1_104 NOTSELECT
How can we achieve this using the grepl() function in R?
What I have tried:
df$col2 <- ifelse(grepl(paste("^", selection$col1, "$", collapse = "|"), df$col1)&
grepl(paste("^", selection$col2, "$", collapse = "|"), df$col1),
"SELECT", "NOTSELECT")
print(df)
col1 col2
1 abc_1_102 NOTSELECT
2 abc_1_103 NOTSELECT
3 xyz_1_104 NOTSELECT
Here, the result is incorrect: row 1 of df should be SELECT, because the col1 and col2 values in the first row of selection (abc and 102) match the first and third elements of abc_1_102.
A more complete example would be preferable, but from what I understand, you want to check whether any row of selection (with col1 and col2 joined by a separator of _1_) appears in df, with the new column being "SELECT" when TRUE and "NOTSELECT" otherwise.
df$col9 <- ifelse(grepl(paste(selection$col1, selection$col2, sep = ".*", collapse = "|"), df$col1), "SELECT", "NOTSELECT")
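To see why this works, it can help to print the pattern that paste() builds. The sketch below recreates the data from the question (stringsAsFactors = FALSE is added here only so it behaves the same on older R versions) and, for comparison, also prints the pattern from the question's attempt. The variable names answer_pattern and question_pattern are just illustrative.

```r
df <- data.frame(col1 = c("abc_1_102", "abc_1_103", "xyz_1_104"),
                 stringsAsFactors = FALSE)
selection <- data.frame(col1 = c("abc", "xyz"), col2 = c("102", "106"),
                        stringsAsFactors = FALSE)

# Each row of selection becomes "col1.*col2"; rows are joined with "|" (regex OR)
answer_pattern <- paste(selection$col1, selection$col2, sep = ".*", collapse = "|")
print(answer_pattern)
# "abc.*102|xyz.*106"

# The attempt in the question uses paste() with its default sep = " ",
# so spaces end up inside the pattern, and ^...$ demands a whole-string match
question_pattern <- paste("^", selection$col1, "$", collapse = "|")
print(question_pattern)
# "^ abc $|^ xyz $"

df$col9 <- ifelse(grepl(answer_pattern, df$col1), "SELECT", "NOTSELECT")
print(df)
```

Because the question's pattern contains literal spaces and is anchored at both ends, it cannot match any value in df$col1, which is why every row came back as NOTSELECT.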
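Notes: .* matches any run of characters between the two values. If the separator is always _1_, you can replace .* with _1_.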
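If the separator really is always _1_, a stricter variant might look like the sketch below. It assumes the whole string should be exactly col1, then _1_, then col2, so it also adds ^ and $ anchors via paste0() (which, unlike paste(), inserts no spaces). The value "abc_2_9102" is a made-up example, not from the question, used only to show where the two patterns differ.

```r
selection <- data.frame(col1 = c("abc", "xyz"), col2 = c("102", "106"),
                        stringsAsFactors = FALSE)

# Loose pattern: anything may appear between col1 and col2
loose_pattern <- paste(selection$col1, selection$col2, sep = ".*", collapse = "|")

# Strict pattern (assumption: the full value is exactly col1 + "_1_" + col2)
strict_pattern <- paste0("^", selection$col1, "_1_", selection$col2, "$",
                         collapse = "|")

# The second value is hypothetical and only illustrates the difference
candidates <- c("abc_1_102", "abc_2_9102")

loose_hits <- grepl(loose_pattern, candidates)
strict_hits <- grepl(strict_pattern, candidates)

print(loose_hits)   # TRUE TRUE
print(strict_hits)  # TRUE FALSE
```

Which version is appropriate depends on how varied the real values in df$col1 are; the loose pattern is more forgiving, while the strict one avoids accidental matches.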