Consider the following simplified data frame:
df<-data.frame(x1=c("A","B","C"),x2=c("K to B","K to B","K to B"))
I want to replace strings in x2 with NA (or "") in the rows where the x1 character cannot be found as part of x2. That is, the data frame should be corrected to:
df_corrected<-data.frame(x1=c("A","B","C"),x2=c(NA,"K to B",NA))
The actual dataset contains 95,000 rows and many different expressions in x2. I've otherwise used the tidyverse to clean the data. I have tried using grepl() to search for the x1 value in the x2 string, but I am having trouble doing this row by row (do I need a function or a for loop?) and combining it with mutate(). I am also open to other options if they are better (e.g. sapply() and base R, or sqldf).
Thanks a lot in advance!
Another way, which doesn't use rowwise():
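library(dplyr)
library(stringr)  # str_detect() comes from stringr, not dplyr
# str_detect() is vectorised over both string and pattern, so no rowwise() is needed
df |> mutate(x2 = if_else(str_detect(x2, x1), x2, NA_character_))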
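Since the question also asks about grepl() and base R, here is a minimal sketch of that route. It assumes x1 should be matched as a literal substring rather than as a regular expression (hence fixed = TRUE), and the helper name found is only for illustration.

```r
df <- data.frame(x1 = c("A", "B", "C"),
                 x2 = c("K to B", "K to B", "K to B"))

# grepl() is not vectorised over its pattern argument, so mapply() pairs
# each x1 value with the x2 string on the same row.
# fixed = TRUE is an assumption: x1 is treated as literal text, not a regex.
found <- mapply(grepl, df$x1, df$x2,
                MoreArgs = list(fixed = TRUE), USE.NAMES = FALSE)

# Set x2 to NA in the rows where x1 was not found inside it.
df$x2[!found] <- NA
```

If x1 may contain regex metacharacters (".", "(", "+" and so on), the same literal matching can be had in the stringr version by wrapping the pattern as fixed(x1).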