I have a very long dataset and a relatively short list of ID values for which my data is wrong. The following works, but my wrong_IDs vector is actually much larger:
wrong_IDs <- c('A1', 'B3', 'B7', 'Z31')
df$var1[df$var2 == 'A1' | df$var2 == 'B3' | df$var2 == 'B7' | df$var2 == 'Z31'] <- 0L
This looks very basic, but I haven't found a compact way of writing it. Thanks for any help.
You can compare your data against the vector of wrong IDs with the %in% operator:
df <- data.frame(var1 = 101:120, var2 = 1:20)
wrong_ids <- c(3, 5, 7)
df$var1[df$var2 %in% wrong_ids] <- 0L  # 0L keeps var1 an integer column
Here df$var2 %in% wrong_ids returns a logical (TRUE/FALSE) vector with one element per row, so the "set to zero" assignment is applied only to the selected rows (here rows 3, 5 and 7).
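The same pattern should carry over to character IDs like the ones in the question. Below is a minimal sketch; the data frame contents and the helper name is_wrong are made up for illustration, and only wrong_IDs, var1 and var2 come from the question.

```r
# Hypothetical data shaped like the question: var2 holds character IDs
df <- data.frame(
  var1 = 1:6,
  var2 = c("A1", "A2", "B3", "B7", "C4", "Z31"),
  stringsAsFactors = FALSE
)
wrong_IDs <- c("A1", "B3", "B7", "Z31")

# Logical mask: TRUE where var2 matches any entry of wrong_IDs
is_wrong <- df$var2 %in% wrong_IDs

# Overwrite only the flagged rows; 0L keeps var1 an integer column
df$var1[is_wrong] <- 0L
```

One reason to prefer this over a chain of == comparisons joined with |, besides length: %in% always returns TRUE or FALSE, never NA, so missing values in var2 are simply treated as non-matches.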