I do not understand why
grepl("see*", "file SEC", ignore.case = TRUE)
returns TRUE.
I am trying to find all words that start with "see", such as See, seeing, seen, etc., and remove them.
The string above, "file SEC", does not contain such a word, yet TRUE is returned.
Use a word boundary (\\b)
The pattern "see*" matches "se" followed by any number of "e"s (e*), including zero, so the "SE" in "SEC" matches.
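To see exactly which part of the string is matched, one option is to extract the match instead of only testing for it. This is a minimal sketch; the variable names x, m, and hits are only for illustration.

```r
x <- "file SEC"

# regexpr() locates the first match; regmatches() extracts the matched text
m <- regmatches(x, regexpr("see*", x, ignore.case = TRUE))
print(m)

# "*" applies only to the preceding "e", so a bare "se" is already enough
hits <- grepl("see*", c("set", "sea", "sky"), ignore.case = TRUE)
print(hits)
```

Here m holds "SE", which shows that the second "e" of the pattern is optional.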
I believe you want something like this, without the "*":
grepl("^see", "file SEC", ignore.case = TRUE)
[1] FALSE
Note that "^" only anchors the match to the start of the whole string. To handle multi-word strings, you can use a word boundary \\b instead, which detects words that start with the pattern anywhere in the string but excludes words that merely contain it:
grepl("\\bSee", c("file SEC", "See", "seeing", "seen", "he was seen", "He did not forsee the event"), ignore.case = TRUE)
[1] FALSE TRUE TRUE TRUE TRUE FALSE
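Since the goal in the question is to remove these words, the same word-boundary idea can be carried over to gsub(). The following is a sketch under a few assumptions: \\w* is added to consume the rest of the word, perl = TRUE is chosen here only to get PCRE handling of \\b (the default engine also accepts this pattern), and the whitespace cleanup at the end is optional. The names x, removed, and cleaned are illustrative.

```r
x <- c("file SEC", "See", "seeing", "seen",
       "he was seen", "He did not forsee the event")

# Remove every word that starts with "see": \\b marks the start of a word,
# \\w* consumes the remaining letters of that word
removed <- gsub("\\bsee\\w*", "", x, ignore.case = TRUE, perl = TRUE)

# Optional tidy-up: collapse repeated whitespace and trim the ends
cleaned <- trimws(gsub("\\s+", " ", removed))
print(cleaned)
```

Strings that consisted only of a matching word become empty strings, so you may want to drop them afterwards with cleaned[nzchar(cleaned)].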