r string instr

Is there an equivalent of VB.NET's InStr in R?


I am rewriting my VB.NET code in R and have hit a roadblock. The VB.NET code counts the characters in a string that do not occur in a string of allowed characters. The code in VB.NET is:

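    Dim StringtoConvert As String = "ABC"
    Dim strAllowedChars As String = "AC"
    Dim disallowed As Integer = 0
    For i = 1 To Len(StringtoConvert)
        ' Mid is 1-based; InStr returns 0 when the character is not found
        If InStr(1, strAllowedChars, Mid(StringtoConvert, i, 1)) = 0 Then
            disallowed = disallowed + 1
        End If
    Next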

I can see how to do this in R with loops that search the string for each allowed character in turn, but is there a way in R to test against the whole set of allowed characters at once, as InStr does with strAllowedChars above?

The str_count function of the stringr package is the closest thing I have found, but it looks for matches of the entire strAllowedChars pattern rather than treating each character independently. How can I test StringtoConvert to make sure it contains only the individual characters in strAllowedChars? In other words, if a character in StringtoConvert does not match any character in strAllowedChars, I need to either flag it and replace it in a second call, or replace it directly.

The R code that I have tried is:

    library(stringr)
    testerstring<-"CYA"
    testpattern<-"CA"
    newtesterstring<-str_count(testerstring,testpattern)
    print(newtesterstring)

The desired output is the number of characters in StringtoConvert that are disallowed (that is, not present in strAllowedChars). I would then use that count in a loop with an if statement to change each disallowed character to a "G", so it would be even better to skip the counting step and replace any disallowed character with a "G" directly.


Solution

  • Here's an approach with str_replace_all. We can build a regular expression that matches characters that are not in a set. For example, [^AC] matches any single character other than A or C:

    library(stringr)
    StringtoConvert="ABC"
    strAllowedChars="AC"
    str_replace_all(StringtoConvert,paste0("[^",strAllowedChars,"]"),"G")
    #[1] "AGC"
    
    set.seed(12345)
    sample(LETTERS,50,replace = TRUE) %>% paste(collapse = "") -> StringtoConvert2
    str_replace_all(StringtoConvert2,paste0("[^",strAllowedChars,"]"),"G")
    #[1] "GGGGGGGGGGGGGGGGGGAGGGGGCGGGGGGGGGGGGGGGGGGGGGGGGG"
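
If the count of disallowed characters is also needed, or if the allowed set might contain characters that are special inside a regular expression character class (such as ], ^, - or \), a base R version that avoids regular expressions may be useful. The following is only a sketch under the question's example values; the variable names chars, allowed, is_disallowed and converted are made up for illustration. With stringr, str_count(StringtoConvert, paste0("[^", strAllowedChars, "]")) should give the same count when the allowed set has no special characters.

```r
StringtoConvert <- "ABC"
strAllowedChars <- "AC"

# Split both strings into vectors of single characters
chars   <- strsplit(StringtoConvert, "")[[1]]
allowed <- strsplit(strAllowedChars, "")[[1]]

# TRUE for each character of StringtoConvert that is not in the allowed set
is_disallowed <- !(chars %in% allowed)

# Count of disallowed characters (the role of the VB.NET counter)
disallowed <- sum(is_disallowed)

# Replace the disallowed characters directly and rebuild the string
chars[is_disallowed] <- "G"
converted <- paste(chars, collapse = "")

print(disallowed)
print(converted)
```

Because %in% compares characters literally, no escaping of the allowed set is needed here, at the cost of splitting the string into a character vector first.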