r string extract rscript

How to extract some characters that follow a string of letters of variable length in R


I would like to extract a couple of characters (digits, in this case) that follow a string of letters whose length can vary (for instance, between 1 and 3 letters). For example:

animals<-c('B02420','SS9874','MZ990122','HRB1281','NO2451068') 

Here, I would like to obtain this:

digits<-c(02,98,99,12,24)

I don't know whether there is a simple way to get them.


Solution

  • gsub("^[A-Z]+([0-9]{2}).*", "\\1", animals)
    [1] "02" "98" "99" "12" "24"
    

    Explanation

    ^[A-Z]+ matches a sequence of one or more capital letters at the start of the string
    ([0-9]{2}) captures exactly two digits into group 1
    .* matches the rest of the string
    \\1 replaces the whole match with the contents of group 1
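One caveat with this kind of replacement: a string that does not match the pattern is returned unchanged rather than dropped. The sketch below wraps the same pattern in a small helper that returns NA for non-matching input and then converts the result to integers. The helper name `extract_digits` and the extra test strings are only illustrative assumptions, not part of the original question.

```r
animals <- c("B02420", "SS9874", "MZ990122", "HRB1281", "NO2451068")

# Same pattern as in the answer: leading capital letters, then two captured digits.
pattern <- "^[A-Z]+([0-9]{2}).*"

# Hypothetical helper: sub() leaves non-matching strings untouched,
# so non-matches are set to NA explicitly.
extract_digits <- function(x) {
  ifelse(grepl(pattern, x), sub(pattern, "\\1", x), NA_character_)
}

digits <- extract_digits(animals)
digits
# "02" "98" "99" "12" "24"

# Strings without a letter prefix, or with fewer than two digits, give NA.
extract_digits(c("HRB1281", "1234", "X1"))
# "12" NA NA

# Convert to integers if numbers are needed (the leading zero of "02" is lost).
as.integer(digits)
# 2 98 99 12 24
```

Keeping the result as character preserves leading zeros such as "02"; convert with as.integer() only if the numeric value is what matters.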