
Extract the last one or two digits of a coding variable with str_sub in R


I need to extract the last one or two digits (the replicate number, which goes from 1 to 12) of a coding variable (v1). With the str_sub function, I cannot get the whole number when it has two digits. If I take the last two characters instead, the function also picks up the preceding letter for one-digit replicates:

v1<-c("D018BG1","D018BG2","D018BG3","D018BG4","D018BG5","D018BG6","D018BG7","D018BG8","D018BG9",
                             "D018BG10","D018BG11","D018BG12")
df<-data.frame(v1)
df
         v1
1   D018BG1
2   D018BG2
3   D018BG3
4   D018BG4
5   D018BG5
6   D018BG6
7   D018BG7
8   D018BG8
9   D018BG9
10 D018BG10
11 D018BG11
12 D018BG12

df%>%
     mutate(replicate=str_sub(v1,-1,-1))
         v1 replicate
1   D018BG1         1
2   D018BG2         2
3   D018BG3         3
4   D018BG4         4
5   D018BG5         5
6   D018BG6         6
7   D018BG7         7
8   D018BG8         8
9   D018BG9         9
10 D018BG10         0
11 D018BG11         1
12 D018BG12         2


df%>%
     mutate(replicate=str_sub(v1,-2,-1))
         v1 replicate
1   D018BG1        G1
2   D018BG2        G2
3   D018BG3        G3
4   D018BG4        G4
5   D018BG5        G5
6   D018BG6        G6
7   D018BG7        G7
8   D018BG8        G8
9   D018BG9        G9
10 D018BG10        10
11 D018BG11        11
12 D018BG12        12
 

How can this be done?

Thanks in advance!


Solution

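  • You can use the str_extract function from {stringr} with a regular expression anchored at the end of the string. The pattern "\\d$" matches only the final digit, while "\\d{1,2}$" matches the last one or two digits, so two-digit replicates are returned whole and the preceding letter is never included.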

    library(dplyr)
    library(stringr)
    
    df %>%
      mutate(replicate_one_digit = as.numeric(str_extract(v1, "\\d$")),
             replicate_two_digits = as.numeric(str_extract(v1, "\\d{1,2}$")))
    
             v1 replicate_one_digit replicate_two_digits
    1   D018BG1                   1                    1
    2   D018BG2                   2                    2
    3   D018BG3                   3                    3
    4   D018BG4                   4                    4
    5   D018BG5                   5                    5
    6   D018BG6                   6                    6
    7   D018BG7                   7                    7
    8   D018BG8                   8                    8
    9   D018BG9                   9                    9
    10 D018BG10                   0                   10
    11 D018BG11                   1                   11
    12 D018BG12                   2                   12
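
    If you would rather not depend on {stringr}, the same idea can be sketched in base R. This is only an illustration under the assumption that every code ends in at least one digit, as in the sample data; the names replicate_a and replicate_b are made up for this example.

```r
# Same sample data as in the question: D018BG1 ... D018BG12.
v1 <- paste0("D018BG", 1:12)

# Option 1: keep only the run of digits anchored at the end of the string.
# Note: regmatches() silently drops elements that do not match, so this
# assumes every value ends in a digit.
replicate_a <- as.integer(regmatches(v1, regexpr("[0-9]+$", v1)))

# Option 2: delete everything up to and including the last non-digit
# character, leaving only the trailing digits.
replicate_b <- as.integer(sub(".*[^0-9]", "", v1))

df <- data.frame(v1, replicate = replicate_a)
```

    Both options use "one or more trailing digits" rather than a fixed width, so they would also cope with replicate numbers longer than two digits.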