regex, r, gsub, strsplit

R: extract the first number from a string


I have a string stored in a variable called v1. The string lists picture numbers and has the form "Pic 27 + 28". I want to extract the first number and store it in a new variable called item.

Some code that I've tried is:

item <- unique(na.omit(as.numeric(unlist(strsplit(unlist(v1),"[^0-9]+")))))

This worked fine, until I came upon a list that went:

[1,] "Pic 26 + 25"
[2,] "Pic 27 + 28"
[3,] "Pic 28 + 27"
[4,] "Pic 29 + 30"
[5,] "Pic 30 + 29"
[6,] "Pic 31 + 32"

At this point I get more numbers than I want, because the code collects every unique number across all strings, not just the first number in each one (so it also grabs the 25).

I've also tried doing it with gsub, but couldn't get anything to work. Any help would be greatly appreciated!


Solution

  • I assume that you'd like to extract the first number in each string.

    You may use the stri_extract_first_regex function from the stringi package:

    library(stringi)
    stri_extract_first_regex(c("Pic 26+25", "Pic 1,2,3", "no pics"), "[0-9]+")
    ## [1] "26" "1"  NA
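
If adding a package is not an option, the same idea can probably be sketched in base R. The example below assumes v1 is a plain character vector holding the six strings from the question (the question prints it as a one-column matrix, which should behave the same way here). regexpr() reports only the first match in each string, so the second number is never picked up. The name item_sub is only illustrative.

```r
# Sample data copied from the question; assumed to be a character vector.
v1 <- c("Pic 26 + 25", "Pic 27 + 28", "Pic 28 + 27",
        "Pic 29 + 30", "Pic 30 + 29", "Pic 31 + 32")

# regexpr() locates the first run of digits in each string (-1 if none).
m <- regexpr("[0-9]+", v1)

# regmatches() returns only the strings that matched, so fill a
# pre-allocated NA vector to keep one result per input string.
item <- rep(NA_real_, length(v1))
item[m != -1] <- as.numeric(regmatches(v1, m))
item
## [1] 26 27 28 29 30 31

# A sub()-based variant: keep the first digit run, drop everything else.
# Strings without digits come back unchanged, so as.numeric() would
# return NA (with a warning) for them.
item_sub <- as.numeric(sub("^[^0-9]*([0-9]+).*$", "\\1", v1))
```

Unlike the original strsplit() attempt, neither variant calls unique(), so the result keeps one value per input string even when the same first number appears more than once.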