I have data like this:
sample_data <- data.frame(
  txtnumbers = c("text stuff +300.5", "other stuff 40+ more stuff", "text here -30 here too", "30- text here", "50+", "stuff here 500+", "400.5-"),
  stringsAsFactors = FALSE
)
I want to extract numbers where they are FOLLOWED by a + symbol and insert the values into a new column, ignoring the rest of the text and returning NA where there is no number followed by a +:
desired_data <- data.frame(
  txtnumbers = c("text stuff +300.5", "other stuff 40+ more stuff", "text here -30 here too", "30- text here", "50+", "stuff here 500+", "400.5-"),
  desired_col = c(NA, 40, NA, NA, 50, 500, NA),
  stringsAsFactors = FALSE
)
Can someone help me with an efficient function to do this? I could parse the number using parse_numeric but returning only numbers followed by a + is giving me issues. Thanks!
Here is one option using stringr::str_extract (the group argument needs stringr 1.5.0 or later):
stringr::str_extract(sample_data$txtnumbers, "(\\d+)\\+", group = 1)
#[1] NA "40" NA NA "50" "500" NA
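The values are extracted as strings; wrap the call in as.integer() to turn them into numbers. Note that \\d+ matches digits only, so an input such as "400.5+" would yield "5" rather than "400.5".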
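If decimals followed by a + can occur, a variant along these lines may help. It is only a sketch: the pattern below is my own assumption about what counts as a number (digits with an optional decimal part), and it uses a lookahead instead of the group argument, so it should also work on older stringr versions.

```r
library(stringr)

sample_data <- data.frame(
  txtnumbers = c("text stuff +300.5", "other stuff 40+ more stuff",
                 "text here -30 here too", "30- text here", "50+",
                 "stuff here 500+", "400.5-"),
  stringsAsFactors = FALSE
)

# Assumed number format: digits, optionally followed by ".digits".
# The lookahead (?=\\+) requires a "+" right after the number
# without including it in the match.
pattern <- "\\d+(?:\\.\\d+)?(?=\\+)"

# str_extract() returns NA where nothing matches; as.numeric() keeps those NAs.
sample_data$desired_col <- as.numeric(str_extract(sample_data$txtnumbers, pattern))

sample_data$desired_col
# values: NA, 40, NA, NA, 50, 500, NA
```

as.numeric() is used here rather than as.integer() so that a decimal value is not truncated.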