regexr

Test if elements in a character string could be numeric


I want to test a character vector and see which elements could actually be numeric. I can use a regex to test for integers successfully, but I'm looking for the elements that consist entirely of digits with at most one decimal point. Below is what I've tried:

x <- c("0.33", ".1", "3", "123", "2.3.3", "1.2r")
!grepl("[^0-9]", x)   #integer test

grepl("[^0-9[\\.{0,1}]]", x)  # I know it's wrong but don't know what to do

I'm looking for a logical output so I'd expect the following results:

[1] TRUE TRUE TRUE TRUE FALSE FALSE

Solution

  • Other parts of your data may be more complicated in ways that would break this, but my first thought is:

    > !is.na(as.numeric(x))
    [1]  TRUE  TRUE  TRUE  TRUE FALSE FALSE
    

    As Josh O'Brien notes below, this won't pick up things like 7L, which the R interpreter parses as the integer 7. If you need to count those as "plausibly numeric", one route is to pick them out with a regex first:

    > x <- c("1.2","1e4","1.2.3","5L")
    > x
    [1] "1.2"   "1e4"   "1.2.3" "5L"   
    > grepl("^[[:digit:]]+L",x)
    [1] FALSE FALSE FALSE  TRUE
    

    ...and then strip the "L" from just those elements using gsub and indexing.
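
The two steps described in the answer (flag the integer literals, strip their "L", then coerce) can be combined into one function. This is only a minimal sketch: the helper name `could_be_numeric` and the pattern `plain_decimal` are made up for illustration and are not part of the original answer, and the regex is anchored with `$` so that only a trailing "L" counts. The last lines show a pure-regex alternative for the stricter rule in the question (digits only, at most one decimal point), assuming strings such as "3." should be rejected.

```r
# Example input: the question's vector plus an integer literal such as "5L".
x <- c("0.33", ".1", "3", "123", "2.3.3", "1.2r", "5L")

# Hypothetical helper (the name is an assumption, not from the original answer).
could_be_numeric <- function(x) {
  # Elements that look like R integer literals: digits followed by one trailing "L".
  is_int_literal <- grepl("^[[:digit:]]+L$", x)
  # Strip the trailing "L" from just those elements, using gsub and indexing.
  x[is_int_literal] <- gsub("L$", "", x[is_int_literal])
  # as.numeric() yields NA (with a warning) for strings it cannot parse,
  # so the warning is suppressed and the NAs are used as the test.
  !is.na(suppressWarnings(as.numeric(x)))
}

could_be_numeric(x)
# [1]  TRUE  TRUE  TRUE  TRUE FALSE FALSE  TRUE

# Pure-regex alternative: optional leading digits, at most one ".",
# then at least one digit. Assumption: "3." and "1e4" should NOT count.
plain_decimal <- "^[0-9]*\\.?[0-9]+$"
grepl(plain_decimal, x)
# [1]  TRUE  TRUE  TRUE  TRUE FALSE FALSE FALSE
```

The two checks differ on purpose: the `as.numeric()` route also accepts anything else R can coerce, such as "1e4", while the regex accepts only plain decimals, so which one fits depends on what "could be numeric" should mean for your data.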