r regex gsub

Make gsub return an empty string when no match is found


I'm using the gsub function in R to extract occurrences of my pattern (reference numbers) from a list of strings. This works great unless no match is found, in which case I get the entire string back instead of an empty string. Consider this example:

data <- list("a sentence with citation (Ref. 12)",
             "another sentence without reference")

sapply(data, function(x) gsub(".*(Ref. (\\d+)).*", "\\1", x))

Returns:

[1] "Ref. 12"                            "another sentence without reference"

But I'd like to get

[1] "Ref. 12"                            ""

Thanks!


Solution

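  • I'd probably go a different route. The sapply isn't necessary, since these regex functions are already vectorized. Record which elements match the pattern, do the substitution, then blank out the elements that didn't match: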

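    fun <- function(x){
        pattern <- ".*(Ref. (\\d+)).*"
        # Use a logical mask rather than x[-ind]: with no matches at all,
        # ind is empty and x[-ind] would select (and blank) nothing.
        matched <- grepl(pattern, x)
        x <- gsub(pattern, "\\1", x)
        x[!matched] <- ""
        x
    }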
    
    fun(data)
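
The same idea can also be written as a single vectorized expression. This is only a minimal sketch: the helper name `extract_ref` is hypothetical, and escaping the dot (`Ref\\.`) so that it matches a literal period is an assumption that goes beyond the pattern in the question.

```r
# Hypothetical helper: return the "Ref. N" part of each string, or "" when absent.
extract_ref <- function(x, pattern = ".*(Ref\\. (\\d+)).*") {
    x <- unlist(x)                    # accept a list of strings or a character vector
    matched <- grepl(pattern, x)      # TRUE where the pattern occurs
    # Keep the captured group for matches, an empty string otherwise
    ifelse(matched, sub(pattern, "\\1", x), "")
}

data <- list("a sentence with citation (Ref. 12)",
             "another sentence without reference")

result <- extract_ref(data)
print(result)
```

Because the mask comes from `grepl`, input in which no element matches should come back as all empty strings rather than unchanged.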