
Functional programming with data.table


I want to write a simple function in R that works with data.frames/tibbles as well as with data.tables. So I created a new generic:

pesquisarComposicao <- function(BASE, ...) {
  UseMethod("pesquisarComposicao")
}

Then I created the method for data.frames, which, as shown, is a very simple search function. This works very well.

pesquisarComposicao.data.frame <- function(BASE, TERMO, CAMPO = DESCRICAO) {

  BASE |>
    filter(grepl(pattern = {{TERMO}}, x = {{CAMPO}}))
}

But the data.table method below is not working (I was following the instructions here):

pesquisarComposicao.data.table <- function(BASE, TERMO,
                                           CAMPO = "DESCRICAO DA COMPOSICAO") {

  # filter_col <- NULL
  # filter_val <- NULL

  BASE[filter_col %like% filter_val,
     env = list(
       filter_col = CAMPO,
       filter_val = I(TERMO)
     ), verbose = FALSE]
}

I tried setting filter_col and filter_val to NULL (the commented-out lines above) in order to avoid the following error:

Error in pesquisarComposicao.data.table(BASE = sinteticoSINAPI, TERMO = "PEITORIL") : object 'filter_col' not found

But then I obtained another error:

Error in grepl(pattern, vector, ignore.case = ignore.case, fixed = fixed, : invalid 'pattern' argument

I find this odd, because before I used methods I had a single function meant to work with both data.tables and plain data.frames (it checked internally which type the object was), and this same piece of code worked fine there (without needing filter_col = NULL and filter_val = NULL). Why did it work in a single function but not as a method?


Solution

  • I think

    library(data.table)
    pesquisarComposicao <- function(BASE, ...) {
      UseMethod('pesquisarComposicao')
    }
    pesquisarComposicao.data.table  <- function(BASE, TERMO, CAMPO) {
      BASE[grepl(TERMO, BASE[[CAMPO]])] # non-lazy 
      
      # BASE[get(CAMPO) %like% TERMO] # get
      # BASE[grepl(TERMO, get(CAMPO))] # get
      
      # as of data.table 1.15.0 there is the env argument (backed by substitute2):
      # BASE[grepl(pattern, x), env = I(list(pattern=TERMO, x=as.name(CAMPO)))]
      # BASE[grepl(pattern, x), env = list(pattern=I(TERMO), x=CAMPO)]
      # the function itself (`grepl`, or %like%, which is like()) could have been passed through env as well.
      # the docs are hard to read.
    }
    

    does what you are after.

    > pesquisarComposicao(BASE=dt, TERMO='DESCRICAO DA COMPOSICAO', CAMPO='DESCRICAO')
           ID                        DESCRICAO
       <char>                           <char>
    1:      A DESCRICAO DA COMPOSICAO DE PRODU
    2:      C DESCRICAO DA COMPOSICAO COMPLETA
    

    Sample Data

    as the OP doesn't provide any.

    dt = data.table::data.table(
      ID = LETTERS[1:4],
      DESCRICAO = c('DESCRICAO DA COMPOSICAO DE PRODU', 'INGREDIENTES NATURAIS',
                    'DESCRICAO DA COMPOSICAO COMPLETA', 'OUTRA COISA')
    )
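
For reference, here is a minimal sketch of the generic dispatching to both methods side by side, assuming data.table is installed. The base-R data.frame method is only a stand-in for the dplyr version in the question, so CAMPO is a column name given as a string here rather than a bare name. The default CAMPO = "DESCRICAO" and the objects df_demo and dt_demo are illustrative assumptions, not part of the original code.

```r
library(data.table)

# Generic: UseMethod() picks a method based on class(BASE)
pesquisarComposicao <- function(BASE, ...) {
  UseMethod("pesquisarComposicao")
}

# data.frame method: base-R stand-in (assumption) for the dplyr version
pesquisarComposicao.data.frame <- function(BASE, TERMO, CAMPO = "DESCRICAO", ...) {
  BASE[grepl(TERMO, BASE[[CAMPO]]), , drop = FALSE]
}

# data.table method: non-lazy column lookup with [[ ]], as in the answer
pesquisarComposicao.data.table <- function(BASE, TERMO, CAMPO = "DESCRICAO", ...) {
  BASE[grepl(TERMO, BASE[[CAMPO]])]
}

# Illustrative data (hypothetical names df_demo / dt_demo)
df_demo <- data.frame(
  ID = LETTERS[1:4],
  DESCRICAO = c("DESCRICAO DA COMPOSICAO DE PRODU", "INGREDIENTES NATURAIS",
                "DESCRICAO DA COMPOSICAO COMPLETA", "OUTRA COISA"),
  stringsAsFactors = FALSE
)
dt_demo <- as.data.table(df_demo)

# class(dt_demo) is c("data.table", "data.frame"), so the data.table
# method is tried first; df_demo falls through to the data.frame method
res_df <- pesquisarComposicao(df_demo, TERMO = "COMPOSICAO")
res_dt <- pesquisarComposicao(dt_demo, TERMO = "COMPOSICAO")
```

Both calls should return the rows with ID "A" and "C"; the data.table result stays a data.table, while the data.frame result stays a plain data.frame.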