Can I override `$` or `[[` to throw an error instead of NULL when asking for a missing list element?

My hunch is this is an abuse of the R language and there's a good reason this doesn't happen. But I find this to be a perpetual source of insidious errors in code that I'm trying to debug:

MWE

list.1 <- list(a=1,b=2,c=list(d=3))
list.2 <- list(b=4,c=list(d=6,e=7))
input.values <- list(list.1,list.2)
do.something.to.a.list <- function(a.list) {
    a.list$b <- a.list$c$d + a.list$a
    a.list
}
experiment.results <- lapply(input.values,do.something.to.a.list)

use.results.in.some.other.mission.critical.way <- function(result) {
    result <- result^2
    patient.would.survive.operation <- mean(c(-5,result)) >= -5
    if(patient.would.survive.operation) {
        print("Congrats, the patient would survive! Good job developing a safe procedure.")
    } else {
        print("Sorry, the patient won't make it.")
    }
}

lapply(experiment.results, function(x) use.results.in.some.other.mission.critical.way(x$b))
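
To make the silent failure concrete, the second input can be traced by hand. This is only an illustrative sketch; `broken`, `missing_a`, `total`, `squared` and `survives` are names introduced here, not part of the code above.

```r
# Trace the second input (which has no element `a`) through the MWE by hand.
broken <- list(b = 4, c = list(d = 6, e = 7))

missing_a <- broken$a             # NULL: `$` does not complain about a missing name
total <- broken$c$d + missing_a   # arithmetic with NULL yields a zero-length vector
squared <- total^2                # still zero-length

# c(-5, <zero-length>) is just -5, so the mean is -5 and the check passes vacuously.
survives <- mean(c(-5, squared)) >= -5
print(survives)
```

No error or warning is raised at any step, which is why the bug only shows up as a wrong answer much later.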

YES I am aware this is a stupid example and that I could just add a check for the existence of the element before trying to access it. But I'm not asking to know what I could do, if I had perfect memory and awareness at all times, to work slowly around the fact that this feature is inconvenient and causes me lots of headache. I'm trying to avoid the headache altogether, perhaps at the cost of code speed.

So: what I want to know is...

(a) Is it possible to do this? My initial attempt failed, and I got stuck trying to read the C internals for "$" to understand how to handle the arguments correctly.

(b) If so, is there a good reason not to (or to) do this?

Basically, my idea is that instead of writing every single function that depends on non-null return from list access to check really carefully, I can write just one function to check carefully and trust that the rest of the functions won't get called with unmet preconditions b/c the failed list access will fail-fast.

Solution

You can override almost anything in R (except certain special values: NULL, NA, NA_integer_, NA_real_, NA_complex_, NA_character_, NaN, Inf, TRUE, FALSE, as far as I'm aware).
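
A quick way to see the difference, as a rough sketch: the shorthand `T` is an ordinary binding and can be reassigned, while `TRUE` is a reserved constant. The `parse()`/`eval()` wrapper and the string "assignment rejected" are just devices chosen here to capture the failure without stopping the script.

```r
# Ordinary bindings, including the shorthand T, can be reassigned.
local({
  T <- FALSE
  stopifnot(identical(T, FALSE))
})

# Reserved constants cannot. The assignment is built with parse() and run
# with eval() so the failure can be caught by tryCatch().
result <- tryCatch(
  eval(parse(text = "TRUE <- FALSE")),
  error = function(e) "assignment rejected"
)
print(result)
```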

For your specific case, you could do this:

`$` <- function(x, i) {
  if (is.list(x)) {
    i_ <- deparse(substitute(i))
    x_ <- deparse(substitute(x))
    if (i_ %in% names(x)) {
      eval(substitute(base::`$`(x, i)), envir = parent.frame())
    } else {
      stop(sprintf("\"%s\" not found in `%s`", i_, x_))
    }
  } else {
    eval(substitute(base::`$`(x, i)), envir = parent.frame())
  }
}

`[[` <- function(x, i) {
  if (is.list(x) && is.character(i)) {
    x_ <- deparse(substitute(x))
    if (i %in% names(x)) {
      base::`[[`(x, i)
    } else {
      stop(sprintf("\"%s\" not found in `%s`", i, x_))
    }
  } else {
    base::`[[`(x, i)
  }
}

Example:

x <- list(a = 1, b = 2)
x$a
#[1] 1
x$c
#Error in x$c : "c" not found in `x`
col1 <- "b"
col2 <- "d"
x[[col1]]
#[1] 2
x[[col2]]
#Error in x[[col2]] : "d" not found in `x`

It will slow your code down quite a bit:

microbenchmark::microbenchmark(x$a, base::`$`(x, a), times = 1e4)
#Unit: microseconds
#            expr    min     lq     mean median      uq      max neval
#             x$a 77.152 81.398 90.25542 82.814 85.2915 7161.956 10000
# base::`$`(x, a)  9.910 11.326 12.89522 12.033 12.3880 4042.646 10000

I've limited this to lists (which includes data.frames). The [[ version only checks character indices; numeric indices are passed straight through to base [[. This may not fully cover the ways in which $ and [[ can be used.
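
One concrete gap is that base [[ accepts more than two arguments, for example a row and a column index on a matrix, which a replacement with the signature `function(x, i)` rejects. A hedged sketch of a variant that forwards extra arguments through `...` follows; `strict_get` is a hypothetical name used here so that [[ itself is not masked, and it only guards the case of a list indexed by a single name.

```r
# Hypothetical helper: strict lookup that forwards extra arguments to base `[[`.
strict_get <- function(x, i, ...) {
  # Only guard the simple case: a list indexed by a single character name.
  if (is.list(x) && is.character(i) && length(i) == 1L && !(i %in% names(x))) {
    stop(sprintf("\"%s\" not found", i), call. = FALSE)
  }
  base::`[[`(x, i, ...)
}

x <- list(a = 1, b = 2)
m <- matrix(1:4, nrow = 2)

by_name  <- strict_get(x, "a")   # name present: returns the element
by_index <- strict_get(x, 2)     # numeric index passes through unchecked
cell     <- strict_get(m, 1, 2)  # row/column indexing still works via `...`

# A missing name fails fast instead of returning NULL.
msg <- tryCatch(strict_get(x, "zzz"), error = function(e) conditionMessage(e))
print(msg)
```

Using a separately named helper rather than masking [[ also avoids the slowdown for code that does not need the check.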

Note for [[ you could use @rawr's simpler code:

`[[` <- function(x, i) if (is.null(res <- base::`[[`(x, i))) stop('NULL') else res

but this will also throw an error for a list member that is defined but NULL, not only for one that is missing, e.g.

x <- list(a = NULL, b = 2)
x[["a"]]

This may of course be what is desired.
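
The difference between the two checks can be put side by side. This is a sketch only: `by_value` and `by_name` are hypothetical stand-ins for the value-based and name-based versions of [[, written as ordinary functions so that nothing is masked.

```r
x <- list(a = NULL, b = 2)

# Value-based check: errors whenever the result is NULL,
# even though "a" is a defined element of x.
by_value <- function(x, i) {
  res <- base::`[[`(x, i)
  if (is.null(res)) stop("NULL") else res
}

# Name-based check: errors only when the name is absent.
by_name <- function(x, i) {
  if (!(i %in% names(x))) stop(sprintf("\"%s\" not found", i))
  base::`[[`(x, i)
}

value_outcome <- tryCatch(by_value(x, "a"), error = function(e) "error")
name_outcome  <- tryCatch(by_name(x, "a"), error = function(e) "error")

print(value_outcome)          # "error": a NULL member is rejected
print(is.null(name_outcome))  # TRUE: a NULL member is returned as-is
```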