Tags: r, metrics, roc, glmnet, auc

R Metrics auc() error message


I'm trying to calculate AUC but have run into a strange problem. When I run this script:

rm(list = ls(all = T))
gc()

library(Metrics)
library(glmnet)

nrows <- 92681
set.seed(456)
df1 <- data.frame(act1 = round(runif(nrows), 0), pred1 = runif(nrows))

Metrics::auc(df1$act1, df1$pred1)
glmnet::auc(df1$act1, df1$pred1)

I get:

> Metrics::auc(df1$act1, df1$pred1)
[1] 0.4930949
> glmnet::auc(df1$act1, df1$pred1)
[1] 0.4930949

When I add one more row and run this:

rm(list = ls(all = T))
gc()

library(Metrics)
library(glmnet)

nrows <- 92682
set.seed(456)
df1 <- data.frame(act1 = round(runif(nrows), 0), pred1 = runif(nrows))

Metrics::auc(df1$act1, df1$pred1)
glmnet::auc(df1$act1, df1$pred1)

I get:

> Metrics::auc(df1$act1, df1$pred1)
[1] NA
Warning message:
In n_pos * n_neg : NAs produced by integer overflow
> glmnet::auc(df1$act1, df1$pred1)
[1] 0.5011554

Any idea what's going on here?


Solution

  • Metrics::auc uses a formula with (n_pos * n_neg) in the denominator, which here is 'sum(actual == 1) * sum(actual == 0)'. Both sums are integers, and their product, 46308 * 46374 = 2147487192, exceeds the largest value R's integer type can hold (.Machine$integer.max, which is 2147483647). The multiplication therefore overflows and returns NA. With one row fewer the product still fits, which is why the first script works.

    For example:

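    # Numeric literals are doubles, so this product does not overflow
    46308 * 46374
    #> [1] 2147487192

    # As integers, the same product overflows
    as.integer(46308) * as.integer(46374)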
    #> [1] NA
    #> Warning message:
    #> In as.integer(46308) * as.integer(46374) : NAs produced by integer overflow
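
One possible workaround, as a minimal sketch: compute the rank-based AUC yourself and convert the class counts to double before multiplying. Note that `auc_safe` is a hypothetical helper written for this answer, not a function from Metrics or glmnet, and it assumes `actual` is coded as 0/1 with no missing values.

```r
# Hypothetical helper (not part of Metrics or glmnet): rank-based AUC
# with the class counts held as doubles to avoid integer overflow.
auc_safe <- function(actual, predicted) {
  stopifnot(length(actual) == length(predicted))
  r <- rank(predicted)                   # ties get average ranks
  n_pos <- as.numeric(sum(actual == 1))  # sum() of a logical is an integer; convert it
  n_neg <- as.numeric(sum(actual == 0))
  (sum(r[actual == 1]) - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)
}

# Same class counts as in the failing example above
set.seed(456)
actual <- rep(c(1, 0), times = c(46308, 46374))
predicted <- runif(length(actual))
auc_safe(actual, predicted)
```

Because the counts are doubles, n_pos * n_neg is evaluated in floating point and no longer hits the integer limit, so the call returns a number instead of NA.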