
Why do summary() and min() report different minimums in R?


The data set contains risk values (probabilities), all of which are very small. Calling the summary function in R on it gives the following:

> summary(prob_ann)

##      Min.   1st Qu.    Median      Mean   3rd Qu.      Max. 
## 0.000e+00 1.000e-16 1.034e-13 3.959e-12 7.880e-13 8.222e-10

However, a query for the actual minimum yields the correct value:

> min(prob_ann)

## [1] 1.199446e-35

My question is this: why does summary use scientific notation, yet still report a true zero instead of the correct value of 1.199e-35?


Update #1

Despite there being more than enough information to debug this question (as the user who answered it demonstrated), it was closed for not having enough information to reproduce the problem. That justification is curious, given that the accepted answer proved otherwise, which raises the question: why was this question closed?

In any case, here is the requested code:

 set.seed(123)
 prob_ann <- c(1.199446e-35, runif(100, 3.33e-15, 9.99e-10))
 summary(prob_ann)
 min(prob_ann)
 quantile(prob_ann, probs = c(0, 1))

Solution

  • It's not a true zero value. The minimum printed by summary differs from the actual minimum because of the class of the object that summary returns, which determines how that object is printed.

    set.seed(123)
    prob_ann <- c(1.199446e-35, runif(100, 0, 8.222e-10))
    
    res <- summary(prob_ann); res
         Min.   1st Qu.    Median      Mean   3rd Qu.      Max. 
    0.000e+00 2.003e-10 3.831e-10 4.059e-10 6.203e-10 8.175e-10
    
    min(prob_ann)
    [1] 1.199446e-35
    
    class(res)
    [1] "summaryDefault" "table"
    

    The second-to-last line of the summary.default function is:

    class(value) <- c("summaryDefault", "table")
    

    The first element of that class vector, "summaryDefault", means the result is printed by the print.summaryDefault method, which changes the formatting of the output:

    function (x, digits = max(3L, getOption("digits") - 3L), ...) 
    {
        xx <- x
        if (is.numeric(x) || is.complex(x)) {
            finite <- is.finite(x)
            xx[finite] <- zapsmall(x[finite])
        }
    ...
        print.table(xx, digits = digits, ...)
        invisible(x)
    }
    

    Thus, only the printed output is rounded, by zapsmall; the stored values are unchanged. The help page for zapsmall explains the rounding:

    ?zapsmall
    

    zapsmall determines a digits argument dr for calling round(x, digits = dr) such that values close to zero (compared with the maximal absolute value in the vector) are ‘zapped’, i.e., replaced by 0.
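
    The rounding described above can be reproduced by hand. The following is a minimal sketch, assuming the default of 7 significant digits and using a two-element illustrative vector (the tiny minimum and the maximum from the question) rather than the real data:

    ```r
    # Illustrative values only: the tiny minimum and the maximum from the question.
    x <- c(1.199446e-35, 8.222e-10)

    # The largest magnitude is about 1e-9, so with digits = 7 zapsmall rounds to
    # 7 - ceiling(log10(8.222e-10)) = 7 - (-9) = 16 decimal places.
    dr <- 7 - ceiling(log10(max(abs(x))))
    zapped <- zapsmall(x, digits = 7)

    dr              # 16
    zapped[1] == 0  # TRUE: 1.2e-35 rounds to exactly 0 at 16 decimal places
    x[1] > 0        # TRUE: the original value itself is untouched
    ```

    Rounding to 16 decimal places would also account for the 1st Qu. value of 1.000e-16 in the question's output.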

    If you want to see the unformatted output, you can use unclass:

    unclass(res)
            Min.      1st Qu.       Median         Mean      3rd Qu.         Max. 
    1.199446e-35 2.066278e-10 4.407913e-10 4.176800e-10 6.351261e-10 8.195258e-10
    

    or use print.table instead:

    print.table(res)
            Min.      1st Qu.       Median         Mean      3rd Qu.         Max. 
    1.199446e-35 2.066278e-10 4.407913e-10 4.176800e-10 6.351261e-10 8.195258e-10
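
    If the exact minimum is needed in code rather than on screen, it can be pulled out of the summary object directly. This is a small sketch, assuming a recent R version in which summary() stores the unrounded quantiles (as the unclass output above suggests); the name stored_min is only illustrative:

    ```r
    set.seed(123)
    prob_ann <- c(1.199446e-35, runif(100, 0, 8.222e-10))
    res <- summary(prob_ann)

    # Drop the "summaryDefault" class, then extract the element by name.
    # Only print() zaps small values; the stored number is not rounded to zero.
    stored_min <- unclass(res)[["Min."]]
    stored_min > 0                 # TRUE
    stored_min == min(prob_ann)    # expected TRUE under the assumption above

    # quantile() returns a plain named numeric vector, so nothing is zapped.
    q_min <- quantile(prob_ann, probs = 0)[["0%"]]
    q_min == min(prob_ann)
    ```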