r quantile quantreg

R quantreg model does not reproduce quantiles: Why?


I am using the quantreg package to predict quantiles and their confidence intervals. I can't understand why the predicted quantiles are different from the quantiles calculated directly from the data using quantile().

library(tidyverse)
library(quantreg)

data <- tibble(data=runif(10)*10)
qr1 <- rq(formula=data ~ 1, tau=0.9, data=data) #  quantile regression
yqr1<- predict(qr1, newdata=tibble(data=c(1)), interval='confidence', level=0.95, se='boot') # predict quantile
q90 <- quantile(data$data, 0.9) # quantile of sample

> yqr1
       fit    lower   higher
1 6.999223 3.815588 10.18286
> q90
     90% 
7.270891

Solution

  • You should realize that predicting the 90th percentile for a dataset with only 10 items is really based solely on the two highest values. You should also review the help page for quantile (?quantile), where you will find multiple definitions of the term.

    When I run this, I see:

     yqr1<- predict(qr1, newdata=tibble(data=c(1)) ) 
     yqr1
           1 
    8.525812 
    

    And when I look at the data I see:

    data
    # A tibble: 10 x 1
             data
            <dbl>
     1 8.52581158
     2 7.73959380
     3 4.53000680
     4 0.03431813
     5 2.13842058
     6 5.60713159
     7 6.17525537
     8 8.76262959
     9 5.30750304
    10 4.61817190
    

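    So the rq function is estimating the second highest value as the 90th percentile, which seems perfectly reasonable. The quantile result is not estimated that way: by default (type = 7) it interpolates linearly between the two highest values, here 8.52581158 + 0.1 * (8.76262959 - 8.52581158) = 8.549493: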

    quantile(data$data, .9)
    #     90% 
    #8.549493 
    ?quantile
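
    To make the difference concrete, here is a minimal base-R sketch using the ten values printed above. It reproduces the default quantile() result by hand, compares it with type = 1, and evaluates the check loss that quantile regression minimizes. The names xs, h, lo and check_loss are my own, chosen for illustration, and the sketch does not call rq itself.

    ```r
    # Data copied from the tibble printed above
    x <- c(8.52581158, 7.73959380, 4.53000680, 0.03431813, 2.13842058,
           5.60713159, 6.17525537, 8.76262959, 5.30750304, 4.61817190)
    xs  <- sort(x)
    n   <- length(xs)
    tau <- 0.9

    # Default quantile() definition (type 7): linear interpolation at
    # position h = (n - 1) * tau + 1 in the sorted sample (here h = 9.1).
    # Assumes h < n, which holds for this example.
    h  <- (n - 1) * tau + 1
    lo <- floor(h)
    q_type7_by_hand <- xs[lo] + (h - lo) * (xs[lo + 1] - xs[lo])

    q_type7 <- quantile(x, tau, names = FALSE)            # default, type = 7
    q_type1 <- quantile(x, tau, type = 1, names = FALSE)  # inverse empirical CDF

    # Check (pinball) loss for a constant fit q at level tau
    check_loss <- function(q) sum((x - q) * (tau - (x < q)))

    print(c(type7 = q_type7, by_hand = q_type7_by_hand, type1 = q_type1))
    print(c(loss_at_type1 = check_loss(q_type1),
            loss_at_type7 = check_loss(q_type7)))
    ```

    The two losses should agree up to rounding: with n * tau = 9 an integer, every value between the 9th and 10th sorted observations minimizes the check loss, so both answers are valid 90th percentiles. The rq fit shown above lands on the 9th sorted observation, which is the same value that type = 1 returns for this sample.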