r statistics package summarytools

Odd differences between the base R summary() and summarytools descr() function results


I have a vector of numeric data (sample below). Let's store the vector as x. When I run summary(x) and descr(x), where descr() is from the summarytools package, I have agreement on the Min, Median, Mean, and Max values. However, my 1st & 3rd quartile values differ. This is the first time I've seen this discrepancy between the two function results. Any thoughts as to why and how this happens?

I started exploring the descr() source code but haven't gotten far, nor have I been able to access the summary() source to see whether the difference lies there. However, looking at some of the cumulative percentages, I think the two functions might calculate the quantiles differently.

x = c(1132.1, 731.1, 851.2, 704.0, 226.3, 1703.6, 853.6, 821.4, 1192.9, 814.2, 880.2, 1270.8, 784.2, 606.5, 702.8, 863.6, 419.2, 1486.9, 1325.8, 493.2, 847.7, 552.5, 709.3, 508.3, 400.0, 711.4, 1161.5, 778.4, 626.2, 365.0, 329.1, 457.7, 446.2, 564.1, 376.9, 463.3, 239.7, 250.9, 266.5, 298.2, 186.2, 79.0, 149.9, 178.7, 79.4, 91.8, 12.6)
install.packages("summarytools")
library(summarytools)
descr(x)
summary(x)

With descr(), Q1 = 298.20 and Q3 = 847.70. With summary(), Q1 = 313.6 and Q3 = 834.5.

When I run freq(x) and look at the cumulative percentage, 298.2 is at 25.53%, 821.4 is at 74.47%, and 847.7 is at 76.6%. So it looks like descr() might be listing the x vector's values that are closest to but not under the 1st & 3rd quartile.
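The cumulative percentages quoted above can also be checked without summarytools. This is a minimal sketch using base R's ecdf(); it assumes the values in x are all distinct, as they are in this sample, so each value's cumulative percentage is simply its rank divided by the sample size.

```r
# Sample vector from the question
x <- c(1132.1, 731.1, 851.2, 704.0, 226.3, 1703.6, 853.6, 821.4, 1192.9,
       814.2, 880.2, 1270.8, 784.2, 606.5, 702.8, 863.6, 419.2, 1486.9,
       1325.8, 493.2, 847.7, 552.5, 709.3, 508.3, 400.0, 711.4, 1161.5,
       778.4, 626.2, 365.0, 329.1, 457.7, 446.2, 564.1, 376.9, 463.3,
       239.7, 250.9, 266.5, 298.2, 186.2, 79.0, 149.9, 178.7, 79.4,
       91.8, 12.6)

# ecdf() returns a function giving the proportion of values <= its argument
Fn <- ecdf(x)

# Cumulative percentage at the three values discussed above
cum_pct <- round(100 * Fn(c(298.2, 821.4, 847.7)), 2)
cum_pct
# [1] 25.53 74.47 76.60
```

No observation sits exactly at 25% or 75%, so any quartile estimate has to either pick a neighbouring value or interpolate between two of them.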

(821.4 + 847.7) / 2 = 834.55
This matches the summary() 3rd quartile (printed as 834.5), which is not a value in the vector but is closer to the estimated cumulative 75%. I'm still not sure how summary() obtains 313.6 for the 1st quartile.


Solution

  • Look at the help page ?quantile. There are several ways of calculating quantiles: descr() uses type = 2, while summary() uses the default, type = 7:

    > quantile(x, type = 2)
        0%    25%    50%    75%   100% 
      12.6  298.2  564.1  847.7 1703.6 
    > quantile(x, type = 7)
         0%     25%     50%     75%    100% 
      12.60  313.65  564.10  834.55 1703.60
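To see where the two sets of numbers come from, the two definitions can be written out by hand. This is a rough sketch, not the code that quantile() itself runs: the helper names q_type7 and q_type2 are made up for illustration, the helpers only handle 0 < p < 1, and they skip the small floating-point tolerance that quantile() applies.

```r
# Sample vector from the question
x <- c(1132.1, 731.1, 851.2, 704.0, 226.3, 1703.6, 853.6, 821.4, 1192.9,
       814.2, 880.2, 1270.8, 784.2, 606.5, 702.8, 863.6, 419.2, 1486.9,
       1325.8, 493.2, 847.7, 552.5, 709.3, 508.3, 400.0, 711.4, 1161.5,
       778.4, 626.2, 365.0, 329.1, 457.7, 446.2, 564.1, 376.9, 463.3,
       239.7, 250.9, 266.5, 298.2, 186.2, 79.0, 149.9, 178.7, 79.4,
       91.8, 12.6)

xs <- sort(x)
n  <- length(xs)  # 47

# Type 7 (hypothetical helper): linear interpolation between order
# statistics at position h = (n - 1) * p + 1
q_type7 <- function(p) {
  h  <- (n - 1) * p + 1
  lo <- floor(h)
  hi <- ceiling(h)
  xs[lo] + (h - lo) * (xs[hi] - xs[lo])
}

# Type 2 (hypothetical helper): inverse of the empirical CDF; take the next
# order statistic when n * p is not an integer, otherwise average two neighbours
q_type2 <- function(p) {
  np <- n * p
  j  <- floor(np)
  if (np > j) xs[j + 1] else (xs[j] + xs[j + 1]) / 2
}

q_type7(0.25)  # 313.65, halfway between xs[12] = 298.2 and xs[13] = 329.1
q_type7(0.75)  # 834.55, halfway between xs[35] = 821.4 and xs[36] = 847.7
q_type2(0.25)  # 298.2,  n * p = 11.75, so xs[12]
q_type2(0.75)  # 847.7,  n * p = 35.25, so xs[36]
```

With n = 47, type 7 lands at positions 12.5 and 35.5, so both quartiles are midpoints of adjacent sorted values, which is how summary() arrives at 313.65 (printed as 313.6) for the 1st quartile. Type 2 returns an actual observation here because 47 * 0.25 and 47 * 0.75 are not whole numbers.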