How to generate a five-number summary using describeBy()


How can I get the five-number summary statistics that summary() provides out of the describeBy() function? After calling library(psych), describeBy() reports the min, max and median, but not the 25th and 75th percentiles (the first and third quartiles).

library(psych)
attach(mtcars)
describeBy(mpg, gear)

var  n  mean   sd median trimmed  mad  min  max range  skew kurtosis   se 

Appreciate your help in advance.


Solution

  • There is also a built-in function specifically for the five numbers, unsurprisingly called fivenum():

    aggregate(mpg ~ gear, data=mtcars, fivenum)
      gear mpg.1 mpg.2 mpg.3 mpg.4 mpg.5
    1    3 10.40 14.50 15.50 18.40 21.50
    2    4 17.80 21.00 22.80 28.85 33.90
    3    5 15.00 15.80 19.70 26.00 30.40
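
    One caveat, in case the goal is to match summary() exactly: fivenum() reports Tukey's hinges, which are not always the same as the 25th and 75th percentiles. A minimal sketch of the alternative with quantile() follows. The last part is optional and rests on an assumption: that psych is installed and that your version of describe() accepts a quant argument, which describeBy() passes through.

    ```r
    # Base R only: min, quartiles and max per gear group, using quantile()'s
    # default method (type 7), which is what summary() uses for numeric data.
    q <- aggregate(mpg ~ gear, data = mtcars,
                   FUN = function(x) quantile(x, probs = c(0, 0.25, 0.5, 0.75, 1)))
    print(q)

    # Hinges and quartiles can differ, e.g. for the cars with 4 gears:
    g4 <- mtcars$mpg[mtcars$gear == 4]
    fivenum(g4)[4]                       # upper hinge: 28.85
    quantile(g4, 0.75, names = FALSE)    # 75th percentile: 28.075

    # Optional, ASSUMING psych is installed and describe() supports `quant`:
    if (requireNamespace("psych", quietly = TRUE)) {
      print(psych::describeBy(mtcars$mpg, mtcars$gear, quant = c(0.25, 0.75)))
    }
    ```

    Which of the two you want depends on whether you need Tukey's definition or the percentiles that summary() prints; for large groups the difference is usually small.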
    

    EDIT: To answer the follow-up question in the comments (as I interpret it), you can use . on the left-hand side of the formula to stand for all other columns:

    aggregate(.~gear, data=mtcars, fivenum)
    #too wide to print here
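
    The result is wide because aggregate() stores each variable's five numbers as a matrix column. If that is awkward to work with, one possible way to flatten it into ordinary columns is sketched below (the names res and flat are just placeholders):

    ```r
    res <- aggregate(. ~ gear, data = mtcars, FUN = fivenum)
    dim(res)    # 3 rows, 11 columns; every column except gear is a 3 x 5 matrix

    # Expand the matrix columns into plain columns named mpg.1 ... mpg.5, etc.
    flat <- do.call(data.frame, res)
    dim(flat)   # 3 rows, 51 columns: gear + 10 variables x 5 statistics

    # Narrow enough to print: min, median and max of mpg per gear
    flat[, c("gear", "mpg.1", "mpg.3", "mpg.5")]
    ```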
    

    Or, if you just want the five-number summary for every column without splitting by gear:

    apply(mtcars, 2, fivenum)
           mpg cyl   disp  hp  drat     wt   qsec vs am gear carb
    [1,] 10.40   4  71.10  52 2.760 1.5130 14.500  0  0    3    1
    [2,] 15.35   4 120.65  96 3.080 2.5425 16.885  0  0    3    2
    [3,] 19.20   6 196.30 123 3.695 3.3250 17.710  0  0    4    2
    [4,] 22.80   8 334.00 180 3.920 3.6500 18.900  1  1    4    4
    [5,] 33.90   8 472.00 335 4.930 5.4240 22.900  1  1    5    8
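
    The rows above are unlabeled. As a small variation, sapply() gives the same numbers without first coercing the data frame to a matrix, and the rows can then be named. The row labels here are my own choice, not something fivenum() returns:

    ```r
    five <- sapply(mtcars, fivenum)   # 5 x 11 matrix, one column per variable

    # Hypothetical labels for readability; fivenum() itself returns no names
    rownames(five) <- c("min", "lower_hinge", "median", "upper_hinge", "max")

    five[, c("mpg", "hp", "wt")]      # pick a few columns
    t(five)                           # or transpose: one row per variable
    ```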