I usually calculate a short summary table this way:
library(data.table)
library(tidyverse)
mtcars.dt <- data.table(mtcars)
mtcars_grby <- mtcars.dt %>%
  # group by cyl
  group_by(cyl) %>%
  # summarise
  summarise(
    # number of observations
    quantity = n(),
    # average mpg
    mpg_avg = mean(mpg, na.rm = TRUE),
    # min and max weight
    wt_min = min(wt),
    wt_max = max(wt))
mtcars_grby
# A tibble: 3 x 5
    cyl quantity mpg_avg wt_min wt_max
  <dbl>    <int>   <dbl>  <dbl>  <dbl>
1     4       11    26.7   1.51   3.19
2     6        7    19.7   2.62   3.46
3     8       14    15.1   3.17   5.42
What is the shortest way to do this with data.table only? I can only imagine merging four separate aggregations, which is not really elegant.
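Use .(...) to build the summary columns, .N for the number of rows per group, and keyby= to group by cyl and sort the result, like this: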
library(data.table)
mtcars.dt <- data.table(mtcars)
mtcars.dt[, .(quantity = .N,
              mpg_avg = mean(mpg, na.rm = TRUE),
              wt_min = min(wt),
              wt_max = max(wt)),
          keyby = cyl]
giving
Key: <cyl>
     cyl quantity  mpg_avg wt_min wt_max
   <num>    <int>    <num>  <num>  <num>
1:     4       11 26.66364  1.513  3.190
2:     6        7 19.74286  2.620  3.460
3:     8       14 15.10000  3.170  5.424
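To see why keyby= is used here rather than by=, the following is a minimal sketch comparing the two on the same data. The names res_by and res_keyby are only illustrative. With by=, groups come back in order of first appearance and no key is set. With keyby=, the result is sorted by cyl and keyed on it, which matches the row order of the dplyr output in the question.

```r
library(data.table)

mtcars.dt <- data.table(mtcars)

# by= keeps groups in order of first appearance and sets no key
res_by <- mtcars.dt[, .(quantity = .N), by = cyl]

# keyby= sorts the result by the grouping column and sets it as the key
res_keyby <- mtcars.dt[, .(quantity = .N), keyby = cyl]

res_by$cyl       # 6 4 8 (order in which cyl values first occur in mtcars)
res_keyby$cyl    # 4 6 8 (sorted ascending)

key(res_by)      # NULL
key(res_keyby)   # "cyl"
```

If the row order and the key do not matter, by = cyl gives the same summary values.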