I'm a new user of expss
package. I have a question about how to add a column to show difference.
For example I have an expss
result as below. I would like to add a column to show the difference between V-engine
and Straight engine
(V-engine - Straight engine
)
Code:
library(expss)
data(mtcars)
mtcars = apply_labels(mtcars,
mpg = "Miles/(US) gallon",
cyl = "Number of cylinders",
disp = "Displacement (cu.in.)",
hp = "Gross horsepower",
drat = "Rear axle ratio",
wt = "Weight (1000 lbs)",
qsec = "1/4 mile time",
vs = "Engine",
vs = c("V-engine" = 0,
"Straight engine" = 1),
am = "Transmission",
am = c("Automatic" = 0,
"Manual"=1),
gear = "Number of forward gears",
carb = "Number of carburetors"
)
mtcars %>%
tab_cells(cyl, disp) %>%
tab_cols(vs) %>%
tab_stat_mean() %>%
tab_pivot()
| | | Engine | |
| | | V-engine | Straight engine |
| --------------------- | ---- | -------- | --------------- |
| Number of cylinders | Mean | 7.4 | 4.6 |
| Displacement (cu.in.) | Mean | 307.1 | 132.5 |
>
Many thanks
The output is a data.frame
with etable
class, thus we can create the column using any method that can work with data.frame
library(dplyr)
out <- mtcars %>%
tab_cells(cyl, disp) %>%
tab_cols(vs) %>%
tab_stat_mean() %>%
tab_pivot() %>%
mutate(diff = `vs|0`- `vs|1`)
-output
> out
| | | vs | | diff |
| | | 0 | 1 | |
| ---- | ---- | ----- | ----- | ----- |
| cyl | Mean | 7.4 | 4.6 | 2.9 |
| disp | Mean | 307.1 | 132.5 | 174.7 |
| | | | | |
| | | | | |
> str(out)
Classes ‘etable’ and 'data.frame': 4 obs. of 4 variables:
$ row_labels: chr "cyl|Mean" "disp|Mean" NA NA
$ vs|0 : num 7.44 307.15 NA NA
$ vs|1 : num 4.57 132.46 NA NA
$ diff : num 2.87 174.69 NA NA
With the table output showed, it would be
mtcars %>%
tab_cells(cyl, disp) %>%
tab_cols(vs) %>%
tab_stat_mean() %>%
tab_pivot() %>% mutate(diff = `Engine|V-engine` - `Engine|Straight engine`)
| | | Engine | | diff |
| | | V-engine | Straight engine | |
| --------------------- | ---- | -------- | --------------- | ----- |
| Number of cylinders | Mean | 7.4 | 4.6 | 2.9 |
| Displacement (cu.in.) | Mean | 307.1 | 132.5 | 174.7 |
| | | | | |
| | | | | |
If there are multiple groups of columns, then either we can use across
or use map
to apply the condition
library(stringr)
library(purrr)
out1 <- mtcars %>%
tab_cells(cyl, disp) %>%
tab_cols(vs, am) %>%
tab_stat_mean() %>%
tab_pivot()
out2 <- bind_cols(out1, map_dfc(split.default(as.data.frame(out1[,-1]),
str_replace(names(out1)[-1], "\\|.*", "|diff")), reduce, `-`))
-output
> out2
| | | Engine | | Transmission | | Engine | Transmission |
| | | V-engine | Straight engine | Automatic | Manual | diff | diff |
| --------------------- | ---- | -------- | --------------- | ------------ | ------ | ------ | ------------ |
| Number of cylinders | Mean | 7.4 | 4.6 | 6.9 | 5.1 | 2.9 | 1.9 |
| Displacement (cu.in.) | Mean | 307.1 | 132.5 | 290.4 | 143.5 | 174.7 | 146.8 |
> str(out2)
Classes ‘etable’ and 'data.frame': 2 obs. of 7 variables:
$ row_labels : chr "Number of cylinders|Mean" "Displacement (cu.in.)|Mean"
$ Engine|V-engine : num 7.44 307.15
$ Engine|Straight engine: num 4.57 132.46
$ Transmission|Automatic: num 6.95 290.38
$ Transmission|Manual : num 5.08 143.53
$ Engine|diff : num 2.87 174.69
$ Transmission|diff : num 1.87 146.85