I've got a data frame that looks something like this:
precinct, race, age, people
1001, black, 18-40, 1
1001, white, 18-40, 2
1001, hispanic, 18-40, 3
1001, asian, 18-40, 4
1001, black, 40 or older, 5
1001, white, 40 or older, 6
1001, hispanic, 40 or older, 7
1001, asian, 40 or older, 8
I want to make it look like this:
precinct, black, white, hispanic, asian, 18-40, 40 or older
1001, 6, 8, 10, 12, 10, 26
I've used dcast
dcast(
data = mydataframe,
formula = Precinct ~ race + age,
fun.aggregate = sum,
value.var = 'people'
)
but this does not produce my desired result.
When we create formula with +
on the rhs of ~
it creates the combinations between those columns instead of having every single unique element from those columns. In order to have the latter, we may need to melt
to long format and then use dcast
on the single column (assuming those columns are of the same type)
library(data.table)
dcast(melt(setDT(mydataframe), id.var = c('precinct', 'people')),
precinct ~ value, fun.aggregate = sum, value.var = 'people')
-output
Key: <precinct>
precinct 18-40 40 or older asian black hispanic white
<int> <int> <int> <int> <int> <int> <int>
1: 1001 10 26 12 6 10 8
library(dplyr)
library(tidyr)
mydataframe %>%
pivot_longer(cols = c(race, age), names_to = NULL) %>%
pivot_wider(names_from = value, values_from = people, values_fn = sum)
-output
# A tibble: 1 × 7
precinct black `18-40` white hispanic asian `40 or older`
<int> <int> <int> <int> <int> <int> <int>
1 1001 6 10 8 10 12 26
mydataframe <- structure(list(precinct = c(1001L, 1001L, 1001L, 1001L, 1001L,
1001L, 1001L, 1001L), race = c("black", "white", "hispanic",
"asian", "black", "white", "hispanic", "asian"), age = c("18-40",
"18-40", "18-40", "18-40", "40 or older", "40 or older", "40 or older",
"40 or older"), people = 1:8), row.names = c(NA, -8L),
class = "data.frame")