I have the following data frame:
df <- data.frame(
  NR_HH = c('HH1','HH1','HH1','HH1','HH2','HH2'),
  ID = c(11,12,13,14,21,22),
  Age = c(28,25,16,4,45,70),
  Fem_Adult = c('FALSE','TRUE','FALSE','FALSE','TRUE','TRUE'),
  Male_Adult = c('TRUE','FALSE','FALSE','FALSE','FALSE','FALSE'),
  School_Child = c('FALSE','FALSE','TRUE','FALSE','FALSE','FALSE'),
  Preschool_Child = c('FALSE','FALSE','FALSE','TRUE','FALSE','FALSE')
)
#  NR_HH ID Age Fem_Adult Male_Adult School_Child Preschool_Child
#1   HH1 11  28     FALSE       TRUE        FALSE           FALSE
#2   HH1 12  25      TRUE      FALSE        FALSE           FALSE
#3   HH1 13  16     FALSE      FALSE         TRUE           FALSE
#4   HH1 14   4     FALSE      FALSE        FALSE            TRUE
#5   HH2 21  45      TRUE      FALSE        FALSE           FALSE
#6   HH2 22  70      TRUE      FALSE        FALSE           FALSE
I want to group this data by NR_HH and build a new data frame that shows the number of female adults, male adults, school age children and preschool age children in each household. I want to get something like this:
#  NR_HH Fem_Adult Male_Adult School_Child Preschool_Child
#1   HH1         1          1            1               1
#2   HH2         2          0            0               0
I tried the following code:
df_summary = df %>% group_by(NR_HH) %>% summarise_if(is.logical, sum)
But I get this error:
Error: Can't create call to non-callable object
The issue is with the column types. Quoting 'TRUE'/'FALSE' creates character values, and because the data.frame call uses stringsAsFactors = TRUE by default (in R versions before 4.0.0), these columns end up with class factor. None of them is logical, so is.logical matches none of the columns passed to summarise_if. This could have been avoided by simply leaving TRUE/FALSE unquoted in the input. Assuming the input is already quoted, convert the columns to logical with as.logical and then take the sum after grouping by 'NR_HH':
library(dplyr)

df %>%
  mutate_at(4:7, as.logical) %>%  # columns 4-7 hold the quoted TRUE/FALSE indicators
  group_by(NR_HH) %>%
  summarise_if(is.logical, sum)
# A tibble: 2 x 5
#   NR_HH Fem_Adult Male_Adult School_Child Preschool_Child
#   <fct>     <int>      <int>        <int>           <int>
# 1 HH1           1          1            1               1
# 2 HH2           2          0            0               0
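For comparison, here is a minimal sketch of the "unquoted" route mentioned above: the indicator columns are created as real logicals, so no conversion step is needed. The name df2 is made up for this illustration, the ID and Age columns are left out for brevity, and the across()/where() form assumes dplyr 1.0.0 or later (summarise_if would work here as well).

```r
library(dplyr)

# Hypothetical variant of the question's data: TRUE/FALSE are NOT quoted,
# so the indicator columns are logical from the start.
df2 <- data.frame(
  NR_HH = c('HH1','HH1','HH1','HH1','HH2','HH2'),
  Fem_Adult = c(FALSE, TRUE, FALSE, FALSE, TRUE, TRUE),
  Male_Adult = c(TRUE, FALSE, FALSE, FALSE, FALSE, FALSE),
  School_Child = c(FALSE, FALSE, TRUE, FALSE, FALSE, FALSE),
  Preschool_Child = c(FALSE, FALSE, FALSE, TRUE, FALSE, FALSE)
)

# Check the column classes before summarising; the four indicator
# columns should report "logical".
sapply(df2, class)

# Sum every logical column within each household (TRUE counts as 1).
res <- df2 %>%
  group_by(NR_HH) %>%
  summarise(across(where(is.logical), sum))

res
```

Running sapply(df, class) on the original data frame is also a quick way to confirm the diagnosis: the indicator columns show up as factor (or character in R 4.0.0 and later) rather than logical.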