rdataframedplyraggregatecrosstable

How calculate aggregate of some data in the columns of R


I have a DataSet which includes some women in childbearing ages and beyond. Each woman specified with an id and DataSet includes mothers' age. For each woman we have parity. Parity1 is the first child that women have. Values in parity refers to children's age. For example, a woman with id 1 is 38 years old, his first child is 15 years old, second 13, third 10 and fourth 0 years old at the time of census.

library("tidyverse")

sample_df <- tibble(
    id = c(1, 2, 3, 4, 5, 6, 7, 8, 9, 10),
    AGE = c(38, 39, 40, 41, 42, 43, 44, 45, 46, 47),
    parity1 = c(15, 14, 13, 12, 9, 8, 14, 13, 3, 7),
    parity2 = c(13, 9, 9, 10, 7, 4, 13, 11, NA, 5),
    parity3 = c(10, 7, 3, 3, 6, 2, 9, 15, NA, 2),
    parity4 = c( 0, NA, NA, 1, NA, 0, 0 , 1, NA, NA),
)

long story short, I want to know how many children of each age (age of child), as a total, exist in my data set based on their mothers' age. For example how many 0-year-old children we have based on their mother's age. In the end, the code should terminate to a table in which for each woman, in childbearing age, the number of their children in ages 0, 1, 2, 3, 4, 5... to be specified. The final output is supposed to seem something like this:

AGE of mother 0 1 2 3
38 1
39
40 1
41 1 1
42
43 1 1 1
44 1
45 1
46 1
47 1

Seems cruel, but I rellay appreciate any help that you can provide. Thank you so much in advance.


Solution

  • after running the codes proposed by @Mark, try to run this code in your data. I think, this code can do the thing you mentioned above.

    sample_df<-pivot_longer(sample_df,starts_with("parity"), values_drop_na =TRUE) 
    
    sample_df<- pivot_wider(sample_df,names_from = value,names_glue = 
    "year{value}", values_from = name, names_sort = TRUE, values_fn = length, 
    values_fill = 0)
            
    child<-aggregate(cbind(sample_df$year0, 
    sample_df$year1,sample_df$year2,sample_df$year3,sample_df$year4,sample_df$year5,sample_df$year6,sample_df$year7,sample_df$year8,sample_df$year9,sample_df$year10,sample_df$year11,sample_df$year12,sample_df$year13,sample_df$year14), list(sample_df$age), sum)