r dplyr time percentage

Can I calculate the percentage of a value and summarize this information over every sampling-date?


Date  Tree  Compound  Compound_mg  %mg
13.1  a     C21       5
13.1  b     C21       4            x
13.1  c     C21       9
20.2  a     C21       6
20.2  b     C21       5            x
20.2  c     C21       10
13.1  a     C23       6
13.1  b     C23       6            x
13.1  c     C23       10
20.2  a     C23       5
20.2  b     C23       4            x
20.2  c     C23       9

This is how my data looks, except for the last column, which is what I want to compute. I sampled three trees (N = 3) and extracted several compounds from each tree on different dates, to see how the proportions change over time.

I would like to know what percentage (by mg) C21 makes up of all the compounds at each time point: what proportion C21 makes up of all compounds on 13.1 (averaged over the 3 trees), what it makes up on 20.2, and so on. Then the same for C23.

Example dataset:

Df <- data.frame(
  rating = 1:12,
  Date = c('13.1','13.1','13.1','20.2','20.2','20.2','13.1','13.1','13.1','20.2','20.2','20.2'),
  Tree = c('a', 'b', 'c', 'a', 'b', 'c', 'a', 'b', 'c', 'a', 'b', 'c'),
  Compound = c("C21","C21","C21","C21","C21","C21","C23","C23","C23","C23","C23","C23"),
  Compound_mg = c(5, 4, 9, 6, 5, 10, 6, 6, 10, 5, 4, 9)
)

I tried this...

library(dplyr)

Df2 <- Df %>%
  group_by(Compound) %>%
  mutate(perc = Compound_mg / sum(Compound_mg))

But I don't know how to average over the 3 trees (a, b, and c), or how to incorporate Date.


Solution

  • In base R you could do the following:

    (prop <- proportions(xtabs(Compound_mg~Date+Compound,Df),1))
    
          Compound
    Date         C21       C23
      13.1 0.4500000 0.5500000
      20.2 0.5384615 0.4615385
    

    If you want a dataframe:

    as.data.frame(prop, responseName = 'perc')
    
      Date Compound      perc
    1 13.1      C21 0.4500000
    2 20.2      C21 0.5384615
    3 13.1      C23 0.5500000
    4 20.2      C23 0.4615385
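
Since the question is tagged dplyr, here is a rough dplyr sketch of the same calculation, not a definitive answer. It assumes dplyr 1.0 or later (for the `.groups` argument) and rebuilds the example data without the unused `rating` column. The names `pooled` and `per_tree` are made up for illustration. `pooled` sums the mg over the three trees before dividing by the date total, which should match the `xtabs` result above. `per_tree` computes each tree's share first and then takes the mean across trees, which is one possible reading of "averaged by 3 trees" and will generally give slightly different numbers.

```r
library(dplyr)

# Same values as the example dataset in the question
Df <- data.frame(
  Date = rep(rep(c("13.1", "20.2"), each = 3), times = 2),
  Tree = rep(c("a", "b", "c"), times = 4),
  Compound = rep(c("C21", "C23"), each = 6),
  Compound_mg = c(5, 4, 9, 6, 5, 10, 6, 6, 10, 5, 4, 9)
)

# Pooled share: sum mg over the trees, then divide by the total per Date
pooled <- Df %>%
  group_by(Date, Compound) %>%
  summarise(mg = sum(Compound_mg), .groups = "drop") %>%
  group_by(Date) %>%
  mutate(perc = mg / sum(mg)) %>%
  ungroup()

# Per-tree share first (within each Date and Tree), then the mean across trees
per_tree <- Df %>%
  group_by(Date, Tree) %>%
  mutate(perc = Compound_mg / sum(Compound_mg)) %>%
  group_by(Date, Compound) %>%
  summarise(mean_perc = mean(perc), .groups = "drop")

pooled
per_tree
```

Which of the two is appropriate depends on whether the trees should be weighted by their total mg (`pooled`) or treated as equal replicates (`per_tree`).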