I am currently working in a project where I am trying to calculate the age of the queen in multiple hives based on the observation conducted by the beekeeper. They usually mark the queen with a colour and if/when the queen dies they marked her again. So in my table you can see the form of data like this
Colony month Queen_status
11 24-02 Not seen
11 24-03 Queen marked
11 24-04 Queen marked
11 24-05 Queen marked
11 24-06 Queen marked
12 24-02 Queen marked
12 24-03 Queen marked
12 24-04 New queen marked the day of the visite
12 24-05 Queen marked
12 24-06 Queen marked
Based on the color on the back of the bee, we can estimate the "initial age" at teh begging of teh study. I want to create a value called "age of queen" that increase every month by 1 but if queen_status is "New queen marked the day of the visite" it has to reset back to zero. This is the table I imagine having:
Colony month Queen_status Initale age of queen Relatif_age Actual age
11 24-02 Queen marked 25 1 26
11 24-03 Queen marked 25 2 27
11 24-04 Queen marked 25 3 28
11 24-05 Queen marked 25 4 29
11 24-06 Queen marked 25 5 30
12 24-02 Queen marked 30 1 31
12 24-03 Queen marked 30 2 32
12 24-04 New queen 30 1 1
12 24-05 Queen marked 30 2 2
12 24-06 Queen marked 30 3 3
I can't make it work. My last effort was the following
df <- df %>%
mutate(Colony = as.factor(Colony)) %>%
group_by(Colony) %>%
summarize(
month = month,
Queen status = Queen status,
Relative age = ifelse(month == min(month) & Queen status != "New queen", 1, with(df, ave(Queen status, cumsum(Statut_de_la_reine == Queen status "), FUN = seq_along))))
I also tried with a case_when but I couldn't make it work
Ideally I want just the final column in my dataframe.
That's pretty easy with package data.table:
library(data.table)
DT <- fread('Colony month Queen_status "Initale age of queen" Relatif_age "Actual age"
11 24-02 "Queen marked" 25 1 26
11 24-03 "Queen marked" 25 2 27
11 24-04 "Queen marked" 25 3 28
11 24-05 "Queen marked" 25 4 29
11 24-06 "Queen marked" 25 5 30
12 24-02 "Queen marked" 30 1 31
12 24-03 "Queen marked" 30 2 32
12 24-04 "New queen" 30 1 1
12 24-05 "Queen marked" 30 2 2
12 24-06 "Queen marked" 30 3 3')
setDT(DT)
DT[, Relatif_age_1 := rowid(Colony, cumsum(Queen_status == "New queen"))]
DT[, Actual_age_1 := fifelse(cumsum(Queen_status == "New queen") == 0,
`Initale age of queen`,
0) + Relatif_age_1, by = Colony]
# Colony month Queen_status Initale age of queen Relatif_age Actual age Relatif_age_1 Actual_age_1
# <int> <char> <char> <int> <int> <int> <int> <num>
# 1: 11 24-02 Queen marked 25 1 26 1 26
# 2: 11 24-03 Queen marked 25 2 27 2 27
# 3: 11 24-04 Queen marked 25 3 28 3 28
# 4: 11 24-05 Queen marked 25 4 29 4 29
# 5: 11 24-06 Queen marked 25 5 30 5 30
# 6: 12 24-02 Queen marked 30 1 31 1 31
# 7: 12 24-03 Queen marked 30 2 32 2 32
# 8: 12 24-04 New queen 30 1 1 1 1
# 9: 12 24-05 Queen marked 30 2 2 2 2
# 10: 12 24-06 Queen marked 30 3 3 3 3