rif-statementcasecumsum

Create a variable that increases by one in every row but resets based on condition


I am currently working in a project where I am trying to calculate the age of the queen in multiple hives based on the observation conducted by the beekeeper. They usually mark the queen with a colour and if/when the queen dies they marked her again. So in my table you can see the form of data like this

Colony  month   Queen_status

11  24-02   Not seen
11  24-03   Queen marked
11  24-04   Queen marked
11  24-05   Queen marked
11  24-06   Queen marked

12  24-02   Queen marked
12  24-03   Queen marked
12  24-04   New queen marked the day of the visite
12  24-05   Queen marked
12  24-06   Queen marked

Based on the color on the back of the bee, we can estimate the "initial age" at teh begging of teh study. I want to create a value called "age of queen" that increase every month by 1 but if queen_status is "New queen marked the day of the visite" it has to reset back to zero. This is the table I imagine having:

Colony  month   Queen_status        Initale age of queen    Relatif_age Actual age
11  24-02   Queen marked        25          1       26
11  24-03   Queen marked        25          2       27
11  24-04   Queen marked        25          3       28
11  24-05   Queen marked        25          4       29
11  24-06   Queen marked        25          5       30
12  24-02   Queen marked        30          1       31
12  24-03   Queen marked        30          2       32
12  24-04   New queen       30          1       1
12  24-05   Queen marked        30          2       2
12  24-06   Queen marked        30          3       3

I can't make it work. My last effort was the following

df <- df %>%
  mutate(Colony = as.factor(Colony)) %>%
  group_by(Colony) %>%
  summarize(
    month = month,
    Queen status = Queen status,

Relative age = ifelse(month == min(month) & Queen status != "New queen", 1, with(df, ave(Queen status, cumsum(Statut_de_la_reine == Queen status "), FUN = seq_along))))

I also tried with a case_when but I couldn't make it work

Ideally I want just the final column in my dataframe.


Solution

  • That's pretty easy with package data.table:

    library(data.table)
    DT <- fread('Colony  month   Queen_status        "Initale age of queen"    Relatif_age "Actual age"
    11  24-02   "Queen marked"        25          1       26
    11  24-03   "Queen marked"        25          2       27
    11  24-04   "Queen marked"        25          3       28
    11  24-05   "Queen marked"        25          4       29
    11  24-06   "Queen marked"        25          5       30
    12  24-02   "Queen marked"        30          1       31
    12  24-03   "Queen marked"        30          2       32
    12  24-04   "New queen"       30          1       1
    12  24-05   "Queen marked"        30          2       2
    12  24-06   "Queen marked"        30          3       3')
    
    setDT(DT) 
    
    DT[, Relatif_age_1 := rowid(Colony, cumsum(Queen_status == "New queen"))]
    DT[, Actual_age_1 := fifelse(cumsum(Queen_status == "New queen") == 0, 
                                 `Initale age of queen`, 
                                 0) + Relatif_age_1, by = Colony]
    #     Colony  month Queen_status Initale age of queen Relatif_age Actual age Relatif_age_1 Actual_age_1
    #      <int> <char>       <char>                <int>       <int>      <int>         <int>        <num>
    #  1:     11  24-02 Queen marked                   25           1         26             1           26
    #  2:     11  24-03 Queen marked                   25           2         27             2           27
    #  3:     11  24-04 Queen marked                   25           3         28             3           28
    #  4:     11  24-05 Queen marked                   25           4         29             4           29
    #  5:     11  24-06 Queen marked                   25           5         30             5           30
    #  6:     12  24-02 Queen marked                   30           1         31             1           31
    #  7:     12  24-03 Queen marked                   30           2         32             2           32
    #  8:     12  24-04    New queen                   30           1          1             1            1
    #  9:     12  24-05 Queen marked                   30           2          2             2            2
    # 10:     12  24-06 Queen marked                   30           3          3             3            3