Suppose I have data as follows:
library(tibble)
df <- tibble(
A = c(1, 2, 2, 2, 2, 2, 3, 3, 3, 3, 4, 4, 4, 4, 4, 5),
B = c(1, 1, 2, 1, 2, 3, 1, 2, 1, 1, 1, 2, 3, 4, 1, 1)
)
i.e.,
# A tibble: 16 x 2
       A     B
   <dbl> <dbl>
 1     1     1
 2     2     1
 3     2     2
 4     2     1
 5     2     2
 6     2     3
 7     3     1
 8     3     2
 9     3     1
10     3     1
11     4     1
12     4     2
13     4     3
14     4     4
15     4     1
16     5     1
How do I create a sub_id that increments each time a new sequence in B begins (B returns to 1) within the group defined by variable A? The desired result is:
tibble(
A = c(1, 2, 2, 2, 2, 2, 3, 3, 3, 3, 4, 4, 4, 4, 4, 5),
B = c(1, 1, 2, 1, 2, 3, 1, 2, 1, 1, 1, 2, 3, 4, 1, 1),
sub_id = c(1, 1, 1, 2, 2, 2, 1, 1, 2, 3, 1, 1, 1, 1, 2, 1)
)
# A tibble: 16 x 3
       A     B sub_id
   <dbl> <dbl>  <dbl>
 1     1     1      1
 2     2     1      1
 3     2     2      1
 4     2     1      2
 5     2     2      2
 6     2     3      2
 7     3     1      1
 8     3     2      1
 9     3     1      2
10     3     1      3
11     4     1      1
12     4     2      1
13     4     3      1
14     4     4      1
15     4     1      2
16     5     1      1
Hopefully that’s well defined. I suppose I’m after a kind of inverse to row_number().
Thanks in advance,
James.
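Using base R, with the tibble from the question stored as df: B == 1 marks the start of each sequence, so a cumulative sum of that flag within each A group gives the sub_id.
df$sub_id <- with(df, ave(B == 1, A, FUN = cumsum))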
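As a rough check, the sketch below rebuilds the question's data as a plain data.frame (an assumption, made only to avoid depending on tibble) and compares the result with the desired sub_id. It also includes a hypothetical variant, not part of the answer above, that starts a new sub_id whenever B fails to increase; that may be useful if a sequence does not always restart at exactly 1.

```r
# Question's data as a base data.frame (ave() does not need a tibble).
df <- data.frame(
  A = c(1, 2, 2, 2, 2, 2, 3, 3, 3, 3, 4, 4, 4, 4, 4, 5),
  B = c(1, 1, 2, 1, 2, 3, 1, 2, 1, 1, 1, 2, 3, 4, 1, 1)
)
expected <- c(1, 1, 1, 2, 2, 2, 1, 1, 2, 3, 1, 1, 1, 1, 2, 1)

# Count sequence starts (B == 1) cumulatively within each A group.
df$sub_id <- with(df, ave(B == 1, A, FUN = cumsum))

# Hypothetical variant: a new sequence starts whenever B does not increase.
# The first row of each group always starts a sequence.
new_seq <- function(b) cumsum(c(TRUE, diff(b) <= 0))
df$sub_id_alt <- with(df, ave(B, A, FUN = new_seq))

# Both approaches should reproduce the desired column on this data.
stopifnot(all(df$sub_id == expected), all(df$sub_id_alt == expected))
```

If dplyr is available, the same idea is usually written as a grouped mutate, for example df %>% group_by(A) %>% mutate(sub_id = cumsum(B == 1)); that form is not run here.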