I have the following situation:
V1 | V2 |
---|---|
A | A1 |
A | A1 |
A | A1 |
A | A2 |
A | A2 |
A | A3 |
B | B1 |
B | B2 |
B | B2 |
and I need to group by V1 and count how many distinct groups each level of V1 has in V2. Something like this:
V1 | n |
---|---|
A | 3 |
B | 2 |
How can I use dplyr functions to solve this?
Thanks!!
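We can use rle after grouping by 'V1'. rle returns the runs of consecutive equal values, so the length of its values component is the number of runs in each group: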
library(dplyr)
df1 %>%
  group_by(V1) %>%
  summarise(n = length(rle(V2)$values), .groups = 'drop')
Output:
# A tibble: 2 × 2
  V1        n
  <chr> <int>
1 A         3
2 B         2
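To make the counting step concrete, here is a minimal base R sketch of what rle returns for one group. The vector `v2_a` is a hypothetical name; its values are the V2 entries for group A copied from the sample data at the end of this answer.

```r
# V2 values for group "A", copied from the sample data in this answer
# (v2_a is a name introduced only for this illustration)
v2_a <- c("A1", "A1", "A1", "A2", "A2", "A1")

# rle() collapses consecutive equal values into runs
runs <- rle(v2_a)

runs$values    # the value of each run, in order
runs$lengths   # how many consecutive rows each run covers

# The number of runs is what the summarise() call counts per group
n_runs <- length(runs$values)
n_runs
```

Because rle looks only at consecutive values, a value that reappears after a different one starts a new run.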
Or with rleid (from data.table) and n_distinct:
library(data.table)
df1 %>%
  group_by(V1) %>%
  summarise(n = n_distinct(rleid(V2)))
# A tibble: 2 × 2
  V1        n
  <chr> <int>
1 A         3
2 B         2
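Both approaches count runs of consecutive values, which is not always the same as counting unique values. The sketch below puts the two side by side on the sample data from this answer; the column names `n_runs` and `n_unique` are made up for the comparison. If repeated labels should be counted once no matter where they appear, `n_distinct(V2)` on its own may be the better fit.

```r
library(dplyr)

# Same values as the sample data in this answer
df1 <- data.frame(
  V1 = c(rep("A", 6), rep("B", 3)),
  V2 = c("A1", "A1", "A1", "A2", "A2", "A1", "B1", "B2", "B2")
)

res <- df1 %>%
  group_by(V1) %>%
  summarise(
    n_runs   = length(rle(V2)$values),  # runs of consecutive equal values
    n_unique = n_distinct(V2),          # unique values, ignoring order
    .groups  = "drop"
  )
res
```

On this data the two counts differ for group A, because A1 appears in two separate runs. With the table shown in the question, where the sixth value is A3, both counts would agree.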
Data:
df1 <- structure(list(V1 = c("A", "A", "A", "A", "A", "A", "B", "B", "B"),
    V2 = c("A1", "A1", "A1", "A2", "A2", "A1", "B1", "B2", "B2")),
    class = "data.frame", row.names = c(NA, -9L))