I have a data frame. The snippet is:
df1 <- data.frame(x = c(1, 2, 1, 3, 5, 1, 4, 1), y = c(1, 1, 2, 2, 1, 1, 1, 3))
x y
1 1 1
2 2 1
3 1 2
4 3 2
5 5 1
6 1 1
7 4 1
8 1 3
I need to group df1 by y
and sum over x
but accounting for order of y.
I.e I need to create new group after each new y
and sum over corresponding x
.
The desired output is
x y
1 3 1
2 4 2
3 10 1
4 1 3
How to do this in R?
Using data.table
:
library(data.table)
setDT(df1)[, .(x=sum(x), y=y[1]), .(rleid(y))][,rleid:=NULL][]
#> x y
#> 1: 3 1
#> 2: 4 2
#> 3: 10 1
#> 4: 1 3
Or another dplyr
solution using base::rle
:
library(dplyr)
df1 %>%
group_by(y_grp = with(rle(y), rep(seq_along(lengths), lengths))) %>%
summarise(x = sum(x), y = y[1]) %>%
ungroup %>% select(-y_grp)
#> # A tibble: 4 x 2
#> x y
#> <dbl> <dbl>
#> 1 3 1
#> 2 4 2
#> 3 10 1
#> 4 1 3