I have a data frame that looks like this below. I need to sum the number of correct trials by participant, and reset the counter when it gets to a 0.
Participant TrialNumber Correct
118 1 1
118 2 1
118 3 1
118 4 1
118 5 1
118 6 1
118 7 1
118 8 0
118 9 1
118 10 1
120 1 1
120 2 1
120 3 1
120 4 1
120 5 0
120 6 1
120 7 0
120 8 1
120 9 1
120 10 1
I've tried using splitstackshape
:
df$Count <- getanID(cbind(df$Participant, cumsum(df$Correct)))[,.id]
But it cumulatively sums the correct trials when it gets to a 0 and not by participant:
Participant TrialNumber Correct Count
118 1 1 1
118 2 1 1
118 3 1 1
118 4 1 1
118 5 1 1
118 6 1 1
118 7 1 1
118 8 0 2
118 9 1 1
118 10 1 1
120 1 1 1
120 2 1 1
120 3 1 1
120 4 1 1
120 5 0 2
120 6 1 1
120 7 0 2
120 8 1 1
120 9 1 1
120 10 1 1
I then tried using dplyr
:
df %>%
group_by(Participant) %>%
mutate(Count=cumsum(Correct)) %>%
ungroup %>%
as.data.frame(df)
Participant TrialNumber Correct Count
118 1 1 1
118 2 1 2
118 3 1 3
118 4 1 4
118 5 1 5
118 6 1 6
118 7 1 7
118 8 0 7
118 9 1 8
118 10 1 9
120 1 1 1
120 2 1 2
120 3 1 3
120 4 1 4
120 5 0 4
120 6 1 5
120 7 0 5
120 8 1 6
120 9 1 7
120 10 1 8
Which gets me closer, but still doesn't reset the counter when it gets to 0. If anyone has any suggestions to do this it would be greatly appreciated, thank you
Does this work?
library(dplyr)
library(data.table)
df %>%
mutate(grp = rleid(Correct)) %>%
group_by(Participant, grp) %>%
mutate(Count = cumsum(Correct)) %>%
select(- grp)
# A tibble: 10 x 4
# Groups: Participant, grp [6]
grp Participant Correct Count
<int> <chr> <dbl> <dbl>
1 1 A 1 1
2 1 A 1 2
3 1 A 1 3
4 2 A 0 0
5 3 A 1 1
6 3 B 1 1
7 3 B 1 2
8 4 B 0 0
9 5 B 1 1
10 5 B 1 2
Toy data:
df <- data.frame(
Participant = c(rep("A", 5), rep("B", 5)),
Correct = c(1,1,1,0,1,1,1,0,1,1)
)