I am new at R and I do not really know how to count cumulative number of occurrences of a row if it is the same ID but different date.Also, if the first date appears more than once, it should not count in the first attempt.Look at the rows 2 and 3, that is why the column I am trying to build is called "count" and equals to zero. "count" represents the times that the id has appeared but it follows the next two rules:
1.If it is the first appearence of the Id ever, it assigns zero. For example, for Id 1 despite there are two different registers for the very first date of this Id, "count" equals zero.
2.For the next Id 1 appereance (row 3 of Id1 ) it has a different date, that is why "count" equals 2. Because the 01/01/2018 Id1 appeared twice.
This is the DT and the result that I am looking for is in the "count" column:
We can use dplyr
to group_by
id
and return 0 for the first date
else return the count of occurrence of 1st date
library(dplyr)
df %>%
group_by(id) %>%
mutate(count = ifelse(date == date[1L], 0, sum(date == date[1L])))
# id date count
# <dbl> <fct> <dbl>
#1 1 01/01/2018 0
#2 1 01/01/2018 0
#3 1 02/01/2018 2
#4 2 03/02/2018 0
#5 2 04/04/2018 1
Or using base R ave
with similar logic
as.numeric(with(df, ave(date, id, FUN = function(x)
ifelse(x == x[1L], 0, sum(x == x[1L])))))
#[1] 0 0 2 0 1
data
df <- data.frame(id = c(1, 1, 1, 2, 2), date = c("01/01/2018", "01/01/2018",
"02/01/2018", "03/02/2018", "04/04/2018"), stringsAsFactors = FALSE)