I want to create a new column with a running sum of every value greater than 0. I have a dataframe:
and I want to create:
So far I have tried:
df$temp.sum <- if_else(df$air.temp > 0, cumsum(df$air.temp), 0)
Which resulted in
How do I skip values at or below 0, so that the running sum simply carries forward unchanged on those rows? My dataset has 100,000+ observations, so simple suggestions are helpful!
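Use a parallel maximum, pmax(), to replace negative values with 0, then take the cumulative sum. Values at or below 0 then contribute nothing, so the running total carries forward unchanged on those rows.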
cumsum(pmax(df$air.temp, 0))
#[1] 1 3 3 6 7 7
It also seems very quick on 1.2M values:
x <- rep(df$air.temp, 2e5)
system.time(cumsum(pmax(x, 0)))
##    user  system elapsed
##       0       0       0
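To store the result as a new column, and to see why the attempt in the question goes wrong, here is a minimal sketch. The data frame below is hypothetical: its values are only assumed, chosen so that the result matches the output shown above. Base ifelse() stands in for dplyr's if_else() so the example runs without extra packages.

```r
# Hypothetical data (an assumption, not the asker's actual data frame)
df <- data.frame(air.temp = c(1, 2, -4, 3, 1, -2))

# Zero out values at or below 0 first, then accumulate
df$temp.sum <- cumsum(pmax(df$air.temp, 0))

# The approach from the question: cumsum() still adds the negative
# values into the total, and the ifelse() only overwrites the rows
# where air.temp <= 0 with 0 afterwards
df$attempt <- ifelse(df$air.temp > 0, cumsum(df$air.temp), 0)

print(df)
```

With these values, temp.sum is 1 3 3 6 7 7, while attempt is 1 3 0 2 3 0: the negative readings have already reduced the running total before the condition is applied, which is why the clipping has to happen before cumsum() rather than after.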