r ggplot2 cumulative-frequency

ggplot cumulative frequency with groups


I would like to plot the cumulative count for two groups, rescaled so that each group's curve tops out at 1.

I know how to plot density in this case:

library(ggplot2)

# Two groups (0/1) with values sampled from 1..100
my_df = data.frame(col_1 = sample(c(0,1), 1000, replace = TRUE),
                   col_2 = sample(seq(1,100,by=1), 1000, replace = TRUE))
my_df$col_1 <- as.factor(my_df$col_1)
# One density curve per group
ggplot(data = my_df, aes(x = col_2, group = col_1, col = col_1))+
  geom_density(position = 'dodge', size = 2)

Here is the plot: pdf

However, if I try to plot the cumulative sum, I get the following picture: cdf

As you can see, the second line starts at the level where the first line ends.

Is there a fix for this? I can always do the computations manually and plot the result, but I wonder whether there is a ggplot solution. I found some solutions on SO, but they do not involve scaling the data to level 1.


Solution

  • Strictly speaking, @Roman has already answered this in the comments, but here is that answer spelled out and expanded a little.

    Create the data:

    library(ggplot2)
    my_df = data.frame(col_1 = sample(c(0,1), 1000, replace = TRUE),
                       col_2 = sample(seq(1,100,by=1), 1000, replace = TRUE))
    my_df$col_1 <- as.factor(my_df$col_1)
    

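    The stat_ecdf() function computes the empirical cumulative distribution function, i.e. the cumulative count rescaled to run from 0 to 1. To get a separate curve per group, map the grouping variable to group inside aes(), so that each group is scaled to 1 on its own. Putting this together we get: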

    ggplot(data = my_df, aes(x = col_2,
                             group = col_1,
                             col = col_1)) +
      stat_ecdf()
    

    Using stat_ecdf and group
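
    The question mentions doing the computation by hand as a fallback. As a cross-check on what stat_ecdf() draws, here is a minimal sketch of that manual route in base R. It assumes the same my_df layout as above; the seed and the column name cum_frac are made up for illustration and are not part of the original answer.

    ```r
    library(ggplot2)

    set.seed(1)  # hypothetical seed, only so the example is reproducible
    my_df <- data.frame(col_1 = factor(sample(c(0, 1), 1000, replace = TRUE)),
                        col_2 = sample(1:100, 1000, replace = TRUE))

    # Sort by group, then by value, so the running count follows increasing col_2
    my_df <- my_df[order(my_df$col_1, my_df$col_2), ]

    # Running count within each group divided by that group's size,
    # so every group's curve ends at exactly 1
    my_df$cum_frac <- ave(my_df$col_2, my_df$col_1,
                          FUN = function(v) seq_along(v) / length(v))

    # Plot the hand-computed curves, one step line per group
    p <- ggplot(my_df, aes(x = col_2, y = cum_frac, colour = col_1)) +
      geom_step()
    print(p)
    ```

    The important part is that the division by length(v) happens inside each group. A single cumsum over the whole data frame is the kind of computation that makes the second curve start where the first one ends.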

    You can also replace the line with a filled area, which looks more like a distribution, by using geom="ribbon" and taking the upper edge of the ribbon from the computed cumulative value, after_stat(y):

    ggplot(data = my_df, aes(x = col_2,
                             group = col_1,
                             fill = col_1)) +
      stat_ecdf(aes(ymin = 0, ymax = after_stat(y)),
                geom = "ribbon", alpha = 0.2)
    

    Using stat_ecdf and group filled

    More on this in another SO thread