I would like to plot the cumulative count for two groups, rescaled so that each curve tops out at 1.
I know how to plot the density in this case:
library(ggplot2)
my_df = data.frame(col_1 = sample(c(0,1), 1000, replace = TRUE),
                   col_2 = sample(seq(1,100,by=1), 1000, replace = TRUE))
my_df$col_1 <- as.factor(my_df$col_1)
ggplot(data = my_df, aes(x = col_2, group = col_1, col = col_1)) +
  geom_density(position = 'dodge', size = 2)
However, if I try to plot the cumulative sum, I get the following picture:
As you can see, the second line starts at the level where the first line ends.
Is there a fix for this? I could always do the computation manually and plot the result, but I wonder whether there is a ggplot solution. I found some solutions on SO, but they do not involve scaling the data to 1.
Strictly speaking, @Roman has already answered this in the comments, but here is that answer spelled out and expanded a little.
Create the data:
my_df = data.frame(col_1 = sample(c(0,1), 1000, replace = TRUE),
                   col_2 = sample(seq(1,100,by=1), 1000, replace = TRUE))
my_df$col_1 <- as.factor(my_df$col_1)
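We can get cumulative counts scaled to 1 with the stat_ecdf() function, which plots the empirical cumulative distribution function. To split this by group, simply use the group aesthetic in aes(), so that each group is scaled separately. Putting this together we get: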
ggplot(data = my_df, aes(x = col_2, group = col_1, col = col_1)) +
  stat_ecdf()
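For comparison, here is a minimal sketch of the manual route mentioned in the question: a running count within each group, divided by that group's size. The seed, the column name cum_prop and the object name p_manual are my own choices for illustration, not part of the original post. The resulting step plot should look essentially like the stat_ecdf() one, apart from the horizontal padding that stat_ecdf() adds at both ends by default.

```r
library(ggplot2)

set.seed(1)  # assumed seed, only so the sketch is reproducible
my_df <- data.frame(col_1 = factor(sample(c(0, 1), 1000, replace = TRUE)),
                    col_2 = sample(1:100, 1000, replace = TRUE))

# Sort by x so the running count accumulates along the x axis.
my_df <- my_df[order(my_df$col_2), ]

# Running count within each group divided by that group's size,
# so each curve is scaled on its own and ends at 1.
my_df$cum_prop <- ave(as.numeric(my_df$col_2), my_df$col_1,
                      FUN = function(v) seq_along(v) / length(v))

# cum_prop and p_manual are hypothetical names used only in this sketch.
p_manual <- ggplot(my_df, aes(x = col_2, y = cum_prop, colour = col_1)) +
  geom_step()
p_manual
```

The key point is that the division happens per group. A single cumsum over the whole data frame keeps accumulating across groups, which would match the symptom described in the question.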
You can also switch from a line to something akin to a distribution (a filled area) by using geom="ribbon" and mapping ymax to the computed cumulative value, after_stat(y):
ggplot(data = my_df, aes(x = col_2, group = col_1, fill = col_1)) +
  stat_ecdf(aes(ymin = 0, ymax = after_stat(y)),
            geom = "ribbon", alpha = 0.2)
More on this in another SO thread
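If you want to confirm that each group really is scaled to 1 on its own, one option is to inspect the data that ggplot computes for the layer. This is only a sketch: the seed and the names p_ecdf, built and group_max are my own, and I am assuming the default stat_ecdf() settings.

```r
library(ggplot2)

set.seed(1)  # assumed seed, only so the sketch is reproducible
my_df <- data.frame(col_1 = factor(sample(c(0, 1), 1000, replace = TRUE)),
                    col_2 = sample(1:100, 1000, replace = TRUE))

p_ecdf <- ggplot(my_df, aes(x = col_2, colour = col_1)) +
  stat_ecdf()

# layer_data() returns the data frame the layer actually draws,
# including the computed y values and ggplot's internal group id.
built <- layer_data(p_ecdf)

# Largest cumulative value reached in each group.
group_max <- tapply(built$y, built$group, max)
print(group_max)
```

Each group's maximum should come out as 1, which is the per-group scaling the question asks for.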