After reading this post, which shows how to make a clickable histogram, I was wondering whether there is a way to use brushedPoints to get the output from the brush. I saw that it needs both an x-axis and a y-axis variable. However, ggplot2 computes the y values itself, either counts (the frequency axis) or densities (the density axis, for a histogram with a density curve), so I don't know how to get this information before the plot.
Does anyone know how to calculate the density and the frequency that ggplot2 uses to draw histograms? (Note that I don't want to use ggplot2-specific functions to get the plots; I want a data.frame with this information after drawing the plot.)
The code that I use to draw a histogram with a density curve:
library(ggplot2)
library(dplyr)
val1 <- c(2.1490626,3.7928443,2.2035281,1.5927854,3.1399245,2.3967338,3.7915825,4.6691277,3.0727319,2.9230937,2.6239759,3.7664386,4.0160378,1.2500835,4.7648343,0.0000000,5.6740227,2.7510256,3.0709322,2.7998003,4.0809085,2.5178086,5.9713330,2.7779843,3.6724801,4.2648527,3.6841084,2.5597235,3.8477471,2.6587736,2.2742209,4.5862788,6.1989269,4.1167091,3.1769325,4.2404515,5.3627032,4.1576810,4.3387921,1.4024381,0.0000000,4.3999099,3.4381837,4.8269218,2.6308474,5.3481382,4.9549753,4.5389650,1.3002293,2.8648220,2.4015338,2.0962332,2.6774765,3.0581759,2.5786137,5.0539080,3.8545796,4.3429043,4.2233248,2.0434363,4.5980727)
val2 <- c(3.7691229,3.6478055,0.5435826,1.9665861,3.0802654,1.2248374,1.7311236,2.2492826,2.2365337,1.5726119,2.0147144,2.3550348,1.9527204,3.3689502,1.7847986,3.5901329,1.6833872,3.4240479,1.8372175,0.0000000,2.5701453,3.6551315,4.0327091,3.8781182)
df1 <- data.frame(value = val1)
df2 <- data.frame(value = val2)
data <- bind_rows(lst(df1, df2), .id = 'id')
data %>%
  ggplot(aes(value)) +
  # after_stat(density) replaces the older ..density.. notation (ggplot2 >= 3.3.0)
  geom_histogram(aes(y = after_stat(density), fill = id), bins = 10, col = "black", alpha = 0.4) +
  geom_density(lwd = 1.2, colour = "red", show.legend = FALSE) +
  facet_grid(id ~ .) +
  scale_x_continuous(breaks = pretty(data$value, n = 10)) +
  ggtitle("My histogram....") +
  guides(fill = guide_legend(title = "My legend...")) +
  theme(strip.text.x = element_blank(), strip.text.y = element_blank())
The code that I use to draw a histogram with frequency:
data %>%
  ggplot(aes(value)) +
  geom_histogram(fill = "red", bins = 10, col = "black", alpha = 0.4) +
  facet_grid(id ~ .) +
  scale_x_continuous(breaks = pretty(data$value, n = 10)) +
  ggtitle("My histogram....") +
  guides(fill = guide_legend(title = "My legend...")) +
  theme(strip.text.x = element_blank(), strip.text.y = element_blank())
Once I have the density and frequency columns, I will have to remove the corresponding arguments from the code above, but I don't know whether it is possible to supply a "y" column holding that information instead.
Thanks very much in advance
Regards
If you're looking to extract count / density information from the plot, layer_data is your friend.
library(ggplot2)
library(dplyr)
p <- data %>%
  ggplot(aes(value)) +
  geom_histogram(fill = "red", bins = 10, col = "black", alpha = 0.4) +
  facet_grid(id ~ .) +
  scale_x_continuous(breaks = pretty(data$value, n = 10)) +
  ggtitle("My histogram....") +
  guides(fill = guide_legend(title = "My legend...")) +
  theme(strip.text.x = element_blank(), strip.text.y = element_blank())
head(layer_data(p))
#>    y count         x       xmin      xmax    density    ncount  ndensity
#> 1  2     2 0.0000000 -0.3443848 0.3443848 0.04760210 0.1333333 0.1333333
#> 2  0     0 0.6887697  0.3443848 1.0331545 0.00000000 0.0000000 0.0000000
#> 3  4     4 1.3775393  1.0331545 1.7219241 0.09520421 0.2666667 0.2666667
#> 4  7     7 2.0663090  1.7219241 2.4106938 0.16660737 0.4666667 0.4666667
#> 5 15    15 2.7550786  2.4106938 3.0994635 0.35701579 1.0000000 1.0000000
#> 6  6     6 3.4438483  3.0994635 3.7882331 0.14280631 0.4000000 0.4000000
#>   flipped_aes PANEL group ymin ymax colour fill size linetype alpha
#> 1       FALSE     1    -1    0    2  black  red  0.5        1   0.4
#> 2       FALSE     1    -1    0    0  black  red  0.5        1   0.4
#> 3       FALSE     1    -1    0    4  black  red  0.5        1   0.4
#> 4       FALSE     1    -1    0    7  black  red  0.5        1   0.4
#> 5       FALSE     1    -1    0   15  black  red  0.5        1   0.4
#> 6       FALSE     1    -1    0    6  black  red  0.5        1   0.4
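To see where the density column comes from, here is a minimal sketch that recomputes it from the counts. It assumes that the histogram stat divides each count by the total count in its panel (and group) times the bin width. The toy data frame and the names p_toy, bins and density_check are made up for illustration and are not the data from the question.

```r
library(ggplot2)

# Hypothetical toy data, NOT the values from the question
toy <- data.frame(
  id    = rep(c("df1", "df2"), times = c(6, 4)),
  value = c(0.5, 1.2, 1.9, 2.4, 2.6, 3.8, 0.2, 1.1, 1.4, 3.3)
)

p_toy <- ggplot(toy, aes(value)) +
  geom_histogram(bins = 4) +
  facet_grid(id ~ .)

# Data computed for the first (and only) layer, one row per bin and panel
bins <- layer_data(p_toy, 1)

# Recompute the density by hand: count / (panel total * bin width)
width       <- bins$xmax - bins$xmin
panel_total <- ave(bins$count, bins$PANEL, FUN = sum)
bins$density_check <- bins$count / (panel_total * width)

# Map PANEL back to the facetting variable (panels follow the sorted ids)
bins$id <- levels(factor(toy$id))[as.integer(bins$PANEL)]

bins[, c("id", "xmin", "xmax", "count", "density", "density_check")]
```

If the assumption holds, density and density_check agree row by row, so either column can be used once the bins are known.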
However, if you would rather create your own bins, you need to cut and count manually. There are plenty of ways to do that; I'd suggest using ggplot2's own cut_* functions. You can label the bins however you want; I have added these labels only for clarity.
## Creating your own histogram
## you need something like binwidth or cuts, I'd use it as a variable
## the {{ }} (curly-curly) operator is tidy evaluation syntax (rlang), supported by dplyr verbs
count_bins <- function(data, group, val, cuts, labels = seq_len(cuts)) {
  data %>%
    ## you can also use base::cut or another ggplot2 cut_ function
    ## the bins are cut on the full data, so all groups share the same breaks
    mutate(cuts = ggplot2::cut_interval({{ val }}, n = cuts, labels = labels)) %>%
    group_by({{ group }}) %>%
    count(cuts)
}
count_bins(data, id, value, 10) %>%
  ggplot(aes(cuts, n)) +
  geom_col(fill = "red", col = "black", alpha = 0.4) +
  facet_grid(id ~ .)
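If you want both the frequency and the density as ready-made columns before plotting, one option is base R's hist(plot = FALSE), which returns counts and densities for a given set of breaks. The following is only a sketch under assumptions: the toy data, the fixed breaks and the helper bin_one are hypothetical, and the bins will not match ggplot2's default bin placement unless you pass it the same breaks.

```r
library(ggplot2)

# Hypothetical toy data, NOT the values from the question
toy <- data.frame(
  id    = rep(c("df1", "df2"), times = c(6, 4)),
  value = c(0.5, 1.2, 1.9, 2.4, 2.6, 3.8, 0.2, 1.1, 1.4, 3.3)
)

# Shared breaks covering the whole range, so every group uses the same bins
breaks <- seq(0, 4, by = 1)

# Bin one group: returns bin midpoints, counts and densities
bin_one <- function(d) {
  h <- hist(d$value, breaks = breaks, plot = FALSE)
  data.frame(id = d$id[1], x = h$mids, count = h$counts, density = h$density)
}

binned <- do.call(rbind, lapply(split(toy, toy$id), bin_one))

# The precomputed column is mapped to y directly; swap density for count
# to get the frequency histogram instead
p_binned <- ggplot(binned, aes(x, density)) +
  geom_col(width = diff(breaks)[1], fill = "red", col = "black", alpha = 0.4) +
  facet_grid(id ~ .)
```

Printing p_binned draws the histogram from the binned data frame, which has explicit x and y columns, the shape that brushedPoints asks for.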