What transformation does stat = 'bin' apply in geom_point?
ggplot(mpg, aes(x = displ)) + geom_point(color = 'red', stat = 'bin') +
geom_text(stat = 'bin',aes(label = stat(count)))
What are these values?
If I use stat = 'count', I understand the result, but I don't understand what happens when I use stat = 'bin'.
stat_bin counts binned (grouped) values; stat_count counts the actual values.
stat_bin effectively gives the height of each bar in a histogram: it splits the range of the data into bins (30 by default) and then counts the values that fall in each bin.
Compare the two plots:
ggplot(mpg, aes(x = displ)) + geom_histogram(color = 'red')
ggplot(mpg, aes(x = displ)) + geom_point(color = 'red', stat = 'bin') + geom_text(stat = 'bin',aes(label = stat(count)))
Notice how each of the values in your stat = 'bin' plot lines up with the top of a bar in the histogram.
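If you would rather check this numerically than by eye, one option is to pull out the data that the stat computed for the layer. The following is only a sketch: it assumes the ggplot2 package is attached, uses layer_data() to extract the computed data frame, and passes bins = 30 explicitly (the default) just to silence the binning message. The names p_bin and binned are made up for this example.

```r
library(ggplot2)

# Same point layer as in the question, with the default number of bins spelled out
p_bin <- ggplot(mpg, aes(x = displ)) +
  geom_point(color = 'red', stat = 'bin', bins = 30)

# layer_data() returns the data frame that the stat computed for layer 1
binned <- layer_data(p_bin, 1)

# One row per bin: x is the bin centre, count is the number of rows in that bin
print(binned[, c("x", "count")])

# Every row of mpg falls into exactly one bin, so the counts add up to nrow(mpg)
print(sum(binned$count) == nrow(mpg))
```

The count column here is what the labels in the stat = 'bin' plot display, and the x column is where each point is drawn.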
Conversely, stat = 'count' is effectively just table(mpg$displ):
table(mpg$displ)
# 1.6 1.8 1.9   2 2.2 2.4 2.5 2.7 2.8   3 3.1 3.3 3.4 3.5 3.6 3.7 3.8 3.9   4 4.2 4.4 4.6 4.7   5 5.2 5.3 5.4 5.6 5.7 5.9   6 6.1 6.2 6.5   7
#   5  14   3  21   6  13  20   8  10   8   6   9   4   5   2   3   8   3  15   4   1  11  17   2   5   6   8   1   8   2   1   1   2   1   1
ggplot(mpg, aes(x = displ)) + geom_point(color = 'red', stat = 'count') + geom_text(stat = 'count',aes(label = stat(count)))
Notice that the labels in this plot are the same as the counts returned by table().
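The same comparison can be made in code instead of reading labels off the plot. This is a rough sketch under the same assumptions as before (ggplot2 attached, layer_data() used to extract what the stat computed); p_count, counted, tab and same_counts are illustrative names only.

```r
library(ggplot2)

# Point layer using stat = 'count', as in the plot above
p_count <- ggplot(mpg, aes(x = displ)) +
  geom_point(color = 'red', stat = 'count')

# Data computed by stat_count: one row per distinct value of displ
counted <- layer_data(p_count, 1)
counted <- counted[order(counted$x), ]  # sort by x so the order matches table()

# Frequency table of the raw values
tab <- table(mpg$displ)

# Compare the counts from the stat with the counts from table()
same_counts <- all(counted$count == as.vector(tab))
print(same_counts)
```

No binning is involved here: each distinct displ value gets its own row, which is why the result has one entry per value rather than one per bin.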
Bottom line: stat = 'count' counts the raw values, while stat = 'bin' groups the values into bins first and counts those. That is the difference.