I would like to present the probabilities of some joint events as a raster using the ggplot2 package, and I wonder how geom_raster decides which value to show when a cell has more than one value. In my data, some events can have more than one probability. In the code below and the picture above, I illustrate the question at coordinate (10, 10). Does geom_raster use the last value? Does it sample?
library(ggplot2)
# Normal raster
r <- data.frame(x = 1:10, y = rep(10, 10), value = 1:10)
p1 <- ggplot(r, aes(x, y, fill=value))+
geom_raster()+
coord_equal()+
theme(legend.position = 'bottom')+
labs(title = 'Normal raster: every cell has one value')
p1
# Assume that coordinate (10, 10) has two values, 10 and 0
r <- rbind(r, c(10, 10, 0))
p2 <- ggplot(r, aes(x, y, fill=value))+
geom_raster()+
coord_equal()+
theme(legend.position = 'bottom')+
labs(title = 'Raster having 2 different values (10 then 0) at coordinates (10, 10)')
p2
It appears that only the last value for the cell is used. The logic is in the draw_panel function of GeomRaster in the ggplot2 source, which contains this code:
x_pos <- as.integer((data$x - min(data$x)) / resolution(data$x, FALSE))
y_pos <- as.integer((data$y - min(data$y)) / resolution(data$y, FALSE))
nrow <- max(y_pos) + 1
ncol <- max(x_pos) + 1
raster <- matrix(NA_character_, nrow = nrow, ncol = ncol)
raster[cbind(nrow - y_pos, x_pos + 1)] <- alpha(data$fill, data$alpha)
draw_panel builds a matrix with one cell per row and column position, then fills it with a single assignment using matrix indexing. When the same index appears more than once in such an assignment, only the last value assigned to it survives. For example:
(m <- matrix(1:9, nrow = 3))
#      [,1] [,2] [,3]
# [1,]    1    4    7
# [2,]    2    5    8
# [3,]    3    6    9
(rowcols <- cbind(c(2, 3, 2), c(3, 1, 3)))
#      [,1] [,2]
# [1,]    2    3
# [2,]    3    1
# [3,]    2    3
m[rowcols] <- 10:12
m
#      [,1] [,2] [,3]
# [1,]    1    4    7
# [2,]    2    5   12
# [3,]   11    6    9
Here we create a matrix and then assign to cell (2,3), then (3,1), then (2,3) again. Only the last assignment to (2,3) is preserved (the value 10 is overwritten by 12). So the value that geom_raster keeps depends on the order of the rows in the data passed to ggplot.
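Since the result depends on row order, it may be safer to resolve duplicated cells yourself before plotting. The following is a minimal base R sketch using the data from the question. The names key, is_dup, r_last and r_mean are introduced here for illustration only, and using the mean to combine values is an assumption; pick whatever summary makes sense for your probabilities.

```r
# Data from the question: cell (10, 10) appears twice, with values 10 and 0.
r <- data.frame(x = 1:10, y = rep(10, 10), value = 1:10)
r <- rbind(r, c(10, 10, 0))

# Flag every row that shares its (x, y) cell with at least one other row.
key <- r[c("x", "y")]
is_dup <- duplicated(key) | duplicated(key, fromLast = TRUE)
r[is_dup, ]

# Option 1: keep only the last row for each cell. This makes explicit the
# behaviour described above (the last assignment wins).
r_last <- r[!duplicated(key, fromLast = TRUE), ]

# Option 2 (an assumption, not what geom_raster does): combine the
# competing values of each cell into one, here with the mean.
r_mean <- aggregate(value ~ x + y, data = r, FUN = mean)
```

Passing r_last or r_mean to ggplot() instead of r gives every cell exactly one value, so the plot no longer depends on the order of the rows.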