I am trying to generate a heatmap for the evaluation of an exercise on process scheduling. The rows (y-axis) correspond to processes and the columns (x-axis) to the current time step.
The data is available to me as a matrix with exactly this layout. Since geom_tile expects a data frame, I use reshape::melt to transform the matrix into a data frame with one column for the process, one for the timestep, and one for the value (see below).
| Process | Timestep | value |
| ---------- | --------- | ----- |
| Scheduler | 1 | 1 |
| 4 | 1 | 13 |
| 3 | 1 | 13 |
| 2 | 1 | 13 |
| 1 | 1 | 26 |
| Scheduler | 2 | 10 |
| 4 | 2 | 24 |
| 3 | 2 | 84 |
| 2 | 2 | 19 |
| 1 | 2 | 314 |
[...]
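For reference, the long format in the table above can also be rebuilt in base R. This is a minimal sketch, not the code used in the post: the matrix `m` only holds the two timesteps shown in the table, and the names `m` and `long_df` are made up for illustration.

```r
# Hypothetical 5 x 2 excerpt of the matrix (values taken from the table above)
m <- matrix(
  c(1, 13, 13, 13, 26,     # timestep 1
    10, 24, 84, 19, 314),  # timestep 2
  nrow = 5,
  dimnames = list(c("Scheduler", 4:1), as.character(1:2))
)

# Long format: one row per (process, timestep) cell, taken column by column
long_df <- data.frame(
  Process  = rep(rownames(m), times = ncol(m)),
  Timestep = rep(as.integer(colnames(m)), each = nrow(m)),
  value    = as.vector(m)
)

str(long_df)
```

Here `Timestep` is explicitly converted to integer, which mirrors the type reported for the melted data frame.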
Without any intervention, the type of `df$Timestep` is integer.
Passing the data frame to ggplot like this yields the following result:
As you can see, it looks like ggplot adds a column for t=0 and t=32, that is, before the first and after the last actual timestep:
This is precisely the problem I would like to tackle.
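A likely explanation for the apparent extra columns, offered as a sketch rather than a confirmed diagnosis: with an integer `Timestep` the x-axis is continuous, each tile is one unit wide and centred on its timestep, and ggplot2's documented default pads a continuous axis by 5% of the data range on each side. A discrete axis is instead padded by a fixed 0.6 units. The arithmetic below assumes 31 timesteps (inferred from the t=0 and t=32 remark); `n_steps` and the other names are illustrative.

```r
n_steps <- 31  # assumption: timesteps run from 1 to 31

# Tiles are 1 unit wide and centred on the timestep, so they span 0.5 .. 31.5
tile_range <- c(1 - 0.5, n_steps + 0.5)

# Continuous axis: default expansion is 5% of the range on each side
pad_continuous <- 0.05 * diff(tile_range)
limits_continuous <- tile_range + c(-1, 1) * pad_continuous

# Discrete axis (factor): default expansion is a fixed 0.6 units,
# independent of the number of timesteps
pad_discrete <- 0.6

pad_continuous      # empty space beside the outermost tiles, in tile widths
limits_continuous   # resulting axis limits
pad_continuous > 1  # is the padding wider than one tile?
pad_discrete < 1    # is the discrete padding narrower than one tile?
```

Under these assumptions the continuous axis leaves more than a full tile width of empty panel on each side, which can read as an extra column. How much of the discrete padding stays visible beyond the outermost tile may depend on the ggplot2 version, but it is less than one tile width either way.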
If I change the type of the "Timestep" column in the data frame returned by `reshape::melt` to character by setting `df$Timestep <- as.character(df$Timestep)`, ggplot gets confused about where to place the columns (it does not seem to respect the natural order of the timesteps), making the plot useless.
By prepending "t" to the timesteps, turning them into "t1", "t2", ..., ggplot still seems to understand the natural order of the axis values, but this makes the axis look messy (the title "Timestep" next to labels such as t1) and is thus not a desirable solution either.
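The scrambled columns are consistent with how discrete values are ordered: ggplot2 places character values in sorted (alphabetical) order and factor values in level order. The base R sketch below illustrates the difference for 12 hypothetical timesteps; it does not call ggplot2 itself, and `steps` and `labels_t` are illustrative names.

```r
steps <- 1:12  # hypothetical timesteps

# Character values sort alphabetically, so "10" comes before "2"
sort(as.character(steps))

# A factor built from the integers keeps the numeric order in its levels
levels(factor(steps))

# Labels such as "t1", "t2", ... keep their order only if the levels are set
# in order of appearance; sorted alphabetically they would scramble as well
labels_t <- paste0("t", steps)
levels(factor(labels_t, levels = unique(labels_t)))
sort(labels_t)
```

This may be why the "t"-prefixed labels kept their order: if `melt` returned them as a factor with levels in column order, no alphabetical sorting would take place. That is an assumption about `melt`, not something verified here.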
This is the code used for drawing the plots that you can see in the images within this post:
library(ggplot2)
library(reshape)  # provides melt()

drawHeatmap <- function(dataToDraw, solutionToHighlight, plotTitle) {
rownames(dataToDraw) <- as.character(c("Scheduler", 4:1))
# Used to prepend a character and enforce the df column type that way
# colnames(dataToDraw) <- paste("t", 1:ncol(dataToDraw), sep="")
colnames(dataToDraw) <- c(1:ncol(dataToDraw))
df <- melt(dataToDraw)
colnames(df) <- c("Process","Timestep","value")
# Modify df types as needed
# Changing df column type here makes ggplot not know how to deal with the natural ordering
# df$Timestep <- as.character(df$Timestep)
df$value <- as.integer(df$value) # Convert in order to use continuous scale
solution_path <- highlightCorrectPath(t(solutionToHighlight))
ggplot() +
coord_fixed() +
scale_fill_gradientn(name = "Occurrences", colors = c("blue", "green", "yellow", "orange", "red")) +
geom_tile(data = df, aes(x=Timestep, y=Process, fill = value)) +
scale_color_manual(values=c("black"), labels = c("Solution cells")) +
geom_rect(aes(xmin = solution_path[,"x_left"],
xmax = solution_path[,"x_right"],
ymin = solution_path[,"y_bottom"],
ymax = solution_path[,"y_top"], color = "Solution Cells"),
fill=NA, size=1.1) +
labs(title=plotTitle) +
theme(legend.position = "bottom")
}
Adding `df$Timestep <- factor(df$Timestep)` fixed the issue; thanks a lot to @stefan for suggesting this in the comments. However, I am still unsure what exactly caused ggplot to behave this way.
For the full code and resulting plot see below.
drawHeatmap <- function(dataToDraw, solutionToHighlight, plotTitle) {
rownames(dataToDraw) <- as.character(c( "Dispatcher", "Scheduler", 3:1))
colnames(dataToDraw) <- c(1:ncol(dataToDraw))
df <- melt(dataToDraw)
colnames(df) <- c("Process","Timestep","value")
# Modify df types as needed
df$value <- factor(df$value)
df$Timestep <- factor(df$Timestep)
plot_colors <- colorRampPalette(c("blue", "green", "yellow", "orange", "red"))(nlevels(df$value))
plot_breaks <- seq(from = levels(df$value)[1], to = levels(df$value)[nlevels(df$value)], by = 10)
solution_path <- highlightCorrectPath(t(solutionToHighlight))
ggplot() +
coord_fixed() +
scale_fill_discrete(name = "Occurrences", type = plot_colors, breaks = plot_breaks) +
geom_tile(data = df, aes(x=Timestep, y=Process, fill = value)) +
scale_color_manual(values=c("black"), labels = c("Solution cells")) +
geom_rect(aes(xmin = solution_path[,"x_left"],
xmax = solution_path[,"x_right"],
ymin = solution_path[,"y_bottom"],
ymax = solution_path[,"y_top"], color = "Solution Cells"),
fill=NA, size=1.1) +
labs(title=plotTitle) +
theme(legend.position = "bottom")
}
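As an alternative to converting `Timestep` to a factor, the axis can probably stay numeric if the default padding is switched off and the breaks are set explicitly. The sketch below is not the code from the post: it uses a small made-up data frame (`demo_df`, 3 processes by 5 timesteps) and omits the solution-path rectangles, since `highlightCorrectPath` is not shown.

```r
library(ggplot2)

# Made-up demo data: 3 processes x 5 timesteps (not the data from the post)
demo_df <- expand.grid(Process = c("1", "2", "Scheduler"), Timestep = 1:5)
demo_df$value <- seq_len(nrow(demo_df))

p <- ggplot(demo_df, aes(x = Timestep, y = Process, fill = value)) +
  geom_tile() +
  coord_fixed() +
  # Keep the axis continuous, label every timestep, and drop the default padding
  scale_x_continuous(breaks = unique(demo_df$Timestep), expand = c(0, 0)) +
  scale_fill_gradientn(name = "Occurrences",
                       colors = c("blue", "green", "yellow", "orange", "red")) +
  theme(legend.position = "bottom")

print(p)
```

Whether this or the factor conversion is preferable depends on how the other layers, such as the `geom_rect` highlight, compute their x positions.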