The image is of a simple parallel coordinates plot of 9 records from the iris dataset, created in ggplot2 and flipped to plotly via ggplotly. The plot is coloured by Species and grouped by an observation ID (so that each observation has its own line series), and the custom tooltip text attribute contains this ID (but in reality could contain all sorts of information).
In "Show closest data on hover" mode, all 9 tooltips are available for individual selection. But in "Compare data on hover" mode (illustrated), only one point per colour is shown with a tooltip, and specifically always the last observation in each colour group (3, 6 and 9).
My expectation/experience is that "Compare data on hover" mode displays the tooltips for all data points in the plot that share an x coordinate (and this is the behaviour I want). However, this expectation is clearly wrong in this case. I surmise that this has to do with the presence of two "grouping" aesthetics (colour and group) in the ggplot2 call, and their translation by ggplotly into a plotly object, but I don't have the knowledge to go further and searching has drawn a blank.
Code to reproduce this example is below. I'd be grateful for any explanation of the observed behaviour, and ideally a solution or workaround for generating the desired behaviour instead.
# data: first 3 rows of each iris species, add observation ID, reshape to long
library(data.table)
df <- data.table(iris)[, head(.SD, 3), by=Species][, ID := seq(.N)] |> melt(id.vars=c("ID", "Species")) |> as.data.frame()
# manual parallel coordinates plot in ggplot2
library(ggplot2)
gg <- ggplot(df, aes(x=variable, y=value, colour=Species, group=ID)) +
geom_point(aes(text=ID)) + geom_line()
# flip to plotly
library(plotly)
gg |> ggplotly(tooltip="text")
TL;DR
It turns out that a ggplotly() plotly chart only ever shows one tooltip per colour in "Compare data on hover" mode, since it is the colour aesthetic that determines the number of traces, and at most one tooltip is shown per trace in this mode. Below I explain this regular behaviour (which I struggled to find documented in one place) and show a workaround for the case in the question.
Explanation of the observed behaviour
After experimenting with various ggplot() calls and inspecting the resulting plotly object visually and with plotly_json(), the behaviour is consistent with the following explanation of a) how ggplotly translates ggplot2 aesthetics and b) how plotly objects display tooltips:
1. The colour aesthetic translates to a distinct plotly trace per colour.
2. The group aesthetic does not translate to distinct traces. Instead, if the trace is a line (type: 'scatter', mode: 'lines'), a (NULL, NULL) coordinate is inserted between groups in order to prevent a line segment from connecting data from different groups.
3. In "Compare data on hover" mode, at most one tooltip is shown per trace at any given x-coordinate.
For a very simple illustration of (3), consider this scatterplot of four points with IDs A, B, C and D, where B and C share the same x-value of 2. There is only one trace and "Compare data on hover" at x=2 displays only the tooltip for point C.
df <- data.frame(x = c(1,2,2,3),
y = c(2,1,3,2),
ID = LETTERS[1:4])
(ggplot(df, aes(x=x, y=y)) + geom_point(aes(text=ID))) |> ggplotly(tooltip="text")
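The single-trace claim can also be checked without relying on the rendered plot, by counting the traces in the built object. This is a minimal sketch, assuming that the traces of a ggplotly() widget are the elements of the x$data list returned by plotly_build(); the name toy is introduced here only so that df is not overwritten.

```r
library(ggplot2)
library(plotly)

# same four points as in the scatterplot above ('toy' is a hypothetical name)
toy <- data.frame(x = c(1, 2, 2, 3),
                  y = c(2, 1, 3, 2),
                  ID = LETTERS[1:4])

p <- (ggplot(toy, aes(x = x, y = y)) + geom_point(aes(text = ID))) |>
  ggplotly(tooltip = "text")

# assumption: each element of x$data is one plotly trace
traces <- plotly_build(p)$x$data
n_traces <- length(traces)         # number of traces in the chart
n_points <- length(traces[[1]]$x)  # number of points held by the first trace
print(c(traces = n_traces, points = n_points))
```

If all four points sit in one trace, then B and C compete for the single tooltip that the trace can show at x=2.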
In the parallel-coordinates example used in the question:
- geom_point() in combination with the colour=Species aesthetic results in 3 plotly marker traces (one per Species), each with 12 points (3 logical observations at 4 x-values).
- geom_line() in combination with the colour=Species and group=ID aesthetics results in 3 plotly line traces (one per Species), each with 14 points (3 logical observations at 4 x-values, plus 2 null points separating the sets).
- The text=ID aesthetic attached to geom_point() and referenced in the ggplotly() call results in the last point per Species showing a tooltip in "Compare data on hover" mode against any particular x-coordinate.
In other words, despite the appearance of 9 series in the plot (one for each logical observation), there are in fact only 3 (pairs of) traces, and only one tooltip per marker trace is shown using "Compare data on hover", consistent with the explanation arrived at above.
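The trace and point counts described above can be tabulated directly from the plotly object rather than read off plotly_json(). The following is a sketch only, assuming (as before) that the traces are the elements of x$data after plotly_build(); pc and trace_summary are hypothetical names, and the data are rebuilt so that the snippet stands alone.

```r
library(data.table)
library(ggplot2)
library(plotly)

# rebuild the question's data: first 3 rows per species, long format
pc <- data.table(iris)[, head(.SD, 3), by = Species][, ID := seq(.N)] |>
  melt(id.vars = c("ID", "Species")) |>
  as.data.frame()

gg <- ggplot(pc, aes(x = variable, y = value, colour = Species, group = ID)) +
  geom_point(aes(text = ID)) + geom_line()

traces <- plotly_build(ggplotly(gg, tooltip = "text"))$x$data

# one row per trace: its drawing mode and how many x-values it holds
trace_summary <- data.frame(
  mode = vapply(traces, function(tr) if (is.null(tr$mode)) NA_character_ else tr$mode, character(1)),
  n = vapply(traces, function(tr) length(tr$x), integer(1))
)
print(trace_summary)
```

The rows with mode "lines" are the ones expected to carry the extra separator points between groups.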
Workaround to achieve the desired behaviour
One workaround is therefore to give each logical observation its own trace by colouring by ID rather than Species. To maintain the visual colouring by species, we need to map the Species palette (n=3) to a palette applied to the distinct IDs (n=9). The result is shown at the top of this answer and the code is as follows:
# map species colours to IDs (df is the long-format iris data from the question)
species_pal <- scales::hue_pal()(length(unique(df$Species)))
ID_pal <- species_pal[as.numeric(unique(df[c("Species", "ID")])$Species)]
# generate the plot
library(ggplot2)
library(plotly)
(ggplot(df, aes(
    x = variable,
    y = value,
    colour = as.factor(ID),  # now colouring by observation ID
    group = ID
  )) +
  geom_point(aes(text = ID)) +
  geom_line() +
  scale_colour_manual(values = ID_pal) +  # species colours mapped to IDs
  theme(legend.position = "none")         # suppress legend :(
) |> ggplotly(tooltip = "text")
It isn't a perfect workaround because we have to lose the legend (which would display 9 keys if shown), but it does ensure that all tooltips are visible in "Compare data on hover" mode. (In the use case that sparked the question, which is EDA of multivariate datasets, the grouping variable is always based on statistical clustering, so the legend does not provide any extra semantic information anyway. The interesting information is the identity and attributes of each observation, contained in the tooltips.)
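Since "Compare data on hover" is the mode of interest here, it may be convenient to make it the default instead of selecting it in the modebar each time. This sketch assumes that the modebar button corresponds to the plotly layout setting hovermode = "x" (with "closest" being the other mode); pc, gg_id and p_compare are hypothetical names, and the workaround plot is rebuilt so that the snippet stands alone.

```r
library(data.table)
library(ggplot2)
library(plotly)

# rebuild the question's data: first 3 rows per species, long format
pc <- data.table(iris)[, head(.SD, 3), by = Species][, ID := seq(.N)] |>
  melt(id.vars = c("ID", "Species")) |>
  as.data.frame()

# species colours mapped to IDs, as in the workaround above
species_pal <- scales::hue_pal()(length(unique(pc$Species)))
ID_pal <- species_pal[as.numeric(unique(pc[c("Species", "ID")])$Species)]

gg_id <- ggplot(pc, aes(x = variable, y = value,
                        colour = as.factor(ID), group = ID)) +
  geom_point(aes(text = ID)) +
  geom_line() +
  scale_colour_manual(values = ID_pal) +
  theme(legend.position = "none")

# assumption: hovermode = "x" is the setting behind "Compare data on hover"
p_compare <- gg_id |>
  ggplotly(tooltip = "text") |>
  layout(hovermode = "x")

built <- plotly_build(p_compare)
hover_setting <- built$x$layout$hovermode
n_traces <- length(built$x$data)
p_compare
```

The modebar buttons remain available, so the viewer can still switch back to "Show closest data on hover".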