The package partykit offers a plotting function for decision trees, plot.constparty(), which can display the distribution of the response in each terminal node with boxplots (node_boxplot()). A minimal example using the iris dataset is below.
library("partykit")
ct <- ctree(Petal.Length ~ Sepal.Length + Sepal.Width, data = iris, stump = TRUE)
plot(ct, terminal_panel = node_boxplot)
I would love to display the boxplots as violin plots. Since you can write your own panel functions, that should actually be possible. However, it seems that the violin plot needs to be set up using grid functions, so I have no clue how to do that. I imagine that this is quite cumbersome work, but I believe that many users would benefit from this panel function. Any suggestions on how to implement that? (A first lead points here: "partykit: Change terminal node boxplots to bar graphs that shows mean and standard deviation")
Add-on: Assume we had a strategy to plot terminal nodes with violins. How could we apply this strategy to multivariate responses to display violins instead of boxplots? See the following screenshot produced with the function node_mvar():
There are two natural strategies for this:
1. Write a node_violinplot() panel-generating function similar to node_boxplot().
2. Use ggplot2 via the ggparty package and leverage the existing geom_violin().

For the first strategy, I would recommend copying the code of node_boxplot() (including setting its class!) and renaming it to, say, node_violinplot(). Most of its code is responsible for setting up the right viewport, axis ranges, etc., which can all be preserved. Then one would "only" replace the grid.lines() and grid.rect() calls for drawing the boxes with calls for drawing the violin. I'm not sure what would be the best way to compute the coordinates for the violin elements, though.
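One possible way to obtain such coordinates, given here only as a rough sketch and not as a finished panel function, is to take a kernel density estimate from density(), mirror it around the center of the plotting region, and draw the outline with grid.polygon(). The helper violin_coords() and its arguments width and center are made up for illustration and are not part of partykit. The standalone viewport set up below merely stands in for the viewport that the code copied from node_boxplot() would already provide.

```r
library("grid")

## hypothetical helper (not part of partykit): outline of one violin
## for a numeric vector y, centered at `center` on a 0-1 x scale
violin_coords <- function(y, width = 0.4, center = 0.5, n = 128) {
  ## kernel density estimate restricted to the observed range of y
  d <- density(y, n = n, from = min(y), to = max(y))
  ## rescale so that the widest point has half-width `width`
  half <- width * d$y / max(d$y)
  ## trace the right side upwards, then the left side downwards
  list(x = c(center + half, rev(center - half)),
       y = c(d$x, rev(d$x)))
}

y <- iris$Petal.Length
xy <- violin_coords(y)

## standalone demo: x scale is 0-1, y scale is taken from the data
grid.newpage()
pushViewport(plotViewport(margins = c(3, 4, 1, 1)))
pushViewport(dataViewport(xscale = c(0, 1), yData = y))
grid.polygon(unit(xy$x, "native"), unit(xy$y, "native"),
             gp = gpar(fill = "lightgray"))
grid.yaxis()
grid.rect(gp = gpar(fill = "transparent"))
popViewport(2)
```

Inside an actual node_violinplot(), the same grid.polygon() call would go where node_boxplot() draws its box and whiskers, applied to the response values of the observations in the respective node. Truncating the density at the observed range is a design choice made here so that the violin never extends beyond the axis limits.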
For the second strategy, all building blocks are essentially available and just have to be customized to obtain the kind of violin plot that you want. For example, a tree for the cars data with a violin plot and a narrow boxplot in each terminal node can be produced as follows:
## example tree
library("partykit")
ct <- ctree(dist ~ speed, data = cars)
## visualization with ggparty + geom_violin
library("ggparty")
ggparty(ct) +
  geom_edge() +
  geom_edge_label() +
  geom_node_splitvar() +
  geom_node_plot(gglist = list(
    geom_violin(aes(x = "", y = dist)),
    geom_boxplot(aes(x = "", y = dist), coef = Inf, width = 0.1, fill = "lightgray"),
    xlab(""),
    theme_minimal()
  ))