
Represent more than 20 levels in a glmtree


I am currently working with the glmtree() function in R. I have some factor variables with 20+ levels, and the problem is how the tree is displayed. Some of the information at certain leaves is impossible to read because of the large number of levels in some variables (e.g., i_mode has 29 levels).

One possible solution would be to "dummify" those levels. However, I'd rather not do it, if possible at all.

Is there a way to present the same plot in a more readable form?

Any clue?

Thank you

[Figure: recursive partitioned tree]


Solution

  • My feeling is that such a plot will be hard to understand, even apart from the labeling issue. Personally, I would try to break such a factor down into more intelligible groups with fewer levels (not necessarily binary, though).

    Having said that, the panel function edge_simple(), which draws the edge labels in the tree, has some arguments that can help improve readability: for example, you can alternate the position of the labels and change the font size. For a worked example, see "R partykit::ctree offset labels on edges". Additionally, you could try abbreviating the factor levels before learning the tree. However, with 29 levels, I'm afraid none of this will help much.
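The two recoding ideas above (grouping the factor into fewer levels and abbreviating the level names) can be sketched in base R as follows. This is only an illustration under stated assumptions: the data frame `d` is simulated, the level names are invented, and the helper `collapse_rare()` is hypothetical. It lumps infrequent levels into "other" purely by frequency, whereas a grouping based on subject-matter knowledge will usually be more interpretable.

```r
# Simulated stand-in for the real data: i_mode is a factor with 29 levels.
set.seed(1)
n <- 500
mode_names <- paste0("transport_mode_", sprintf("%02d", 1:29))
d <- data.frame(
  i_mode = factor(sample(mode_names, n, replace = TRUE,
                         prob = rev(seq_along(mode_names))),
                  levels = mode_names),
  x = rnorm(n)
)
d$y <- 1 + 0.5 * d$x + rnorm(n)

# Hypothetical helper: keep the `keep` most frequent levels and
# lump all remaining levels into a single level called `other`.
collapse_rare <- function(f, keep = 5, other = "other") {
  counts <- sort(table(f), decreasing = TRUE)
  top <- names(counts)[seq_len(min(keep, length(counts)))]
  out <- as.character(f)
  out[!is.na(out) & !(out %in% top)] <- other  # missing values stay missing
  droplevels(factor(out, levels = c(top, other)))
}

# Option 1: fewer, coarser groups.
d$mode_grouped <- collapse_rare(d$i_mode, keep = 5)

# Option 2: shorter labels, here applied on top of the grouping.
d$mode_short <- d$mode_grouped
levels(d$mode_short) <- abbreviate(levels(d$mode_short), minlength = 6)

table(d$mode_short)
```

The recoded factor can then be used in place of i_mode as a partitioning variable in glmtree(). For the edge labels, the linked worked example passes options to edge_simple() through the ep_args argument of plot(), along the lines of plot(tree, ep_args = list(justmin = 10), gp = grid::gpar(fontsize = 8)); the specific values here are guesses that would need tuning for a given tree.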