Here's a problem: What's the best way to plot values for threefold combinations of a categorical variable?
This is as far as I got in R:
    library(tidyverse)
    library(ggtern)

    df_person <- tibble( name = c( 'Alice', 'Bob', 'Carla', 'Dave', 'Eve' ) ) %>%
      rowid_to_column( 'id_person' )

    # generate all trios of persons (5 choose 3)
    df <- df_person %>% select( name ) %>%
      map_df( function(x) { combn(x, 3, paste, collapse = '_') } ) %>%
      separate( name, c('person1', 'person2', 'person3') ) %>%
      # note: each column gets its own factor levels, so integer codes differ between columns
      mutate_all(~ as.factor(.) )

    # assign a value to each trio
    df$val <- runif( nrow(df) )

    # generate ticks and labels for axes
    axis <- df_person %>% mutate( fct = as.factor(name) ) %>%
      mutate( tick = as.numeric(fct) / 5 )

    ggtern( df, aes(x = as.numeric(person1),
                    y = as.numeric(person2),
                    z = as.numeric(person3),
                    color = val) ) +
      geom_point() +
      scale_T_continuous( breaks = axis$tick, labels = axis$name ) +
      scale_L_continuous( breaks = axis$tick, labels = axis$name ) +
      scale_R_continuous( breaks = axis$tick, labels = axis$name ) +
      labs( x = 'person1', y = 'person2', z = 'person3' )
This gives a rather odd result.
I would expect ten points located where the grid lines meet, since these are categorical variables.
Ideally, I would like to generate a heatmap-like plot, i.e. triangular tiles instead of points.
Any help is highly appreciated!
Ok, after some research into ternary plots I now understand that this is not how they are used.
A ternary plot makes sense when each observation consists of three components that always sum to the same constant (for example, proportions that sum to 1), so that the plot shows the relative contribution of each component.
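To make that concrete, here is a minimal base-R sketch of what happens when category codes are fed to a ternary plot. It assumes (as far as I understand ternary coordinates) that only the relative shares of the three inputs matter, so each triple is effectively rescaled to sum to 1. The names `codes` and `shares` and the coding of the five persons as 1 to 5 are illustrative assumptions, not part of the original code.

```r
# Assumption: persons are coded 1..5; each row is one trio (5 choose 3 = 10 rows).
codes <- t(combn(5, 3))
colnames(codes) <- c("person1", "person2", "person3")

# A ternary plot only uses relative shares, so rescale each row to sum to 1.
shares <- codes / rowSums(codes)

# Each trio becomes a composition, e.g. (1, 2, 3) -> (1/6, 1/3, 1/2),
# which does not sit on a regular grid of category "ticks".
print(round(cbind(codes, shares), 3))
```

The rescaled rows are compositions rather than category positions, which is why the points do not fall on the intersections of the labelled grid lines.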
For my particular use case, I am better off with a faceted bar plot:
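The code for that plot is not shown here, so the following is only a sketch of one plausible layout, assuming one panel per (person1, person2) pair and one bar per person3. The faceting choice and the random placeholder values are my assumptions.

```r
library(ggplot2)

people <- c("Alice", "Bob", "Carla", "Dave", "Eve")

# All 10 trios, one per row (same combinations as in the question).
trios <- t(combn(people, 3))
df <- data.frame(
  person1 = factor(trios[, 1], levels = people),
  person2 = factor(trios[, 2], levels = people),
  person3 = factor(trios[, 3], levels = people)
)
df$val <- runif(nrow(df))  # placeholder values, as in the question

# One panel per (person1, person2) pair, one bar per person3.
p <- ggplot(df, aes(x = person3, y = val)) +
  geom_col() +
  facet_grid(person1 ~ person2) +
  labs(x = "person3", y = "val")
print(p)
```

Because all panels share the same x axis, every panel reserves a slot for each person3 level, including slots that can never be filled.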
This is still not perfect, since the plot reserves space for some combinations that never occur in the data (e.g. (Alice, Carla, Carla)), but it does the job.
If anybody knows a better visualization for this use case I would be very much interested.