This is my situation:
library(UpSetR)
movies <- read.csv(system.file("extdata", "movies.csv", package = "UpSetR"), header = TRUE, sep = ";")
upset(movies,
      sets = c("Action", "Adventure", "Comedy", "Drama", "Mystery", "Thriller", "Romance", "War", "Western"),
      order.by = "freq")
I would like to improve the plot by removing variables (genres) that are displayed alone, without any intersections with other variables.
How can I modify the code above to remove these isolated genres from the plot?
You can filter those rows out of the data before you draw the plot, keeping only the movies that belong to more than one genre. For example:
sets <- c("Action", "Adventure", "Comedy", "Drama", "Mystery", "Thriller", "Romance", "War", "Western")
# keep only rows (movies) that belong to more than one of the sets
reduced_data <- movies[rowSums(movies[, sets]) > 1, ]
# or with dplyr (pick() requires dplyr >= 1.1.0)...
# library(dplyr)
# reduced_data <- movies %>% filter(rowSums(pick(all_of(sets))) > 1)
upset(reduced_data, sets = sets, order.by = "freq")
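To see what the `rowSums()` filter does, here is a minimal sketch on a small made-up data frame. The `toy` data and its column names are hypothetical and only stand in for the 0/1 genre columns of `movies`:

```r
# Hypothetical toy data: one row per item, one 0/1 column per set
toy <- data.frame(
  name   = c("a", "b", "c", "d"),
  Action = c(1, 1, 0, 0),
  Comedy = c(0, 1, 1, 0),
  Drama  = c(0, 0, 1, 1),
  stringsAsFactors = FALSE
)
sets <- c("Action", "Comedy", "Drama")

# Number of sets each row belongs to
n_sets <- rowSums(toy[, sets])

# Keep only rows that sit in an intersection of two or more sets
reduced <- toy[n_sets > 1, ]
print(reduced$name)
```

Rows "a" and "d" each belong to a single set, so they are dropped. Only rows that would contribute to a multi-set intersection bar are passed on to `upset()`.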