
UpSetPlot from actual sets


I want to draw an UpSet plot with UpSetPlot from the actual sets I have, but I cannot find an example that uses it this way. The standard example is this:

from upsetplot import generate_counts, plot
example = generate_counts()
plot(example, orientation='vertical')

where the generated example is a pandas Series that looks like this:

cat0   cat1   cat2 
False  False  False      56
              True      283
       True   False    1279
              True     5882
True   False  False      24
              True       90
       True   False     429
              True     1957
Name: value, dtype: int64

Is there a way to automatically generate this kind of count structure from the actual elements in the categories cat0, cat1, and cat2?


Solution

  • Following the tip from @StupidWolf in another answer, here is an answer to my own question. Given three sets:

    set1 = {0,1,2,3,4,5}
    set2 = {3,4,5,6,10}
    set3 = {0,5,6,7,8,9}
    

    here is the code to draw an upsetplot for these three sets:

    import pandas as pd
    from upsetplot import plot
    set_names = ['set1', 'set2', 'set3']
    all_elems = set1.union(set2).union(set3)
    df = pd.DataFrame([[e in set1, e in set2, e in set3] for e in all_elems], columns = set_names)
    df_up = df.groupby(set_names).size()
    plot(df_up, orientation='horizontal')
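
To make clear what the groupby(set_names).size() step computes, here is a minimal sketch in plain Python, using only the standard library. It maps every element to its membership pattern across the three sets and counts how many elements share each pattern. The names sets and counts are illustrative choices, not part of upsetplot.

```python
from collections import Counter

set1 = {0, 1, 2, 3, 4, 5}
set2 = {3, 4, 5, 6, 10}
set3 = {0, 5, 6, 7, 8, 9}
sets = [set1, set2, set3]

# Every element that appears in at least one set.
all_elems = set().union(*sets)

# Map each element to its membership pattern, e.g. (True, False, True),
# then count how many elements share each pattern.
counts = Counter(tuple(e in s for s in sets) for e in all_elems)

for pattern, n in sorted(counts.items()):
    print(pattern, n)
```

Each key is one (set1, set2, set3) membership combination, which corresponds to one row of the boolean MultiIndex in the Series passed to plot. Combinations that contain no elements simply do not appear.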
    


    To generalize the code above to a list of sets, say sets = [set1, set2, set3], replace the two lines that build all_elems and df with:

    all_elems = list(set().union(*sets))
    df = pd.DataFrame([[e in st for st in sets] for e in all_elems], columns = set_names)
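
Putting the generalized version together, the whole conversion could be wrapped in a small helper. This is only a sketch under a few assumptions: counts_from_sets and its named_sets argument (a dict mapping a label to a set) are hypothetical names introduced here, and the helper stops at building the Series so that it depends on pandas alone.

```python
import pandas as pd


def counts_from_sets(named_sets):
    """Count elements per membership combination.

    named_sets is a hypothetical dict mapping a label to a set of elements.
    Returns a Series indexed by a boolean MultiIndex, one level per set.
    """
    names = list(named_sets)
    # Every element that appears in at least one set.
    all_elems = list(set().union(*named_sets.values()))
    # One row per element, one boolean column per set.
    membership = pd.DataFrame(
        [[e in named_sets[name] for name in names] for e in all_elems],
        columns=names,
    )
    # Size of each group of rows sharing the same membership pattern.
    return membership.groupby(names).size()


counts = counts_from_sets({
    "set1": {0, 1, 2, 3, 4, 5},
    "set2": {3, 4, 5, 6, 10},
    "set3": {0, 5, 6, 7, 8, 9},
})
print(counts)
```

The returned Series has the same shape as the generate_counts() output in the question, so it should be usable as plot(counts, orientation='horizontal'). Recent releases of upsetplot also appear to provide a from_contents helper that builds the plot input from a dict of named collections; check the documentation of your installed version before relying on it.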