python pandas matplotlib matplotlib-venn

Creating a three-way Venn diagram where one set is completely inside another set?


Is there a way to draw a three-way Venn diagram with matplotlib-venn in which set A lies completely inside set C, while both sets intersect B, without showing the 0 values?

Code:

import matplotlib.pyplot as plt
from matplotlib_venn import venn3

A = len(df_A)
B = len(df_B)
C = len(df_C)

# Get the set of 'path' values for each dataframe
set_path_A = set(df_A['path'])
set_path_B = set(df_B['path'])
set_path_C = set(df_C['path'])

# Calculate the intersections
AB = len(set_path_A & set_path_B)
AC = len(set_path_A & set_path_C)
BC = len(set_path_B & set_path_C)
ABC = len(set_path_A & set_path_B & set_path_C)

# Create the Venn diagram
venn_labels = {'100': A - AB - AC + ABC, '010': B - AB - BC + ABC, '001': C - AC - BC + ABC,
               '110': AB - ABC, '101': AC - ABC, '011': BC - ABC, '111': ABC}

plt.figure(figsize=(8, 8))
venn_diagram = venn3(subsets=(A, B, AB, C, AC, BC, ABC), set_labels=('A', 'B', 'C'))
venn_diagram.get_label_by_id('100').set_text(venn_labels['100'])
venn_diagram.get_label_by_id('010').set_text(venn_labels['010'])
venn_diagram.get_label_by_id('001').set_text(venn_labels['001'])
venn_diagram.get_label_by_id('110').set_text(venn_labels['110'])
venn_diagram.get_label_by_id('101').set_text(venn_labels['101'])
venn_diagram.get_label_by_id('011').set_text(venn_labels['011'])
venn_diagram.get_label_by_id('111').set_text(venn_labels['111'])

plt.title("Three-Way Venn Diagram")
plt.show()

When I run the code, it shows the 0 values instead of drawing set A inside set C. (Image: Matplot Venn Diagram)

However, I want it to look like this: (Image: What I want)

What would be the correct way to modify the code so that set A is drawn inside set C, without labeling or showing the 0 values?


Solution

  • If I plug your desired region sizes into venn3 in the order it expects (Abc, aBc, ABc, abC, AbC, aBC, ABC), I get the following:

    from matplotlib_venn import venn3
    subsets = (0, 25_917_041, 0, 7_937_647, 26_016_768, 109_256, 362_049)
    venn3(subsets)
    

    Admittedly, this layout makes the numbers somewhat unreadable, but it is a fair representation of your actual areas, which is what you probably want in the first place.
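The tuple above holds the sizes of the seven exclusive regions, not the totals of each set or pairwise intersection. As a minimal sketch, assuming the inputs are plain Python sets (for example `set(df_A['path'])` from the question), the region sizes could be derived as follows. The helper name `venn3_region_sizes` and the toy data are hypothetical and only serve as illustration.

```python
def venn3_region_sizes(a, b, c):
    """Return exclusive region sizes in the order (Abc, aBc, ABc, abC, AbC, aBC, ABC)."""
    return (
        len(a - b - c),    # only in A
        len(b - a - c),    # only in B
        len((a & b) - c),  # in A and B, but not in C
        len(c - a - b),    # only in C
        len((a & c) - b),  # in A and C, but not in B
        len((b & c) - a),  # in B and C, but not in A
        len(a & b & c),    # in all three
    )


# Hypothetical toy data: A is a subset of C, and both overlap B.
set_a = {1, 2, 3}
set_c = {1, 2, 3, 4, 5, 6}
set_b = {3, 6, 7, 8}

subsets = venn3_region_sizes(set_a, set_b, set_c)
print(subsets)
```

Because A lies inside C here, the "only A" and "A and B, not C" entries come out as 0, which is the same pattern as in the tuple passed to venn3 above.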

    You can tune the diagram in a few ways:

    1. By making the numbers more compact (do you really care for 6 digits of precision?) and moving the central value a bit up, e.g.:

      d = venn3(subsets, subset_label_formatter = lambda x: f"{x/1000000:0.1f}M")
      d.get_label_by_id('111').set_verticalalignment('bottom')
      

    2. By sacrificing some of the area-weighting correctness and artificially inflating the A&B&C region to make more space for the labels there[1]:

      from matplotlib_venn import venn3_unweighted
      tuned_subsets = list(subsets)
      tuned_subsets[6] += 1_000_000
      venn3_unweighted(subsets, subset_areas=tuned_subsets)
      


    [1] The most recent release deprecates the venn3_unweighted function; the recommended alternative is:

    from matplotlib_venn.layout.venn3 import DefaultLayoutAlgorithm
    venn3(subsets, layout_algorithm=DefaultLayoutAlgorithm(fixed_subset_sizes=tuned_subsets))