Tags: python, django, django-queryset, contains, m2m

Django Many2Many query to find all `things` in a group of `categories`


Given these Django models:

from django.db import models

class Thing(models.Model):
    # CharField requires max_length; 100 is an arbitrary choice here
    name = models.CharField('Name of the Thing', max_length=100)

class Category(models.Model):
    name = models.CharField('Name of the Category', max_length=100)
    things = models.ManyToManyField(Thing, verbose_name='Things', related_name='categories')

Note that all the categories a Thing is in can be found by:

thing = Thing.objects.get(id=1) # for example
cats = thing.categories.all() # A QuerySet

I'm really struggling to build a query set that returns all Things in all of a given set of Categories.

Let's say we have 5 categories, with IDs 1, 2, 3, 4, 5.

And say I have a subset of categories:

my_cats = Category.objects.filter(id__in=[2,3])

I want to find all Things that are in, say, both category 2 AND category 3.

I can find all Things in category 2 OR 3 easily enough. For example this:

Thing.objects.filter(categories__in=[2,3])

seems to return just that, Things in category 2 OR 3.

And something like:

Thing.objects.filter(Q(categories=2)|Q(categories=3))

returns the same OR result, but this returns nothing:

Thing.objects.filter(Q(categories=2)&Q(categories=3))

I might envisage something like:

Thing.objects.filter(categories__contains=[2,3])

but of course that's a dream, as `contains` operates on strings, not on ManyToMany sets.

Is there a standard trick here I'm missing?

I spun up a sandbox here to test and demonstrate:

https://codesandbox.io/p/sandbox/django-m2m-test-cizmud

It implements this simple pair of models, populates the database with a small set of things and categories, and tests the queries. Here's the latest state of it:

print("Database contains:")
for thing in Thing.objects.all():
    print(
        f"\t{thing.name} in categorties {[c.id for c in thing.categories.all()]}")
print()

# This works fine. Prints:
# Cat1 OR Cat2: ['Thing 1', 'Thing 5', 'Thing 4']
things = Thing.objects.filter(
    Q(categories=1) | Q(categories=2)).distinct()
print(f"Cat1 OR Cat2: {[t.name for t in things]}")

# We would love this to return Thing 4 and Thing 5,
# the two things in the test data set that are in
# both Category 2 and Category 3.
# But this does not work. It prints:
# Cat2 AND Cat3: []
# What does yield ['Thing 4', 'Thing 5']?
print("\nAiming to to get: ['Thing 4', 'Thing 5']")
things = Thing.objects.filter(
    Q(categories=2) & Q(categories=3)).distinct()
print(f"Try 1: Cat2 AND Cat3: {[t.name for t in things]}")

# This also fails, producing an OR not AND
things = Thing.objects.filter(categories__in=[2, 3]).distinct()
print(f"Try 2: Cat2 AND Cat3: {[t.name for t in things]}")

# Also fails
things = Thing.objects.filter(categories__in=[2, 3])\
                      .filter(categories=2).distinct()
print(f"Try 3: Cat2 AND Cat3: {[t.name for t in things]}")

# Also fails
things = Thing.objects.filter(categories__in=[2, 3], categories=2)\
                      .distinct()
print(f"Try 4: Cat2 AND Cat3: {[t.name for t in things]}")

and its output:

Database contains:
        Thing 1 in categorties [1, 2]
        Thing 2 in categorties [3, 4]
        Thing 3 in categorties [5]
        Thing 4 in categorties [2, 3]
        Thing 5 in categorties [1, 2, 3]

Cat1 OR Cat2: ['Thing 1', 'Thing 5', 'Thing 4']

Aiming to to get: ['Thing 4', 'Thing 5']
Try 1: Cat2 AND Cat3: []
Try 2: Cat2 AND Cat3: ['Thing 1', 'Thing 4', 'Thing 5', 'Thing 2']
Try 3: Cat2 AND Cat3: ['Thing 1', 'Thing 4', 'Thing 5']
Try 4: Cat2 AND Cat3: ['Thing 1', 'Thing 4', 'Thing 5']

I guess if I can work it out in SQL, I could write a custom lookup:

https://docs.djangoproject.com/en/4.2/howto/custom-lookups/

But why do I think this must already have been written? How can this be such a unique and new use case?
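
For reference, the SQL side of this can be sketched with the standard library's sqlite3 module alone. This is only an illustration under assumptions: the table names `thing` and `category_things` are hypothetical stand-ins (Django generates its own names for the join table), and the data mirrors the sandbox output above. It shows why a plain AND on one join row matches nothing, and how grouping and counting expresses "in all of these categories":

```python
import sqlite3

# Hypothetical table names; Django's generated M2M table is named differently.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE thing (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE category_things (thing_id INTEGER, category_id INTEGER);
""")

# Same memberships as the sandbox data: thing id -> category ids
data = {1: [1, 2], 2: [3, 4], 3: [5], 4: [2, 3], 5: [1, 2, 3]}
for thing_id, cat_ids in data.items():
    conn.execute("INSERT INTO thing VALUES (?, ?)", (thing_id, f"Thing {thing_id}"))
    conn.executemany(
        "INSERT INTO category_things VALUES (?, ?)",
        [(thing_id, c) for c in cat_ids],
    )

# Each joined row carries exactly one category_id, so a single row can never
# satisfy "category_id = 2 AND category_id = 3".
naive = conn.execute("""
    SELECT t.name FROM thing t
    JOIN category_things ct ON ct.thing_id = t.id
    WHERE ct.category_id = 2 AND ct.category_id = 3
""").fetchall()

# Keep only rows for the wanted categories, group per thing, and require
# that every wanted category was matched.
wanted = [2, 3]
placeholders = ", ".join("?" for _ in wanted)
rows = conn.execute(f"""
    SELECT t.name FROM thing t
    JOIN category_things ct ON ct.thing_id = t.id
    WHERE ct.category_id IN ({placeholders})
    GROUP BY t.id, t.name
    HAVING COUNT(DISTINCT ct.category_id) = ?
    ORDER BY t.id
""", (*wanted, len(wanted))).fetchall()
in_all = [name for (name,) in rows]

print(naive)
print(in_all)
```

The second query is the shape an ORM expression needs to produce: filter to the wanted categories, group by Thing, and compare the match count with the number of wanted categories.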


Solution

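  • Count, for each Thing, how many of its categories are in the wanted list, then keep the Things whose count reaches the size of that list:

    from django.db.models import Count, Q

    Thing.objects.annotate(
        cat_count=Count('id', filter=Q(categories__in=[2, 3]))
    ).filter(cat_count__gte=2).values('id', 'cat_count')

    Try it. The threshold in cat_count__gte=2 is 2 because two categories (2 and 3) were requested; for a list of N distinct category IDs, use N.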