pythondjangofilterdjango-queryset

Django filter queryset __in for *every* item in list


Let's say I have the following models

class Photo(models.Model):
    tags = models.ManyToManyField(Tag)

class Tag(models.Model):
    name = models.CharField(max_length=50)

In a view I have a list with active filters called categories. I want to filter Photo objects which have all tags present in categories.

I tried:

Photo.objects.filter(tags__name__in=categories)

But this matches any item in categories, not all items.

So if categories would be ['holiday', 'summer'] I want Photo's with both a holiday and summer tag.

Can this be achieved?


Solution

  • Summary:

    One option is, as suggested by jpic and sgallen in the comments, to add .filter() for each category. Each additional filter adds more joins, which should not be a problem for small set of categories.

    There is the aggregation approach. This query would be shorter and perhaps quicker for a large set of categories.

    You also have the option of using custom queries.


    Some examples

    Test setup:

    class Photo(models.Model):
        tags = models.ManyToManyField('Tag')
    
    class Tag(models.Model):
        name = models.CharField(max_length=50)
    
        def __unicode__(self):
            return self.name
    
    In [2]: t1 = Tag.objects.create(name='holiday')
    In [3]: t2 = Tag.objects.create(name='summer')
    In [4]: p = Photo.objects.create()
    In [5]: p.tags.add(t1)
    In [6]: p.tags.add(t2)
    In [7]: p.tags.all()
    Out[7]: [<Tag: holiday>, <Tag: summer>]
    

    Using chained filters approach:

    In [8]: Photo.objects.filter(tags=t1).filter(tags=t2)
    Out[8]: [<Photo: Photo object>]
    

    Resulting query:

    In [17]: print Photo.objects.filter(tags=t1).filter(tags=t2).query
    SELECT "test_photo"."id"
    FROM "test_photo"
    INNER JOIN "test_photo_tags" ON ("test_photo"."id" = "test_photo_tags"."photo_id")
    INNER JOIN "test_photo_tags" T4 ON ("test_photo"."id" = T4."photo_id")
    WHERE ("test_photo_tags"."tag_id" = 3  AND T4."tag_id" = 4 )
    

    Note that each filter adds more JOINS to the query.

    Using annotation approach:

    In [29]: from django.db.models import Count
    In [30]: Photo.objects.filter(tags__in=[t1, t2]).annotate(num_tags=Count('tags')).filter(num_tags=2)
    Out[30]: [<Photo: Photo object>]
    

    Resulting query:

    In [32]: print Photo.objects.filter(tags__in=[t1, t2]).annotate(num_tags=Count('tags')).filter(num_tags=2).query
    SELECT "test_photo"."id", COUNT("test_photo_tags"."tag_id") AS "num_tags"
    FROM "test_photo"
    LEFT OUTER JOIN "test_photo_tags" ON ("test_photo"."id" = "test_photo_tags"."photo_id")
    WHERE ("test_photo_tags"."tag_id" IN (3, 4))
    GROUP BY "test_photo"."id", "test_photo"."id"
    HAVING COUNT("test_photo_tags"."tag_id") = 2
    

    ANDed Q objects would not work:

    In [9]: from django.db.models import Q
    In [10]: Photo.objects.filter(Q(tags__name='holiday') & Q(tags__name='summer'))
    Out[10]: []
    In [11]: from operator import and_
    In [12]: Photo.objects.filter(reduce(and_, [Q(tags__name='holiday'), Q(tags__name='summer')]))
    Out[12]: []
    

    Resulting query:

    In [25]: print Photo.objects.filter(Q(tags__name='holiday') & Q(tags__name='summer')).query
    SELECT "test_photo"."id"
    FROM "test_photo"
    INNER JOIN "test_photo_tags" ON ("test_photo"."id" = "test_photo_tags"."photo_id")
    INNER JOIN "test_tag" ON ("test_photo_tags"."tag_id" = "test_tag"."id")
    WHERE ("test_tag"."name" = holiday  AND "test_tag"."name" = summer )