django many-to-many orphan orphaned-objects

Efficiently delete orphaned m2m objects/tags in Django


I have two models - Photo and Tag - which are connected via a ManyToManyField.

from django.db import models

class Tag(models.Model):
    lang = models.CharField(max_length=2)
    name_es = models.CharField(max_length=40)
    name_en = models.CharField(max_length=40)

class Photo(models.Model):
    # Tag is defined above so it can be referenced directly here
    tags = models.ManyToManyField(Tag)

Every once in a while we end up with orphaned tags that are no longer referenced by any photo. Is there an efficient way of deleting those tags? I know about this answer: Django: delete M2M orphan entries?

Our solution currently looks like this:

for tag in Tag.objects.all():
    if not tag.photo_set.select_related():
        tag.delete()

However, as the database grows, the runtime of this script is becoming distressingly high :-P Is there an efficient way of getting a list of all tag IDs from the tags table and a list of all tag IDs from the many-to-many table, and then finding the IDs that appear in the first list but not in the second?


Solution

  • Try a subquery against the intermediate (through) table:

    qs = Tag.objects.exclude(pk__in=Photo.tags.through.objects.values('tag'))
    
    # then you could
    qs.delete()
    
    # or, if a custom Tag.delete() must run for each item
    # (qs.delete() sends pre_delete/post_delete signals but skips model delete() methods)
    for x in qs:
        x.delete()
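
For intuition, the subquery approach above boils down to roughly one SQL statement that the database can execute without loading every tag into Python. Below is a minimal sketch using only the standard-library sqlite3 module. The table and column names (app_tag, app_photo, app_photo_tags, tag_id) are assumptions that loosely mimic Django's default naming for a hypothetical app called "app"; the real names and the exact SQL Django generates depend on your project and database backend.

```python
import sqlite3

# Hypothetical schema: names are assumptions, not taken from the question.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE app_tag (id INTEGER PRIMARY KEY, name_en TEXT);
    CREATE TABLE app_photo (id INTEGER PRIMARY KEY);
    CREATE TABLE app_photo_tags (
        id INTEGER PRIMARY KEY,
        photo_id INTEGER NOT NULL REFERENCES app_photo(id),
        tag_id INTEGER NOT NULL REFERENCES app_tag(id)
    );
    INSERT INTO app_tag (id, name_en) VALUES (1, 'beach'), (2, 'city'), (3, 'forest');
    INSERT INTO app_photo (id) VALUES (10), (11);
    -- Tag 2 ('city') is never referenced, so it is the orphan.
    INSERT INTO app_photo_tags (photo_id, tag_id) VALUES (10, 1), (11, 1), (11, 3);
""")

# One statement removes every tag whose id never appears in the join table.
cursor = conn.execute(
    "DELETE FROM app_tag WHERE id NOT IN (SELECT tag_id FROM app_photo_tags)"
)
deleted_count = cursor.rowcount
conn.commit()

remaining_names = [
    row[0] for row in conn.execute("SELECT name_en FROM app_tag ORDER BY id")
]
print(deleted_count, remaining_names)  # prints: 1 ['beach', 'forest']
```

The point of the sketch is that the filtering happens inside the database in a single pass, instead of issuing one query per tag as the loop in the question does. Note that NOT IN behaves unexpectedly if the subquery can return NULL; here tag_id is declared NOT NULL, which is also what one would expect for a many-to-many join table.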