pythondjangodjango-modelsdjango-ormdjango-postgresql

Django ORM aggregate over related array field


I have two models

class Record(Model):
    scorable_entry = models.ForeignKey('Entry',
                                       null=True,
                                       blank=True,
                                       on_delete=models.CASCADE)

class Entry(Model):
    scores = ArrayField(models.IntegerField(), null=True)

and I need to sort Records based on the sum of scores on a related Entry model.

Unfortunately, this naive code throws an error

records
.annotate(score_rows=Func(F('scorable_entry__scores'), function='unnest'))
.annotate(scores_sum=sum('score_rows'))
.order_by('-scores_sum')
django.db.utils.NotSupportedError: aggregate function calls cannot contain set-returning function calls

I'm using unnest to convert array to rows first (because otherwise sum wouldn't work). Skipping unnesting doesn't work as sum doesn't operate on arrays

django.db.utils.ProgrammingError: function sum(integer[]) does not exist
HINT:  No function matches the given name and argument types. You might need to add explicit type casts.

What is the proper way to order elements by the sum of a related array using ORM?

Django 3.1, Postgres 12.9


Solution

  • You can create postgres function to sum up int[] and use it in annotation

    create or replace function int_array_sum(arr int[])
    returns int
    language plpgsql
    as
    $$
    declare
        result integer;
    begin
       select sum(a) into result
       from unnest(arr) a;
    
        return result;
    end;
    $$;
    

    Here the query

    Record.objects
        .annotate(scores_sum=Func(F('scorable_entry__scores'), function='int_array_sum'))
        .order_by('-scores_sum')