djangodjango-modelsdjango-querysetdjango-annotatedjango-postgresql

Django Queryset: group by two field values


I have a model like this

Class ExchangeMarket(models.Model):
base_currency = models.ForeignKey(Currency)
quote_currency = models.ForeignKey(Currency)
... various other fields

And some entries like

base: EUR, quote: USD ...
base: EUR, quote: USD ...
base: USD, quote: EUR ...
base: BTC, quote: USD ...
...

I need to find entries sharing the base/quote combination, i.e. in my example that's the first three

How to create such a queryset? If I do something like

ExchangeMarket.objects.all().values('base_currency', 'quote_currency').annotate(pks=ArrayAgg('pk'))

I will only find exact matches with same bases and quotes (like EUR/USD, first two of my entries) while I need to find both EUR/USD and USD/EUR. Thanks!


Solution

  • You can work with Least [Django-doc] and Greatest [Django-doc] here:

    from django.db.models.functions import Greatest, Least
    
    ExchangeMarket.objects.values(
        cur1=Least('base_currency', 'quote_currency'),
        cur2=Greatest('base_currency', 'quote_currency'),
    ).annotate(pks=ArrayAgg('pk'))

    This will then add the primary key to the two same items.

    That being said, I would advise to do some post-processing in Python, and work with .groupby(…) [Python-doc]:

    from itertools import attrgetter
    
    from django.db.models.functions import Greatest, Least
    
    items = {
        k: list(vs)
        for k, vs in groupby(
            ExchangeMarket.objects.annotate(
                cur1=Least('base_currency', 'quote_currency'),
                cur2=Greatest('base_currency', 'quote_currency'),
            ).order_by('cur1', 'cur2'),
            attrgetter('cur1', cur2),
        )
    }

    We even can do additional processing where we also add the (k2, k1) entry to the dictionary, if the (k1, k2) entry is already passed:

    for k1, k2 in list(items):
        items[k2, k1] = items[k1, k2]

    This means we can obtain the ExchangeMarket objects for items['EUR', 'USD'] and items['USD', 'EUR'].