python django postgresql django-models django-model-field

Django group by substring on a field

I have a Django, PostgreSQL project. I want to perform a group by on substring from a field. Ex I have a model as Model1 and its column as name. The value in the column can be:

ABC-w-JAIPUR-123
XYZ-x-JAIPUR-236
MNO-y-SURAT-23
DEF-q-SURAT-23

From the above values in name field, I want to group by second occurrence of - and ending with - in our case it would be: "JAIPUR", "SURAT"

Please let me know how to achieve this in Django.

UPDATE: So far, I have tried:

Model1.objects.annotate(substring=Substr('name', F('name').index('-')+1, (F('name', output_field=CharField())[:F('name').rindex('-')]).index('-'))).values('substring').annotate(count=Count('id'))

but this is giving error:

AttributeError: 'F' object has no attribute 'index'

Solution

Well, an F object ^[Django-doc] is just an object to reference to a column. It is not the string value of that column, so you can not call .index(…) on this for example.

But more bad news is that its equivalent, StrIndex ^[Django-doc] can not easily determine the second index, so we will have to fix that as well. We can however fix this by cutting the string twice, and thus work with:

from django.db.models import F, Value
from django.db.models.functions import Left, Length, Right, StrIndex

hyp = Value('-')
drop₁ = Right('name', Length('name') - StrIndex('name', hyp))
drop₂ = Right(drop₁, Length(drop₁) - StrIndex(drop₁, hyp))
drop₃ = Left(drop₂, StrIndex(drop₂, hyp) - 1)

Then we thus can for example annotate with:

Model₁.objects.annotate(foo=drop₃)

The drop₁ will thus have dropped ABC-, the second drop₂ drops w- and finally drop₃ will drop all the rest of the groups, so it thus returns JAIPUR instead.

That being said, please don't make fields that somehow group data together. Store the different items in different columns. Combining is often a lot easier than extracting.