python django django-queryset

Filter a query set depending on state at a given date


Given the following model (using django-simple-history):

from django.db import models
from simple_history.models import HistoricalRecords

class MyModel(models.Model):
    status = models.IntegerField()
    history = HistoricalRecords()

I would like to get all instances that didn't have a certain status on a given date (i.e. all instances that had a different status on the limit date, plus all instances that didn't exist at that time).

The following query will return all instances that never had status = 4 at any point before the limit date:

from django.db.models import Exists, OuterRef

MyModel.objects.filter(~Exists(
    MyModel.history.filter(
        id=OuterRef("id"),
        history_date__lte=limit_date,
        status=4)))

But unfortunately it also removes instances that had status = 4 at some past date, then changed to a different status by the limit date, and I want to keep those.

The following should give the correct result:

MyModel.objects.filter(~Exists(
    MyModel.history.filter(
        id=OuterRef("id"),
        history_date__lte=limit_date)
    .order_by("-history_date")
    [:1]
    .filter(status=4)))  # filtering after the slice is rejected

Unfortunately it doesn't work; it fails with "Cannot filter a query once a slice has been taken". A related question points to the Django documentation, which explains that filtering is not allowed after the queryset has been sliced.

Note that the error comes from an assert in Django. If I comment out the assert in django/db/models/query.py:953, the code appears to work and gives the expected result. However, commenting out an assert in an upstream dependency is not a viable solution in production.

So is there a clean way to filter my queryset depending on some past state of the object?


Solution

  • The history model saves a record only when the item changes, not every day. The status at a given date is therefore the status of the most recent historical record on or before that date, which we can obtain with:

    from django.db.models import OuterRef, Q, Subquery

    # Annotate each instance with its status as of limit_date
    # (NULL if it had no history yet), then keep those that were not 4.
    MyModel.objects.annotate(
        historic_status=Subquery(
            MyModel.history.filter(id=OuterRef('id'), history_date__lte=limit_date)
            .order_by('-history_date')
            .values('status')[:1]
        )
    ).filter(~Q(historic_status=4) | Q(historic_status=None))

    The subquery selects the historical records of the instance with a history_date on or before limit_date. Ordering them by history_date descending and taking the first row yields the most recent status as of that date.

    This sets historic_status to the status at limit_date or, if the instance did not exist yet at that time, to NULL (None).

    We then keep the MyModels whose historic_status is not 4, with an explicit NULL check for the instances that did not exist yet. The shorter filter below may be sufficient in some setups, but in SQL NOT (NULL = 4) evaluates to NULL rather than true, so without the explicit check instances with no history on or before limit_date could be dropped. Verify the result before relying on it:

    from django.db.models import OuterRef, Q, Subquery

    # Same annotation, without the explicit NULL check
    MyModel.objects.annotate(
        historic_status=Subquery(
            MyModel.history.filter(id=OuterRef('id'), history_date__lte=limit_date)
            .order_by('-history_date')
            .values('status')[:1]
        )
    ).filter(~Q(historic_status=4))
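
To see what the annotation-based filter is doing at the SQL level, here is a minimal sketch that is not Django code. It uses Python's built-in sqlite3 module with a hypothetical two-table schema (mymodel and history, loosely mirroring the model and its historical table) and made-up rows. The helper name ids_without_status and its include_missing flag are illustrative assumptions, and the SQL is hand-written rather than what the ORM actually emits. It shows a correlated subquery with ORDER BY ... LIMIT 1 picking the status as of the limit date, and how the explicit NULL check affects instances that did not exist yet.

```python
import sqlite3

# Hypothetical schema and rows, for illustration only.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE mymodel (id INTEGER PRIMARY KEY, status INTEGER);
    CREATE TABLE history (
        history_id INTEGER PRIMARY KEY,
        id INTEGER,
        status INTEGER,
        history_date TEXT
    );
    INSERT INTO mymodel (id, status) VALUES (1, 4), (2, 5), (3, 4), (4, 1);
    INSERT INTO history (id, status, history_date) VALUES
        (1, 4, '2020-01-01'),   -- status 4 on the limit date
        (2, 4, '2020-01-01'),
        (2, 5, '2020-02-01'),   -- was 4 earlier, but 5 on the limit date
        (3, 4, '2020-06-01'),   -- created after the limit date
        (4, 1, '2020-01-15');   -- never had status 4
""")

# Correlated scalar subquery: most recent status on or before the limit date.
# It evaluates to NULL when the instance has no history row in that range.
STATUS_AT_DATE = """
    SELECT h.status FROM history h
    WHERE h.id = m.id AND h.history_date <= :limit_date
    ORDER BY h.history_date DESC
    LIMIT 1
"""


def ids_without_status(status, limit_date, include_missing):
    """Return ids whose status at limit_date differs from `status`.

    With include_missing=True, ids with no history yet are kept as well.
    """
    null_clause = f"OR ({STATUS_AT_DATE}) IS NULL" if include_missing else ""
    sql = f"""
        SELECT m.id FROM mymodel m
        WHERE NOT (({STATUS_AT_DATE}) = :status) {null_clause}
        ORDER BY m.id
    """
    rows = conn.execute(sql, {"status": status, "limit_date": limit_date})
    return [row[0] for row in rows]


print(ids_without_status(4, "2020-03-01", include_missing=True))   # [2, 3, 4]
print(ids_without_status(4, "2020-03-01", include_missing=False))  # [2, 4]
```

In this sketch, id 2 is kept because its latest record before the limit date has status 5, and id 3 (created later) is only returned when the NULL case is handled explicitly. Whether the ORM adds such a check for you on an annotated subquery is worth confirming by inspecting the generated SQL, for example with str(queryset.query).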