pythondjangopostgresqllockingdjango-orm

Does "select_for_update()" work with the update method in Django?


The documentation for Django 2.2, which I'm using, gives the following example usage for select_for_update:

from django.db import transaction

entries = Entry.objects.select_for_update().filter(author=request.user)
with transaction.atomic():
    for entry in entries:
        ...

Using this approach, one would presumably mutate the model instances assigned to entry and call save on these.

There are cases where I'd prefer the alternative approach below, but I'm unsure whether it would work (or even make sense) with select_for_update.

with transaction.atomic():
    Entry.objects.select_for_update().filter(author=request.user).update(foo="bar", wobble="wibble")

The documentation states that the lock is created when the queryset is evaluated, so I doubt the update method would work. As far as I'm aware update just performs an UPDATE ... WHERE query, with no SELECT before it. However, I would appreciate it if someone more experienced with this aspect of the Django ORM could confirm this.

A secondary question is whether a lock even adds any protection against race conditions if one makes a single UPDATE query against the locked rows. (I've entered this train of thought because I'm refactoring code that uses a lock when updating the values of two columns of a single row.)


Solution

  • As far as I'm aware update just performs an UPDATE ... WHERE query, with no SELECT before it

    Yes, that's correct. You could confirm this by looking at the actual queries made. Using the canonical django tutorial "polls" app as an example:

    with transaction.atomic():
        qs = polls.models.Question.objects.select_for_update().all()
        qs.update(question_text='test')
    
    print(connection.queries)
    # {'sql': 'UPDATE "polls_question" SET "question_text" = \'test\'', 'time': '0.008'}
    
    

    So, as you expect, there is no SELECT.

    Though, ensuring the lock is acquired would be as simple as doing anything to cause the queryset to be evaluated.

    with transaction.atomic():
        qs = polls.models.Question.objects.select_for_update().all()
        list(qs) # cause evaluation, locking the selected rows
        qs.update(question_text='test')
    
    print(connection.queries)
    #[...
    # {'sql': 'SELECT "polls_question"."id", "polls_question"."question_text", "polls_question"."pub_date" FROM "polls_question" FOR UPDATE', 'time': '0.003'},
    # {'sql': 'UPDATE "polls_question" SET "question_text" = \'test\'', 'time': '0.001'}
    #]
    

    A secondary question is whether a lock even adds any protection against race conditions if one makes a single UPDATE query against the locked rows

    In general, yes. Whether it is necessary in a particular situation depends what kind of race condition you're worried about. The lock will prevent race conditions where another transaction may try to update the same row, for example.

    Race conditions can be avoided without locks, too, depending on the nature of the update/race condition. Sometimes a transaction is sufficient, sometimes it's not. You may also use expressions which are evaluated server-side on the db to prevent race conditions (e.g. using Django's F() expressions).

    There are also other considerations, like your db dialect, isolation levels, and more.

    Additional reference on race condition thoughts: PostgreSQL anti-patterns: read-modify-write cycles (archive)