django django-celery read-replication

Django, multiple databases (writer, read replicas) and a sync issue


So... in response to an API call I do:

i = CertainObject(paramA=1, paramB=2)
i.save()

Now my writer database has a new record.

Processing can take a while and I do not want to delay my response to the API caller, so on the next line I pass the object ID to an async job using Celery:

run_async_job.delay(i.id)

Right away, or a few seconds later depending on the queue, run_async_job tries to load the record with that ID from the database. It's a gamble: sometimes it works and sometimes it doesn't, depending on whether the read replicas have caught up yet.

Is there a pattern that guarantees success, without having to "sleep" for a few seconds before reading or hoping for good luck?

Thanks.


Solution

  • The simplest way seems to be using retries, as mentioned by Greg and Elrond in their answers. If you're using the @shared_task or @app.task decorators, you can use the following code snippet.

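    from celery import shared_task
    # CertainObject is the model from the question; import it from your own app.

    @shared_task(bind=True)
    def your_task(self, certain_object_id):
        try:
            certain_obj = CertainObject.objects.get(id=certain_object_id)
            # Do your stuff
        except CertainObject.DoesNotExist as e:
            # The replica may not have the row yet: retry with exponential backoff.
            raise self.retry(exc=e, countdown=2 ** self.request.retries, max_retries=20)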
    

    I used an exponential countdown between retries. You can modify it according to your needs.

    The Celery task documentation describes custom retry delays (the countdown argument to retry) and also covers exponential backoff for automatic retries (the retry_backoff option).

    When you call retry it’ll send a new message, using the same task-id, and it’ll take care to make sure the message is delivered to the same queue as the originating task. The section on retrying in the Celery documentation covers this in more detail.
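To get a feel for how quickly the `2 ** self.request.retries` countdown grows, here is a minimal standalone sketch in plain Python (no Celery needed). The helper `backoff_delays` and the `cap` parameter are hypothetical names introduced for illustration, and the sketch assumes the retry counter starts at 0 and that at most `max_retries` retries are scheduled.

```python
def backoff_delays(max_retries, base=2, cap=None):
    """Return the countdown (in seconds) used before each retry.

    Mirrors countdown=base ** self.request.retries, assuming the retry
    counter starts at 0 and at most max_retries retries are scheduled.
    """
    delays = []
    for attempt in range(max_retries):
        delay = base ** attempt
        if cap is not None:
            # Optional ceiling so late retries do not wait for days.
            delay = min(delay, cap)
        delays.append(delay)
    return delays


uncapped = backoff_delays(20)
capped = backoff_delays(20, cap=60)

print(uncapped[:5])   # [1, 2, 4, 8, 16]
print(uncapped[-1])   # 524288 (about 6 days)
print(sum(uncapped))  # 1048575 (about 12 days in total)
print(sum(capped))    # 903 (about 15 minutes in total)
```

Replica lag is usually a matter of seconds, so a lower max_retries or a capped countdown such as `countdown=min(2 ** self.request.retries, 60)` may be a better fit than letting the delay grow unbounded.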