azureerror-handlingazure-functionsmessage-queue

How to handle an Azure Function rerunning when using message queue binding?


I have a v1 Azure Function that is triggered by a message being written to the Azure Storage message queue.

The Azure Function needs to perform multiple updates to SharePoint Online. Occasionally these operations fail. This results in the message being returned to the queue and being reprocessed.

When I developed the function, I didn't consider that it might partially complete and then restart. I've done a little research and it sounds like I need to modify it to be re-entrant.

Is there a design pattern that I should follow to cater for this without having to add a lot of checks to determine if an operation has already been carried out by a previous execution? Alternatively, is there any Azure functionality that can help (beyond the existing message retries and poison queue)


Solution

  • It sounds like you will need to do some re-engineering. Our team had a similar issue and wrote a home-grown solution years ago. But we eventually scrapped our solution and went with Azure Durable Functions.

    Not gonna lie - this framework has some complexity and it took me a bit to wrap my head around it. Check out the function chaining pattern.

    We have processing that requires multiple steps that all must be completed. We're spanning multiple data stores (Updating Cosmos Db, Azure SQL, Blob Storage, etc), so there's no support for distributed transactions across multiple PaaS offerings. Durable Functions will allow you to break your process up into discrete steps. If a step fails, an orchestrator will re-run that step based on a retry policy.

    So in a nutshell, we use Durable Task Activity functions to attempt each step. If the step fails due to what we think is a transient error, we retry. If it's an unrecoverable error, we don't retry.