Tags: azure-functions, azure-cosmosdb, azure-triggers, azure-cosmosdb-changefeed, cosmosdbtrigger

Does CosmosDBTrigger reliably process each document exactly once?


My customer would like to use a CosmosDBTrigger to transfer documents to an Azure Service Bus. In this scenario it's important to have a 1:1 relationship between Cosmos item mutations and Service Bus messages, so each document that the trigger receives (via batch) must be processed exactly once. This leads to some fundamental questions I've been unable to answer:

If the net answer is that this trigger is not reliable, what are its intended use cases?

Thanks

-John


Solution

  • The short answer is no. The Cosmos DB Trigger provides "at least once" delivery, which means that in some cases an item can be delivered more than once.

    1. As per the Troubleshooting guide linked below (and consistent with other event-based Azure Functions triggers), unhandled exceptions won't cause the batch to be retried. There is a design proposal from the Functions team to give all event-based triggers a retry configuration (https://github.com/jeffhollan/retry-design); once that is implemented, you will be able to define a retry policy for the Cosmos DB Trigger too.
    2. If a runtime issue stops the current batch at any point in your Function code, then when the runtime starts again it will retry the entire batch. There is no way to know how far into the batch your code had read or processed, because the Function's user code is opaque to the runtime. The Function checkpoints on the lease store only after a complete execution, so if the runtime stops midway (because you stopped it manually or because some event stopped the runtime), the checkpoint never happens and the lease store still holds the previous mark.
    3. As mentioned in point 1, event-based triggers have no retry behavior, so there is no way to retry a batch based on some condition.
    4. Yes. As mentioned, the Trigger provides "at least once" delivery. The same document can be processed again if the number of instances changes due to a scaling event, in which case the Trigger rebalances the leases across the new instances.

    For reference, see the Troubleshooting guide: https://learn.microsoft.com/azure/cosmos-db/troubleshoot-changefeed-functions.
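
Because delivery is "at least once", a common way to approximate the 1:1 relationship is to make the forwarding step idempotent. The sketch below is only an illustration in plain Python, not the trigger's or Service Bus's actual API: `InMemoryBus` is a hypothetical stand-in for a queue that drops messages whose ID it has already seen (Service Bus offers duplicate detection keyed on the message ID within a configured time window), and `message_id_for`, `handle_batch`, and `simulate` are made-up names. It assumes each document carries `id` and `_etag`, and that `id` is unique across the container.

```python
import hashlib


def message_id_for(doc: dict) -> str:
    """Derive a stable message ID from the document id and its _etag.

    Assumption: `id` is unique in the container and `_etag` changes on
    every write, so each mutation maps to exactly one ID.
    """
    raw = f"{doc['id']}:{doc['_etag']}"
    return hashlib.sha256(raw.encode("utf-8")).hexdigest()


class InMemoryBus:
    """Hypothetical stand-in for a queue with duplicate detection."""

    def __init__(self):
        self._seen = set()
        self.messages = []

    def send(self, message_id: str, body: dict) -> bool:
        """Accept the message unless its ID was already seen."""
        if message_id in self._seen:
            return False  # duplicate, silently dropped
        self._seen.add(message_id)
        self.messages.append(body)
        return True


def handle_batch(docs, bus, fail_at=None):
    """Forward each document; raise at index `fail_at` to imitate a crash."""
    for i, doc in enumerate(docs):
        if fail_at is not None and i == fail_at:
            raise RuntimeError("simulated runtime stop mid-batch")
        bus.send(message_id_for(doc), doc)


def simulate():
    """Deliver one batch twice, as happens when no checkpoint was written."""
    batch = [
        {"id": "a", "_etag": "1"},
        {"id": "b", "_etag": "1"},
        {"id": "c", "_etag": "1"},
    ]
    bus = InMemoryBus()
    try:
        handle_batch(batch, bus, fail_at=2)  # "a" and "b" sent, then crash
    except RuntimeError:
        pass  # no checkpoint happened, so the whole batch comes back
    handle_batch(batch, bus)  # redelivery of the entire batch
    return [m["id"] for m in bus.messages]


if __name__ == "__main__":
    print(simulate())
```

In this simulation the second delivery resends "a" and "b", but the stand-in bus drops them because their IDs were already seen, so each mutation ends up in the queue once. A real deployment would depend on the duplicate-detection window being long enough to cover the redelivery.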