I would like to understand how partition distribution is handled behind the Azure Functions Cosmos DB trigger. I have two Azure Functions apps with the same lease prefix listening to the change feed. Let's say the container has N partitions: how exactly are the partitions distributed between the two function apps? I read that the apps try to acquire leases over the partitions. What prevents a single function app from acquiring leases over all partitions and leaving the other app idle? Does the change feed processor library underneath the Functions trigger coordinate via the lease container to prevent this?
You can see the source code on GitHub.
The PartitionLoadBalancer runs in a loop and periodically checks for leases to take.
For EqualPartitionsBalancingStrategy, this calculates a target number of leases for the host to process, based on the total number of leases (one per physical partition) and the number of hosts, counting itself plus any others recorded in the leases collection as lease owners.
If the host already owns enough leases to meet its target, it doesn't take any. Otherwise it tries to reach its target by taking leases that have no owner or that have expired. Only if there are none of those will it steal a single lease from the other host that currently owns the most.
This process runs again after a delay of LeaseAcquireInterval.
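To make the steps above concrete, here is a minimal Python sketch of one balancing round. It is a simplified model of the behaviour described, not the library's actual C# code: the Lease class, the select_leases_to_take function and the exact steal condition (steal only when the busiest host holds more than the target) are my own assumptions for illustration.

```python
import math
from collections import Counter
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Lease:
    """Hypothetical stand-in for a lease document (one per physical partition)."""
    partition_id: str
    owner: Optional[str] = None   # None means nobody holds the lease
    expired: bool = False         # True if the owner stopped renewing it


def select_leases_to_take(host: str, leases: List[Lease]) -> List[Lease]:
    """Return the leases `host` should try to acquire in this round."""
    if not leases:
        return []

    # Count active leases per owner. The calling host always counts as a
    # worker, even if it currently owns nothing.
    counts = Counter({host: 0})
    available = []
    for lease in leases:
        if lease.owner is None or lease.expired:
            available.append(lease)
        else:
            counts[lease.owner] += 1

    # Target is an even share of all leases, rounded up.
    target = max(1, math.ceil(len(leases) / len(counts)))
    needed = target - counts[host]
    if needed <= 0:
        return []  # already at or above target: take nothing

    # Prefer unowned or expired leases, up to the number still needed.
    if available:
        return available[:needed]

    # Otherwise steal at most one lease from the host that owns the most
    # (assumed condition: only if that host is above the target).
    busiest, busiest_count = counts.most_common(1)[0]
    if busiest_count > target:
        return [next(l for l in leases if l.owner == busiest and not l.expired)]
    return []
```

In this model, a host that starts alone does take every lease, because it is the only known worker and its target is all N. When a second host appears, it sees itself as under target and steals one lease per round until the counts even out, so with 4 partitions and hosts A and B the split settles at 2 and 2 after two rounds.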
I haven't dug in that deep, but I assume the "theft" is communicated simply by updating the relevant lease in the leases collection, and the original owner notices it the next time it tries to update that lease.