Recently, one of our clients made an excessive number of transaction API calls to the dedicated Event Hubs blob storage account. I believe they were using the Event Hubs Event Processor to retrieve the data. At one point, over a period of more than 6 hours, they made over 300k calls to the ListBlobs API. These additional transactions added a $50 daily cost to our subscription.
Is there a way for us to limit or cap the transactions on storage accounts used by Event Hubs?
I have already set up an alert rule to get notified when blob transactions reach the threshold.
Thanks
Blob storage is used by processor types to track partition ownership and is critical infrastructure for load balancing. Your options to reduce calls:
Increase the LoadBalancingUpdateInterval on your processor options so that load balancing cycles run less frequently. The tradeoff is that your processor will be slower to react when other processors crash or are removed from the cluster during scale-down, which can delay processing of some events.
Use static partition assignment and remove the need for load balancing. (You would also want to set the interval to an infinite or very large duration.) The tradeoff here is that your processor will only ever work with the assigned partitions. This is generally recommended only when you're running in an orchestrated environment, like Kubernetes, where an orchestrator has responsibility for monitoring node health and restarting dead nodes.
Create your own CheckpointStore implementation that does not use Blob storage for ownership-related activities and extend PluggableCheckpointStoreEventProcessor. The tradeoff is just the extra work needed for the implementation. The processor doesn't care where the data comes from.
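To get a feel for how the first option affects transaction volume, here is a rough back-of-the-envelope sketch. It assumes each processor issues exactly one ownership listing call per load balancing cycle and ignores every other storage call (renewals, claims, checkpoint reads and writes). The function name, the processor count, and the interval values are made up for illustration; they are not SDK defaults.

```python
def estimated_list_calls(processors: int, interval_seconds: float, window_seconds: float) -> int:
    """Estimate ownership listing calls in a time window.

    Assumption: one listing call per processor per load balancing cycle.
    """
    cycles_per_processor = int(window_seconds // interval_seconds)
    return processors * cycles_per_processor


six_hours = 6 * 60 * 60  # the window mentioned in the question

# Hypothetical fleet of 4 processors at a few illustrative intervals.
for interval in (10, 30, 60):
    calls = estimated_list_calls(processors=4, interval_seconds=interval, window_seconds=six_hours)
    print(f"interval={interval}s -> about {calls} listing calls in 6 hours")
```

Under this assumption the call count scales linearly with the number of processors and inversely with the interval, so doubling the interval roughly halves the listing traffic.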
Default configuration:
Calls to Event Hubs:
Calls to Storage:
Load balancing is best thought of as a naive competing consumer that runs in a giant while loop on a predictable schedule. It attempts to maintain a fair balance of the work by prioritizing its own "fair share" of partitions and will favor stability if there isn't a clear imbalance.
At a high level, each cycle will:
Query Event Hub partitions; update state
Renew Local Ownership
List All Ownership
Calculate Ownership
Claim Ownership
Determine balance stability
Ensure owned partitions are being processed
Calculate next cycle time
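The cycle above can be sketched as a small simulation. This is not the SDK's implementation: it is a simplified in-memory model written only to show where the storage calls come from. It covers the renew, list, calculate, and claim steps, assumes one renew call per owned partition, and leaves out querying Event Hubs, stealing partitions from over-subscribed owners, and scheduling. All names (OwnershipStore, run_cycle, simulate) are hypothetical.

```python
import math
from dataclasses import dataclass, field


@dataclass
class OwnershipStore:
    """In-memory stand-in for the ownership store; counts each kind of call."""
    owners: dict = field(default_factory=dict)  # partition id -> (owner id, last renewed)
    calls: dict = field(default_factory=lambda: {"list": 0, "renew": 0, "claim": 0})

    def list_ownership(self):
        self.calls["list"] += 1
        return dict(self.owners)

    def renew(self, owner, partition, now):
        self.calls["renew"] += 1
        self.owners[partition] = (owner, now)

    def claim(self, owner, partition, now):
        self.calls["claim"] += 1
        self.owners[partition] = (owner, now)


def run_cycle(owner, owned, partitions, store, now, expiration):
    """Run one simplified cycle for `owner`; return True when no claim was made."""
    # Renew local ownership (assumed here: one call per owned partition).
    for partition in owned:
        store.renew(owner, partition, now)
    # List all ownership and ignore records that have expired.
    snapshot = store.list_ownership()
    live = {p: o for p, (o, renewed) in snapshot.items() if now - renewed < expiration}
    # Calculate this processor's fair share among the owners it can see.
    active = set(live.values()) | {owner}
    fair_share = math.ceil(len(partitions) / len(active))
    # Claim at most one unowned partition per cycle when below the fair share.
    if len(owned) < fair_share:
        unowned = [p for p in partitions if p not in live]
        if unowned:
            store.claim(owner, unowned[0], now)
            owned.append(unowned[0])
            return False
    return True


def simulate(owner_ids, partitions, cycles, interval, expiration):
    """Run every processor once per cycle and return the store and local ownership."""
    store = OwnershipStore()
    owned = {owner: [] for owner in owner_ids}
    for cycle in range(cycles):
        for owner in owner_ids:
            run_cycle(owner, owned[owner], partitions, store, cycle * interval, expiration)
    return store, owned


store, owned = simulate(["A", "B"], ["0", "1", "2", "3"], cycles=6, interval=10, expiration=30)
print(store.calls)  # call counts per type
print(owned)        # partitions held by each processor
```

Even once ownership is balanced, every processor keeps listing and renewing on each cycle in this model, which is why the cycle interval and the number of processors drive the steady-state transaction count.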