azure azure-cosmosdb azure-cosmosdb-sqlapi

Why are updates slow in session consistency and fast in strong consistency in Azure Cosmos DB?

I am using the YCSB tool to benchmark Cosmos DB at the strong vs. session consistency levels. I have a single write region in East US and a read replica region in West US. Container throughput is set to manual at 6000 RU/s.

I have 2 clients on 2 VMs (in the East US region), each using 50 threads. I have 200,000 records, each 1 KB in size. I am running a read-modify-write workload (workloadf in YCSB). These are my findings:

STRONG: 
 
[OVERALL], RunTime(ms), 1802004  
[OVERALL], Throughput(ops/sec), 166.4813174665539  
[READ], Operations, 300000  
[READ], AverageLatency(us), 201056.00650333334  
[READ], MinLatency(us), 1581  
[READ], MaxLatency(us), 2156543  
[READ], 95thPercentileLatency(us), 603135  
[READ], 99thPercentileLatency(us), 1172479  
[READ], Return=OK, 300000  
[READ-MODIFY-WRITE], Operations, 300000  
[READ-MODIFY-WRITE], AverageLatency(us), 298607.42890666664  
[READ-MODIFY-WRITE], MinLatency(us), 78080  
[READ-MODIFY-WRITE], MaxLatency(us), 2306047  
[READ-MODIFY-WRITE], 95thPercentileLatency(us), 717823  
[READ-MODIFY-WRITE], 99thPercentileLatency(us), 1265663  
[UPDATE], Operations, 300000  
[UPDATE], AverageLatency(us), 97547.07946666666  
[UPDATE], MinLatency(us), 75776  
[UPDATE], MaxLatency(us), 1281023  
[UPDATE], 95thPercentileLatency(us), 121151  
[UPDATE], 99thPercentileLatency(us), 128063  
[UPDATE], Return=OK, 300000  

SESSION:
 
[OVERALL], RunTime(ms), 1350937  
[OVERALL], Throughput(ops/sec), 222.0680905179146  
[READ], Operations, 300000  
[READ], AverageLatency(us), 28163.643546666666  
[READ], MinLatency(us), 1426  
[READ], MaxLatency(us), 1573887  
[READ], 95thPercentileLatency(us), 101503  
[READ], 99thPercentileLatency(us), 508415  
[READ], Return=OK, 300000  
[READ-MODIFY-WRITE], Operations, 300000  
[READ-MODIFY-WRITE], AverageLatency(us), 221261.26997333334  
[READ-MODIFY-WRITE], MinLatency(us), 85760  
[READ-MODIFY-WRITE], MaxLatency(us), 3274751  
[READ-MODIFY-WRITE], 95thPercentileLatency(us), 648703  
[READ-MODIFY-WRITE], 99thPercentileLatency(us), 1141759  
[UPDATE], Operations, 300000  
[UPDATE], AverageLatency(us), 193093.33994666667  
[UPDATE], MinLatency(us), 83712  
[UPDATE], MaxLatency(us), 3110911  
[UPDATE], 95thPercentileLatency(us), 626175  
[UPDATE], 99thPercentileLatency(us), 1129471  
[UPDATE], Return=OK, 300000

I am quite surprised by the results: updates are fast under strong consistency and slow under session consistency, and the reverse holds for reads. The read-modify-write results are as expected, though.

Solution

Based on the comments:

Java SDK 4.8 is pretty old. The recommended version is 4.64.0. Ideally, upgrade before drawing any meaningful comparison.
If an Update is really Read -> Update, then with Session consistency the read can land on a replica that is lagging behind, which triggers a retry against another replica. This process has been greatly improved in newer versions of the SDKs. With Strong, that doesn't happen.
Without the diagnostics, it's hard to tell whether the latency comes from the Cosmos DB side or from something in between driving it up.
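
To make the retry point above concrete, here is a toy model in plain Java. It is not the SDK's actual implementation: the `Replica` class, the `sessionRead` method, and the LSN values are hypothetical names and numbers chosen for illustration. It only shows the idea that a session read must be served by a replica that has caught up to the client's session token, and that every lagging replica it hits first costs one extra attempt.

```java
import java.util.Arrays;
import java.util.List;

public class SessionReadSketch {
    /** Hypothetical replica that has applied writes up to a given log sequence number (LSN). */
    public static final class Replica {
        public final String name;
        public final long appliedLsn;
        public Replica(String name, long appliedLsn) {
            this.name = name;
            this.appliedLsn = appliedLsn;
        }
    }

    /** Which replica served the read, and how many attempts it took. */
    public static final class ReadResult {
        public final String servedBy;
        public final int attempts;
        public ReadResult(String servedBy, int attempts) {
            this.servedBy = servedBy;
            this.attempts = attempts;
        }
    }

    /**
     * Toy session read: try replicas in order and accept the first one whose
     * applied LSN is at least the session token. Each rejected (lagging)
     * replica adds one attempt, which stands in for an extra round trip.
     */
    public static ReadResult sessionRead(List<Replica> replicas, long sessionTokenLsn) {
        int attempts = 0;
        for (Replica r : replicas) {
            attempts++;
            if (r.appliedLsn >= sessionTokenLsn) {
                return new ReadResult(r.name, attempts);
            }
        }
        throw new IllegalStateException("no replica has caught up to LSN " + sessionTokenLsn);
    }

    public static void main(String[] args) {
        // The client just wrote at LSN 10, so its session token is 10.
        List<Replica> replicas = Arrays.asList(
                new Replica("lagging", 9),   // has not applied the latest write yet
                new Replica("caught-up", 10));
        ReadResult result = sessionRead(replicas, 10);
        System.out.println(result.servedBy + " after " + result.attempts + " attempt(s)");
    }
}
```

In this model, a read that first lands on the lagging replica takes two attempts instead of one. Real diagnostics from the SDK would be needed to confirm whether this is what happens in the benchmark.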

If the VM is in the same region as the Cosmos DB endpoint, the latency seems a bit high, though. Assuming you are running in Direct mode and the machine has enough resources, the P95 for Read operations should be much lower.
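
As a sanity check on the numbers in the question, the short sketch below redoes the arithmetic. It assumes each YCSB thread issues one operation at a time (a closed loop) and that the posted output comes from a single client with 50 threads. The class and method names are made up for this example, and the latencies are rounded from the posted output.

```java
public class YcsbLatencyCheck {
    /** Throughput (ops/sec) of a closed-loop client: threads divided by average latency in seconds. */
    public static double closedLoopThroughput(int threads, double avgLatencyMicros) {
        return threads / (avgLatencyMicros / 1_000_000.0);
    }

    public static void main(String[] args) {
        // Average latencies in microseconds, rounded from the YCSB output in the question.
        double strongRead = 201056.0, strongUpdate = 97547.1, strongRmw = 298607.4;
        double sessionRead = 28163.6, sessionUpdate = 193093.3, sessionRmw = 221261.3;
        int threads = 50; // threads per client, as stated in the question

        // Read latency plus update latency should be close to the read-modify-write latency.
        System.out.printf("STRONG  read+update = %.0f us, reported RMW = %.0f us%n",
                strongRead + strongUpdate, strongRmw);
        System.out.printf("SESSION read+update = %.0f us, reported RMW = %.0f us%n",
                sessionRead + sessionUpdate, sessionRmw);

        // Throughput implied by 50 threads each waiting one RMW latency per operation.
        System.out.printf("STRONG  implied throughput = %.1f ops/sec (reported 166.5)%n",
                closedLoopThroughput(threads, strongRmw));
        System.out.printf("SESSION implied throughput = %.1f ops/sec (reported 222.1)%n",
                closedLoopThroughput(threads, sessionRmw));
    }
}
```

Under these assumptions, the read and update averages add up to the read-modify-write average in both runs, and the implied throughput (roughly 167 and 226 ops/sec) is close to what YCSB reported. That suggests the overall throughput in this test is limited by per-operation latency, which is why the diagnostics matter.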