azureazure-cosmosdbazure-cosmosdb-sqlapi

Azure CosmosDB query using PartitionKey - inconsistent `Retrieved document count`


I am experiencing inconsistent Retrieved document count while I am querying Azure CosmosDB records, using different PartitionKey.

I have a table with PartitionKey=/ParitionKey.

In my dev environment, the records are distributed only between two different partition key values:

enter image description here

When I query the records using the two Partion key values, I do get a different Retrieved document count, implicitly different RU.

For one partition key (the one with 99% distribution), CosmosDB correctly matches only 1 document, making it efficient from a RU perspective.

For the same query, but using a different partition key, CosmosDB navigates trough all the documents, which also results in a higher (and inneficient) RU usage.

SELECT top 1 * FROM c
where c.PartitionKey = 'PARTITION_KEY_VALUE'

What can be the reason for this inconsistent behavior? enter image description here

enter image description here


Solution

  • Question Summary

    You say

    the records are distributed only between two different partition key values:

    • 1b871081-7f5d-414d-9743-afbf27236d3b - 108 records
    • 5744b90c-208a-4d72-9675-ec5906afef80 - 6 records

    The query stats show that when you do

    SELECT top 1 * 
    FROM c
    where c.PartitionKey = '1b871081-7f5d-414d-9743-afbf27236d3b'
    

    The "Retrieved document count" is 1 but when you do

    SELECT top 1 * 
    FROM c
    where c.PartitionKey = '5744b90c-208a-4d72-9675-ec5906afef80'
    

    The "Retrieved document count" is 109

    Explanation

    This pattern would indicate that it is not using an index seek on PartitionKey it is just scanning through documents and returning the first one that matches the predicate, it just so happens that it happens to scan all the documents in logical partition 1b871081-7f5d-414d-9743-afbf27236d3b first (presumably you inserted these in a contiguous batch).

    The partition key is not indexed by default when using an exclude rootpath strategy. The only default indexed paths with consistent indexing are id and _ts

    You should add

     {
         "path": "/partitionKey/?"
     }
    

    to your includedPaths section in that case.

    Regarding the fact that it is apparently processing each logical partition in turn I think this clumping is just an artefact of how you loaded the data originally.

    e.g. In the below for a collection hosted on a single physical partition I see a default ordering that does not return all documents from one partitionKey and then move onto the next, the ordering is in fact correlated to the original insert order (I used sequential id numbering when inserting to make this easy to see).

    enter image description here