
How to choose a good partition key for Azure Cosmos DB


I'm new to Azure Cosmos DB, and I want to get a clear understanding of:

  1. What is the partition key?

My understanding is shallow for now: items with the same partition key go to the same partition for storage, which could improve load balancing as the system grows.

  2. How to decide on a good partition key? Could somebody please provide an example?

Thanks a lot!


Solution

  • 1. What is the partition key?

    In Azure Cosmos DB, there are two kinds of partitions: physical partitions and logical partitions.

    A. A physical partition is a fixed amount of reserved SSD-backed storage combined with a variable amount of compute resources.

    B. A logical partition is a partition within a physical partition that stores all the data associated with a single partition key value.

    I think the partition key you mentioned is what defines the logical partitions: all items that share a partition key value form one logical partition. The partition key therefore gives Azure Cosmos DB a natural boundary for distributing data across physical partitions. For more details, see "How does partitioning work".
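
To make the logical/physical distinction concrete, here is a minimal sketch in plain Python. It is only a toy model: the hash function, the count of four physical partitions, and the names `physical_partition_for` and `PHYSICAL_PARTITIONS` are assumptions made for illustration. The hashing and placement that Azure Cosmos DB actually uses are internal to the service.

```python
import hashlib
from collections import defaultdict

# Assumed number of physical partitions; the real service manages this itself.
PHYSICAL_PARTITIONS = 4


def physical_partition_for(key_value: str, n: int = PHYSICAL_PARTITIONS) -> int:
    """Toy stand-in for the service's hash: the same key value always maps to the same slot."""
    digest = hashlib.md5(key_value.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % n


items = [
    {"id": "1", "color": "red"},
    {"id": "2", "color": "blue"},
    {"id": "3", "color": "red"},
    {"id": "4", "color": "green"},
]

# One logical partition per distinct partition key value (here: color).
logical_partitions = defaultdict(list)
for item in items:
    logical_partitions[item["color"]].append(item["id"])

# Each logical partition is placed, as a whole, on one physical partition.
placement = {key: physical_partition_for(key) for key in logical_partitions}

for key, ids in logical_partitions.items():
    print(f"logical partition {key!r}: items {ids} -> physical partition {placement[key]}")
```

In this model, the two red items always land together, and several logical partitions may share one physical partition.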

    2. How to decide on a good partition key? Could somebody please provide an example?

    You need to pick a property that has a wide range of values and even access patterns. An ideal partition key is one that appears frequently as a filter in your queries and has sufficient cardinality to ensure your solution is scalable.

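    For example, suppose your items have the fields id and color, and most of your queries filter on color. Picking color as the partition key is then more efficient for those queries than picking id: every item has a different id, but many items share a color, so a query that filters on color can be served from a single logical partition instead of fanning out across partitions. This works as long as color takes a wide range of values and no single color dominates. It also scales: when you add a new color, it simply becomes a new logical partition.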
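
The trade-off in the id/color example can be checked on sample data before committing to a key. The sketch below is an offline illustration only: the sample items and the helper names `key_stats` and `logical_partitions_with_matches` are hypothetical, and nothing here calls Azure Cosmos DB.

```python
from collections import Counter

items = [
    {"id": "1", "color": "red"},
    {"id": "2", "color": "blue"},
    {"id": "3", "color": "red"},
    {"id": "4", "color": "green"},
    {"id": "5", "color": "blue"},
    {"id": "6", "color": "red"},
]


def key_stats(items, prop):
    """Return (number of distinct values, share of items in the largest logical partition)."""
    counts = Counter(item[prop] for item in items)
    return len(counts), max(counts.values()) / len(items)


def logical_partitions_with_matches(items, partition_prop, filter_prop, filter_value):
    """Count the logical partitions that hold at least one item matching the filter."""
    return len({item[partition_prop] for item in items if item[filter_prop] == filter_value})


for candidate in ("id", "color"):
    distinct, largest_share = key_stats(items, candidate)
    touched = logical_partitions_with_matches(items, candidate, "color", "red")
    print(f"{candidate}: {distinct} distinct values, "
          f"largest partition holds {largest_share:.0%} of items, "
          f"color='red' matches sit in {touched} logical partition(s)")
```

With id as the key, the data is spread as evenly as possible, but the red items are scattered over three logical partitions. With color as the key, they sit together in one, at the cost of half the sample items sharing a single logical partition. On real data, the same two numbers (cardinality and largest share) give a quick sanity check that no single value dominates.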

    For more details, please read "Partition and scale in Azure Cosmos DB".

    Hope it helps you.