DynamoDB Hot Partitions: Key Prefixes and Overloaded GSIs


I'm trying to implement a single table design in DynamoDB and I've seen recommendations to use prefixes for partition keys to identify entities. For example, a car's record might have a partition key formatted as CAR#<GUID>.

Does using a static prefix like this lead to a hot partition? Specifically, if I have a large number of records (e.g., 500k cars), will all records with the prefix CAR# end up in the same partition, potentially affecting performance?

Additionally, if I create a Global Secondary Index (GSI) using an entity type as the partition key, would this setup also create a hot partition on that GSI? What is a good way to fetch all records of a specific entity type without performing a Scan operation?

I am concerned that using a consistent prefix for partition keys (like CAR#) could lead to hot partitions when scaled up. I would appreciate any insights or recommendations on best practices for avoiding this issue if it is one, particularly with regard to GSIs as well.

Primary Table

PK            SK (entity)   Attr
CAR#12345     Car           Record for Car 12345
CAR#67890     Car           Record for Car 67890
CAR#11111     Car           Record for Car 11111
TRUCK#22222   Truck         Record for Truck 22222

GSI Table

GSI PK   GSI SK        Description
Car      CAR#12345     GSI Record for Type Car 12345
Truck    TRUCK#22222   GSI Record for Type Truck 22222

Solution

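  • While CAR#<GUID> will not cause a hot partition, your index on the entity type might. DynamoDB hashes the entire partition key value, so keys that share a prefix but differ in the GUID are spread across partitions. In the GSI, however, every car shares the same partition key value (Car), so all of those items land in the same item collection. Each DynamoDB partition can handle up to 1000 WCU and 3000 RCU.

    If you need more capacity for your index, you can shard your category types, such as CAR#1, CAR#2, etc., by appending a random number between 1 and N to the category of each item. To fetch every record of a type, query each of the N shard keys and merge the results.

    N is the write throughput you expect per category divided by 1000, rounded up.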
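
    As a rough illustration of the sharding idea above, here is a minimal Python sketch. The function names (shard_count, sharded_gsi_pk, all_shard_keys) and the constant are hypothetical helpers made up for this example, not part of any AWS SDK, and the sketch assumes the 1000 WCU per-partition figure mentioned above.

```python
import math
import random

# Assumption: per-partition write limit taken from the answer above.
PARTITION_WCU_LIMIT = 1000


def shard_count(expected_wcu):
    """N = expected write throughput per category / 1000, rounded up (at least 1)."""
    return max(1, math.ceil(expected_wcu / PARTITION_WCU_LIMIT))


def sharded_gsi_pk(entity_type, n_shards):
    """Pick a random shard in 1..N and build the GSI partition key for a new item."""
    shard = random.randint(1, n_shards)  # randint is inclusive on both ends
    return f"{entity_type}#{shard}"


def all_shard_keys(entity_type, n_shards):
    """Every GSI partition key that must be queried to read a whole category."""
    return [f"{entity_type}#{i}" for i in range(1, n_shards + 1)]


if __name__ == "__main__":
    n = shard_count(3500)            # 3500 expected WCU -> 4 shards
    print(n)
    print(sharded_gsi_pk("CAR", n))  # one of CAR#1 .. CAR#4, chosen at random
    print(all_shard_keys("CAR", n))  # ['CAR#1', 'CAR#2', 'CAR#3', 'CAR#4']
```

    On write, you would store the value from sharded_gsi_pk as the item's GSI partition key. On read, you would run one Query per key returned by all_shard_keys (in parallel if you like) and merge the results client-side. That costs N queries instead of one, but it avoids both the Scan and the single hot GSI partition.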