amazon-dynamodb

DynamoDB GSI: All attributes projection vs keys-only with multiple read


Which is more cost-effective? Say I have the following table:

CompanyId [Hash Key]
ItemId [Range]
Last_modified
[...other attributes]

I've also created a GSI where CompanyId is my [Hash Key] and Last_modified is my [Range] so that I can query based on Last_modified date.

In my use case the GSI is just another access path to the same data: I need all attributes from every GSI query, and I run about as many queries against the GSI as against the main table.

So ... which is more cost-effective:

  1. Set attribute projection to All? --> higher storage cost, but fewer reads
  2. Set attribute projection to Keys-only and, for each key, do another read against the main table to get all attributes? --> lower storage cost, but more reads

Solution

  • There is no straightforward answer: it depends on your ratio of writes to reads, the size of the other attributes, how long the data is stored, and whether you are provisioning capacity.

    Broadly speaking, storage is very cheap and writes are more expensive than reads in DynamoDB (at least 5x, or 10x for eventually consistent reads in normal tables). A write unit also covers only 1 KB, while a read unit covers 4 KB. When the projected attributes exceed 1 KB, each write consumes additional write request units to keep the GSI up to date, so if you have large items and a ratio of writes to reads higher than about 1:10, updating the index becomes a significant cost.

    If you have far more than 10x as many reads as writes, then projecting all attributes into the index will be cheaper, and the more so the smaller your items.
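To make the trade-off concrete, here is a rough cost sketch in Python. It is only a model under stated assumptions: writes cost 5x reads (the ratio mentioned above, not a quoted price), every write touches the index once, GSI queries and the follow-up gets are eventually consistent, and storage and the base-table write cost (identical in both options) are ignored. The function and parameter names are made up for illustration.

```python
import math

WRITE_UNIT_BYTES = 1024  # one write unit covers up to 1 KB
READ_UNIT_BYTES = 4096   # one read unit covers up to 4 KB


def write_units(item_bytes):
    """Write units needed to write one index entry of the given size."""
    return max(1, math.ceil(item_bytes / WRITE_UNIT_BYTES))


def read_units(total_bytes):
    """Eventually consistent read units for one request (half price)."""
    return max(1, math.ceil(total_bytes / READ_UNIT_BYTES)) / 2


def projection_costs(item_bytes, key_bytes, items_per_query, writes, queries,
                     write_price=5.0, read_price=1.0):
    """Relative cost of the GSI under ALL vs KEYS_ONLY projection.

    Prices are relative (assumption: a write unit costs 5x a read unit).
    """
    # ALL: the index stores full items, and one Query returns everything.
    # A Query is billed on the summed size of the items it reads.
    all_cost = (
        writes * write_units(item_bytes) * write_price
        + queries * read_units(items_per_query * item_bytes) * read_price
    )
    # KEYS_ONLY: cheap index writes and a cheap Query, but every returned
    # key needs its own read on the main table, rounded up per item.
    reads_per_query = (
        read_units(items_per_query * key_bytes)
        + items_per_query * read_units(item_bytes)
    )
    keys_cost = (
        writes * write_units(key_bytes) * write_price
        + queries * reads_per_query * read_price
    )
    return {"ALL": all_cost, "KEYS_ONLY": keys_cost}


if __name__ == "__main__":
    # Hypothetical workload: 2 KB items, 100-byte keys, 20 items per query.
    print(projection_costs(2048, 100, 20, writes=1000, queries=1000))
    print(projection_costs(2048, 100, 20, writes=10000, queries=1000))
```

With these made-up numbers, ALL comes out slightly cheaper when writes and queries are equal, and KEYS_ONLY wins clearly once there are ten writes per query. Plug in your own item sizes and ratios before drawing conclusions.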

    You can set ReturnConsumedCapacity on read and write requests to see the capacity units each operation actually consumes; with the value INDEXES, the response also breaks out the units consumed updating each index.
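As a minimal sketch of that measurement, the helper below issues a PutItem with ReturnConsumedCapacity set to INDEXES and splits the result into table and index units. It assumes a low-level boto3 DynamoDB client (for example one created with `boto3.client("dynamodb")`) is passed in; the function names and the table name in the usage comment are hypothetical.

```python
def summarize_consumed(consumed):
    """Split a ConsumedCapacity structure into table and per-index units."""
    indexes = consumed.get("GlobalSecondaryIndexes", {})
    return {
        "total": consumed.get("CapacityUnits", 0.0),
        "table": consumed.get("Table", {}).get("CapacityUnits", 0.0),
        "indexes": {
            name: detail.get("CapacityUnits", 0.0)
            for name, detail in indexes.items()
        },
    }


def put_and_measure(client, table_name, item):
    """Write one item and report the capacity it consumed.

    `client` is assumed to be a boto3 low-level DynamoDB client and
    `item` a dict in DynamoDB attribute-value format.
    """
    response = client.put_item(
        TableName=table_name,
        Item=item,
        ReturnConsumedCapacity="INDEXES",  # include per-index breakdown
    )
    return summarize_consumed(response["ConsumedCapacity"])


# Usage (hypothetical table name, requires AWS credentials):
#   import boto3
#   client = boto3.client("dynamodb")
#   put_and_measure(client, "Items", {
#       "CompanyId": {"S": "acme"},
#       "ItemId": {"S": "item-1"},
#       "Last_modified": {"S": "2024-01-01T00:00:00Z"},
#   })
```

Running the same write against a test table with each projection setting shows directly how many extra write units the index costs for your item sizes.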

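    If you store just the IDs, note that BatchGetItem can fetch up to 100 items in one request, which is much faster than getting them individually. It does not reduce the read cost, though: each item in the batch is billed like a separate GetItem.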
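The keys-only approach could look roughly like the sketch below: query the GSI for keys, then fetch the full items from the main table in batches of up to 100. It assumes a low-level boto3 DynamoDB client is passed in, that CompanyId, ItemId and Last_modified are all stored as strings, and that the table and index names are supplied by the caller; none of that is stated in the question. The retry loop has no backoff, which real code should add.

```python
def query_index_keys(client, table, index, company_id, since):
    """Query the keys-only GSI and return main-table keys of the matches."""
    kwargs = {
        "TableName": table,
        "IndexName": index,
        "KeyConditionExpression": "CompanyId = :c AND Last_modified >= :since",
        "ExpressionAttributeValues": {
            ":c": {"S": company_id},   # assumption: string attribute
            ":since": {"S": since},    # assumption: sortable string timestamp
        },
    }
    keys = []
    while True:
        page = client.query(**kwargs)
        for entry in page["Items"]:
            # Keep only the main table's key attributes for the follow-up read.
            keys.append({"CompanyId": entry["CompanyId"],
                         "ItemId": entry["ItemId"]})
        if "LastEvaluatedKey" not in page:
            return keys
        kwargs["ExclusiveStartKey"] = page["LastEvaluatedKey"]


def batch_get_all(client, table, keys, batch_size=100):
    """Fetch full items for the given keys, at most 100 keys per request."""
    items = []
    for start in range(0, len(keys), batch_size):
        pending = keys[start:start + batch_size]
        while pending:
            response = client.batch_get_item(
                RequestItems={table: {"Keys": pending}}
            )
            items.extend(response.get("Responses", {}).get(table, []))
            # Retry whatever DynamoDB could not process in this call.
            unprocessed = response.get("UnprocessedKeys", {})
            pending = unprocessed.get(table, {}).get("Keys", [])
    return items
```

BatchGetItem does not guarantee the order of returned items, so if the Last_modified ordering from the query matters, re-sort the results afterwards.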