
DynamoDB scan on secondary index (GSI)


I was reading the documentation on Scan and it prefaces with:

The Scan operation returns one or more items and item attributes by accessing every item in a table or a secondary index.

It made me wonder: under what circumstances would scanning a secondary index return a different set of records than scanning the plain table?

Scan does not support KeyConditionExpression, only FilterExpression, which is applied after the data has been retrieved.

So what would be the implication of scanning a GSI vs the table?


Solution

  • A scan may return different results on a GSI compared to a base table because a GSI can be sparse.

    A GSI has a different key schema from the base table. An item is written to the GSI only if the GSI's key attributes are present on that item; otherwise it is omitted, so a GSI can contain fewer items than the base table.

    Let's say a base table has partition key A and sort key B, and a GSI on that table has partition key C and no sort key. If an item only has values for attributes A and B, but not C, that item will not appear in the GSI.

    EDIT: The example AWS use in the link I provided is an LSI. I used an example of a GSI as it was in the question. The principle is the same.
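
To make the sparse-index behaviour concrete, here is a minimal sketch in plain Python. It does not call DynamoDB; `scan_table` and `scan_gsi` are hypothetical helpers that only model which items each scan would see, using the A/B/C attributes from the example above.

```python
from typing import Any

Item = dict[str, Any]


def scan_table(items: list[Item]) -> list[Item]:
    """Model of scanning the base table: every item is returned."""
    return list(items)


def scan_gsi(items: list[Item], index_keys: tuple[str, ...]) -> list[Item]:
    """Model of scanning a GSI: only items that have every index key attribute."""
    return [item for item in items if all(key in item for key in index_keys)]


# Base table keyed on A (partition key) and B (sort key).
# The hypothetical GSI is keyed on C only.
base_table = [
    {"A": "a1", "B": "b1", "C": "c1"},
    {"A": "a1", "B": "b2"},  # no C, so this item is absent from the GSI
    {"A": "a2", "B": "b1", "C": "c2"},
]

table_result = scan_table(base_table)
gsi_result = scan_gsi(base_table, index_keys=("C",))

print(len(table_result))  # 3
print(len(gsi_result))    # 2
```

Against a real table, the equivalent comparison with boto3 would be a `scan` call without `IndexName` versus one with `IndexName` set to the GSI's name.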