amazon-dynamodb, amazon-dynamodb-streams

Do DynamoDB Streams using Lambda Triggers guarantee no duplicates?


I'm looking into the best approach for consuming a stream from our DynamoDB tables, and there are some conflicting answers and documentation over whether using DynamoDB Streams guarantees no duplicates.

The main DynamoDB Streams doc page has a table that says, for DynamoDB Streams:

No duplicate records appear in the stream.

However when you get to the Streams with Lambda Best Practices doc page it says:

A Lambda consumer for a DynamoDB stream doesn't guarantee exactly once delivery and may lead to occasional duplicates. Make sure your Lambda function code is idempotent to prevent unexpected issues from arising because of duplicate processing.

The other implementation for DynamoDB streams uses the Kinesis Adapter, but that doc page doesn't mention duplication at all.

Is there some sort of duplication happening between the stream and the Lambda trigger, or is one of these pages just outdated?


Solution

  • This blog post should answer your question in detail: https://aws.amazon.com/blogs/database/build-scalable-event-driven-architectures-with-amazon-dynamodb-and-aws-lambda/

    In short, DynamoDB writes each change to the stream exactly once, but nothing prevents Lambda from processing the same batch more than once.

    To handle duplicate events, you can use Powertools for AWS Lambda, which provides an idempotency utility.

    https://docs.powertools.aws.dev/lambda/python/latest/utilities/idempotency/
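To illustrate the idea behind that idempotency utility, here is a minimal sketch in plain Python (not the Powertools API itself). It deduplicates on each stream record's `eventID`, which is unique per record; the `processed_ids` set is a hypothetical stand-in for a durable store such as a DynamoDB idempotency table written with a conditional put.

```python
# Idempotent handling of a DynamoDB Streams batch: skip any record
# whose eventID has already been processed, so a redelivered batch
# produces no duplicate side effects.

processed_ids = set()  # stand-in for a durable idempotency table


def process_record(record):
    """Business logic for one stream record (placeholder: return the key)."""
    return record["dynamodb"]["Keys"]["pk"]["S"]


def handler(event, context=None):
    results = []
    for record in event["Records"]:
        event_id = record["eventID"]  # unique per stream record
        if event_id in processed_ids:
            continue  # duplicate delivery: skip reprocessing
        results.append(process_record(record))
        processed_ids.add(event_id)
    return results
```

Invoking the handler twice with the same batch processes each record only once; in production you would replace the in-memory set with a conditional write to a persistence layer, which is exactly what the Powertools `idempotent` decorator manages for you.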