python amazon-web-services aws-lambda amazon-kinesis dms

AWS DMS inserts duplicate records into Kinesis and S3


I have configured a DMS replication instance that replicates data from MySQL into an AWS Kinesis data stream, but I noticed that when I process the Kinesis records I pick up duplicate records. This does not happen for every record.

How do I prevent these duplicate records from being pushed to the Kinesis data stream or the S3 bucket?

I'm using a Lambda function to process the records, so I thought of adding logic to de-duplicate the data, but I'm not sure how to do that without persisting the data somewhere. I need to process the data in real time, so persisting the data would not be ideal.

Regards Pragesan


Solution

  • I added a global counter variable that stores the primary key (pk) of the last record processed, so each invocation checks the previous pk value and inserts the record only if its pk is different.
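
The approach above might look roughly like the following. This is only a minimal sketch, not the exact code I used: the primary key column name `id`, the `insert_record` helper, and the assumption that each Kinesis payload is a JSON document with the row under a `data` key are all placeholders to adapt to your own table and DMS settings.

```python
import base64
import json

# Module-level state. It is kept between invocations only while Lambda
# reuses the same execution environment (a "warm" container).
_last_pk = None

PK_FIELD = "id"  # hypothetical primary key column name


def insert_record(row):
    """Hypothetical placeholder for the real downstream write."""
    print("insert", row)


def handler(event, context):
    """Skip a record when its pk equals the pk of the previous record."""
    global _last_pk
    inserted = []
    for record in event["Records"]:
        # Kinesis delivers the payload to Lambda base64-encoded.
        payload = json.loads(base64.b64decode(record["kinesis"]["data"]))
        row = payload.get("data") or {}  # assumed location of the row
        pk = row.get(PK_FIELD)
        if pk is None:
            continue  # no row data (or no pk), nothing to insert
        if pk == _last_pk:
            continue  # same pk as the previous record: treat as duplicate
        _last_pk = pk
        insert_record(row)
        inserted.append(pk)
    return inserted
```

Note that this only catches duplicates that arrive back to back in the same execution environment. The variable is reset on a cold start and is not shared between concurrent instances of the function, and a legitimate second change to the same row would also be skipped, so it is a best-effort filter rather than a guarantee.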