amazon-web-services aws-dms

Amazon DMS task with custom rules fails when sinking to Kinesis


I'm trying to listen to Aurora DB changes using Amazon DMS and push the changes to a Kinesis stream, where a Lambda function listening to the stream will do the processing.
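For context, the consuming side I have in mind looks roughly like the sketch below. It assumes the standard Kinesis-to-Lambda event shape (base64-encoded `data` plus `partitionKey` under `kinesis`) and assumes each DMS message is a JSON object with `data` and `metadata` keys; the function name `handler` and the returned dictionary layout are only illustrative.

```python
import base64
import json


def handler(event, context):
    """Decode each Kinesis record and return the parsed DMS change messages."""
    changes = []
    for record in event.get("Records", []):
        # Kinesis payloads arrive base64-encoded inside the Lambda event.
        payload = base64.b64decode(record["kinesis"]["data"])
        message = json.loads(payload)
        changes.append({
            "partition_key": record["kinesis"]["partitionKey"],
            # "data" and "metadata" are assumed keys of the DMS JSON message.
            "row": message.get("data"),
            "operation": message.get("metadata", {}).get("operation"),
        })
    return changes
```

Returning the partition key next to each row makes it easy to see which key DMS actually assigned to a change.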

I used the following documentation to write my rules:

https://docs.aws.amazon.com/dms/latest/userguide/CHAP_Target.Kinesis.html
https://aws.amazon.com/blogs/database/use-the-aws-database-migration-service-to-stream-change-data-to-amazon-kinesis-data-streams/

Here is my rule mapping for the DMS Ongoing Replication (CDC) Task.

{
    "rules": [
        {
            "rule-type": "selection",
            "rule-id": "1",
            "rule-name": "1",
            "object-locator": {
                "schema-name": "my_db",
                "table-name": "my_table"
            },
            "rule-action": "include"
        },
        {
            "rule-type": "object-mapping",
            "rule-id": "2",
            "rule-name": "2",
            "rule-action": "map-record-to-record",
            "object-locator": {
                "schema-name": "my_db",
                "table-name": "my_table"
            },
            "mapping-parameters": {
                "partition-key": {
                    "attribute-name": "my_id",
                    "value": "${my_id}"
                }
            }
        }
    ]
}

However, when I make a change in the source table, the DMS task fails with the following errors.

2019-02-05T10:36:55 [TARGET_APPLY ]E: Error allocating memory for Json document [1020100] (field_mapping_utils.c:382)
2019-02-05T10:36:55 [TARGET_APPLY ]E: Failed while looking for object mapping for table my_table [1020100] (kinesis_utils.c:258)
2019-02-05T10:36:55 [TARGET_APPLY ]E: Error executing data handler [1020100] (streamcomponent.c:1778)
2019-02-05T10:36:55 [TASK_MANAGER ]E: Stream component failed at subtask 0, component st_0_some_random_id [1020100] (subtask.c:1366)
2019-02-05T10:36:55 [TASK_MANAGER ]E: Task error notification received from subtask 0, thread 1 [1020100] (replicationtask.c:2661)
2019-02-05T10:36:55 [TASK_MANAGER ]W: Task 'some_random_task_id' encountered a fatal error (repository.c:4704)

When I try without the object-mapping rule, Kinesis gets a record with the correct values and "partitionKey": "my_db.my_table", which is the default behavior for the table-to-table sink, but we need a table-to-kinesis sink.

Why do I care about the partition-key this much? Because I need to utilize all the shards in my Kinesis stream.
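To illustrate why a constant partition key is a problem: Kinesis hashes the partition key with MD5 into a 128-bit integer and routes the record to the shard whose hash key range contains that value. The sketch below is a simplified local model, not a call to the Kinesis API; it assumes a stream whose shards split the key space evenly, and the helpers `shard_ranges` and `shard_for` are hypothetical names.

```python
import hashlib


def shard_ranges(shard_count):
    """Split the 128-bit hash key space evenly (assumes uniform shards)."""
    space = 2 ** 128
    size = space // shard_count
    ranges = []
    for i in range(shard_count):
        start = i * size
        # The last shard absorbs any remainder of the key space.
        end = space - 1 if i == shard_count - 1 else (i + 1) * size - 1
        ranges.append((start, end))
    return ranges


def shard_for(partition_key, ranges):
    """Return the index of the shard whose range contains the key's MD5 hash."""
    digest = hashlib.md5(partition_key.encode("utf-8")).digest()
    hashed = int.from_bytes(digest, "big")
    for index, (start, end) in enumerate(ranges):
        if start <= hashed <= end:
            return index
    raise ValueError("hash outside the key space")


ranges = shard_ranges(4)
# Every change carries the same key, as with the default schema-table behavior.
constant_key = {shard_for("my_db.my_table", ranges) for _ in range(1000)}
# Each change carries its own row id as the key, which is what I want.
per_row_key = {shard_for(str(my_id), ranges) for my_id in range(1000)}
print(len(constant_key), len(per_row_key))
```

With the constant key, every record lands on a single shard no matter how many shards the stream has, while per-row keys spread across them.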

Can someone help me?

UPDATE:

When I add "partition-key-type": "schema-table" to the "mapping-parameters", the task doesn't fail, but it ignores the "partition-key" attribute and still produces "partitionKey": "my_db.my_table" as before.

Uncertain points:

  1. For table-to-table sinking, the docs use "partition-key-type": "schema-table", but they never mention what the value should be for table-to-kinesis.
  2. The samples and explanations in the docs are very limited and even faulty (i.e. some of the rule JSON samples are invalid).
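Since some of the doc samples do not even parse, a quick local check before pasting a mapping into the task can save a round trip. This is only a sketch: the list of expected keys is an assumption taken from the rules in this post, not an official DMS schema, and `check_mapping` is a hypothetical helper.

```python
import json

# Keys that every rule in this post's mapping carries (an assumption, not a DMS schema).
EXPECTED_KEYS = ("rule-type", "rule-id", "rule-name", "rule-action", "object-locator")


def check_mapping(text):
    """Return a list of problems found in a table-mapping JSON document."""
    try:
        doc = json.loads(text)
    except json.JSONDecodeError as err:
        return [f"invalid JSON at line {err.lineno}, column {err.colno}: {err.msg}"]
    if not isinstance(doc, dict) or not isinstance(doc.get("rules"), list):
        return ["top level must be an object with a 'rules' list"]
    problems = []
    for i, rule in enumerate(doc["rules"]):
        for key in EXPECTED_KEYS:
            if key not in rule:
                problems.append(f"rule {i}: missing {key!r}")
    return problems
```

An empty list means the document parses and every rule carries the keys listed above; it says nothing about whether DMS will accept the mapping.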

Solution

  • So, I'm answering my own question here.

    We got in touch with the AWS Support team, and they said it's an issue on their side and that the documentation doesn't reflect the actual functionality. They also raised an internal ticket to get it fixed in the future.

    For now, since DMS cannot meet our requirements, we decided to move to a different solution.