
DynamoDB not as scalable as it seems?


It is well known that DynamoDB is highly scalable. Let's take a look at two scenarios.

Scenario 1: I create a DynamoDB table with on-demand mode

Scenario 2: I create a DynamoDB table with provisioned mode (1500 RCUs and 500 WCUs)

Now let's say an extremely sharp write traffic spike happens: 25,000 write operations in 1 second.

Regarding scenario 1: when an on-demand table is created, only 4 partitions are automatically created (or a number of partitions close to that value). Therefore, even if the PutItem API calls use different, evenly distributed partition keys, the clients will get throttling errors. (DynamoDB has to split partitions to cope with the load, which takes some time, so the clients get throttling errors in the meantime.)

Regarding scenario 2: there is only 1 partition, so all the PutItem calls will store data in that partition. DynamoDB will not be able to cope with the load; even with burst capacity, the hard limit of a partition is 1000 WCUs. I would have to increase the table's WCUs.

Therefore, is it correct to say that:

When creating a DynamoDB table using on-demand mode, if there is an extremely sharp traffic spike (of PutItem calls, for instance), DynamoDB is not able to accept all the PutItem requests; it has to split partitions to get more WCUs, and in the meantime it responds with throttling errors.

Also, how long does it take for DynamoDB to split a partition? (Just so I can get an idea of how long a client has to wait with throttling errors.)


Solution

  • Trust me, DynamoDB is as scalable as it seems. Let me help with your concerns:

    Scenario 1

    You create an on-demand table, which gives you 4 partitions to serve your data. Each partition is capable of handling 1000 WCU and 3000 RCU, so for a new on-demand table with evenly distributed writes, you will see throttling if you try to exceed 4000 WCU instantly. For this reason, we have documented how to pre-warm your on-demand table to ensure it meets the scale you require out of the box.
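
    While the table is scaling, those throttles surface in the client as retryable errors. Here is a minimal sketch, assuming boto3 and a hypothetical table named "events" with partition key "pk", of how a writer might back off and retry during such a spike; it is an illustration, not documented guidance:

    ```python
    import time

    import boto3
    from botocore.exceptions import ClientError

    dynamodb = boto3.client("dynamodb")

    # Error codes DynamoDB returns when a request is throttled.
    THROTTLE_CODES = {"ProvisionedThroughputExceededException", "ThrottlingException"}

    def put_with_backoff(item, retries=5):
        """Write one item, backing off exponentially while DynamoDB throttles
        (e.g. while an on-demand table is still splitting partitions)."""
        for attempt in range(retries):
            try:
                dynamodb.put_item(TableName="events", Item=item)  # "events" is illustrative
                return
            except ClientError as err:
                if err.response["Error"]["Code"] not in THROTTLE_CODES:
                    raise
                time.sleep(0.05 * 2 ** attempt)  # 50 ms, 100 ms, 200 ms, ...
        raise RuntimeError("still throttled after retries")

    put_with_backoff({"pk": {"S": "user#123"}, "ts": {"N": "1700000000"}})
    ```

    Note that the AWS SDKs already retry throttled requests automatically by default; the explicit loop just makes visible what a client experiences during the spike.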

    Pre-warming

    With on-demand capacity mode, requests can burst up to double the previous peak on the table. Note that throttling can occur if requests spike to more than double the default capacity or the previously achieved peak request rate within 30 minutes. For example, a table that previously peaked at 10,000 WCU can absorb a spike up to roughly 20,000 WCU, but not the 25,000 writes per second from your example. One solution is to pre-warm the table to the anticipated peak capacity of the spike.

    To pre-warm the table, follow the steps outlined in this blog (a programmatic sketch follows the steps below):

    Set a warm throughput value by using the AWS Management Console:

    1. Navigate to the DynamoDB console and choose Create table.
    2. Specify your table’s primary key attributes.
    3. Under Table settings, select Customize settings.
    4. For Read/write capacity settings, choose On-demand.
    5. Under Warm Throughput, input your anticipated maximum read and write request units.
    6. Complete the table creation process.
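
    If you prefer to do this programmatically, here is a minimal sketch assuming a boto3 version recent enough to expose the WarmThroughput parameter on CreateTable; the table name and key schema are illustrative:

    ```python
    import boto3

    dynamodb = boto3.client("dynamodb")

    dynamodb.create_table(
        TableName="events",  # hypothetical table name
        AttributeDefinitions=[{"AttributeName": "pk", "AttributeType": "S"}],
        KeySchema=[{"AttributeName": "pk", "KeyType": "HASH"}],
        BillingMode="PAY_PER_REQUEST",  # on-demand mode
        # Pre-warm the table to the anticipated peak of the spike.
        WarmThroughput={
            "ReadUnitsPerSecond": 12000,
            "WriteUnitsPerSecond": 25000,  # the 25,000-writes-per-second spike above
        },
    )
    ```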

    Scenario 2

    You provision your table with 500 WCU, but you expect it to provide more than a single partition? With a single partition, you are correct that you can only burst to its hard limit, which is 1000 WCU.

    If you scale beyond 1000 WCU, DynamoDB will add new partitions, which can take anywhere from a couple of seconds to a couple of minutes. However, just like in scenario 1, if you expect to reach a peak of 30,000 WCU, pre-warm your table to that amount (see the sketch below). Then, when your auto scaling needs more than 1000 WCU, it can scale instantly.
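
    As a sketch of that pre-warming step, assuming UpdateTable also accepts the WarmThroughput parameter (as recent SDK versions do) and an existing table hypothetically named "events":

    ```python
    import boto3

    dynamodb = boto3.client("dynamodb")

    # Raise the throughput floor the table can serve instantly,
    # ahead of the anticipated 30,000 WCU peak.
    dynamodb.update_table(
        TableName="events",  # hypothetical table name
        WarmThroughput={
            "ReadUnitsPerSecond": 3000,
            "WriteUnitsPerSecond": 30000,
        },
    )
    ```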


    TL;DR: Pre-warm your DynamoDB table if you expect to scale; that way you can scale seamlessly from the beginning.