amazon-web-services aws-lambda amazon-kinesis

Why is this Lambda function not running more concurrent executions?


I have an AWS Lambda function connected to an AWS Kinesis data stream. The volume of data in the stream has recently scaled up significantly, but the number of concurrent executions of the Lambda function has not increased. "Concurrent executions" always hovers around 2-3, even though you can see that the iterator is backed up with quite a few records waiting to be processed in the queue: (https://i.sstatic.net/wdBhSgY8.png)

I have tried configuring this Lambda function and Kinesis stream with 4 Kinesis shards and a parallelization factor of 10 (10 concurrent batches per shard), and I even tried setting provisioned concurrency to 10 in Lambda. Still, it never goes past 3 concurrent executions.

The AWS documentation says: "For example, when you set ParallelizationFactor to 2, you can have 200 concurrent Lambda invocations at maximum to process 100 Kinesis data shards (though in practice, you may see different values for the ConcurrentExecutions metric)." Why might we see different values for concurrent executions "in practice"? Why are mine stuck at 2-3?

I am sure that I am just failing to understand some basic Lambda concept, but I can't find a good explanation anywhere. What is the variable that triggers Lambda to run more concurrent executions at one time? How can I get my Lambda function to run more concurrent executions?

Details on my Lambda function and Kinesis stream:

The Lambda function runs on Python 3.9.

(https://i.sstatic.net/zOiRUoq5.png)

(https://i.sstatic.net/bmO5z6cU.png)

(https://i.sstatic.net/Yub4Vjx7.png)

(https://i.sstatic.net/ZfI7N4mS.png)


Solution

  • I was able to determine through further research that the limiting variable for the number of Kinesis shards and concurrent Lambda executions was the number of partition keys in my Kinesis stream.

    All records with the same partition key will always go to the same Kinesis shard. Thus, even if you have 100 shards, if you have 1 partition key in your records, then you will only ever be using 1 shard. Even if your data exceeds the capacity of this shard, Kinesis will not use another shard.
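To make the key-to-shard mapping concrete, here is a minimal sketch. Kinesis maps each partition key to a 128-bit integer with MD5 and routes the record to the shard whose hash-key range contains that value. The helper name `shard_for_key` and the assumption that the shards split the hash space evenly (as in a freshly created stream that has not been resharded) are mine, for illustration only.

```python
import hashlib

HASH_SPACE = 2 ** 128  # MD5 produces a 128-bit value


def shard_for_key(partition_key: str, shard_count: int) -> int:
    """Return the index of the shard that would own this partition key.

    Assumption: the shards divide the 128-bit hash space into equal ranges.
    """
    digest = hashlib.md5(partition_key.encode("utf-8")).digest()
    hash_value = int.from_bytes(digest, "big")
    return hash_value * shard_count // HASH_SPACE


# A single partition key lands on exactly one shard, even with 100 shards.
single_key = {shard_for_key("all-servers", 100) for _ in range(1000)}
print(len(single_key))  # 1

# Many distinct keys spread over the available shards.
many_keys = {shard_for_key(f"server-{i}", 4) for i in range(100)}
print(sorted(many_keys))
```

The first set always has one element, which is why adding shards does nothing for a stream that uses a single partition key.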

    Similarly, Lambda refuses to process multiple batches of records with the same partition key concurrently in order to maintain the correct order of the events in the stream. If Lambda processed 100 batches of records with the same partition key concurrently, it could no longer guarantee that it would be processing them in the correct order.

    Given that all records for one partition key go to the same Kinesis shard and that Lambda never processes multiple batches for the same partition key concurrently, we may wonder: why is there a setting for "concurrent executions per shard" (ParallelizationFactor) in Lambda? The answer is that multiple different partition keys can all go to a single Kinesis shard. Kinesis assigns each record to a shard by hashing its partition key, so, for example, 3 different partition keys in your incoming data could all hash to the same shard, provided their combined volume stays within that shard's write limit (1 MB/second or 1,000 records/second). Then, because Lambda can process different partition keys concurrently, it can run up to 3 concurrent batches for that 1 shard (one for each partition key on that shard), capped by the parallelization factor.
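The reasoning above can be turned into a rough upper bound on concurrency. This is a simplified model, not Lambda's actual scheduler: the function name `max_concurrent_batches` and the input mapping are hypothetical, and the real number can be lower because Lambda decides internally how keys are grouped into concurrent batches.

```python
from collections import defaultdict


def max_concurrent_batches(key_to_shard: dict, parallelization_factor: int) -> int:
    """Upper bound on concurrent batches under a simplified model.

    Each shard can run at most one batch per distinct partition key,
    and never more than `parallelization_factor` batches at once.
    """
    keys_per_shard = defaultdict(set)
    for key, shard in key_to_shard.items():
        keys_per_shard[shard].add(key)
    return sum(
        min(len(keys), parallelization_factor)
        for keys in keys_per_shard.values()
    )


# Three keys on one shard, factor 10: at most 3 concurrent batches.
print(max_concurrent_batches({"a": 0, "b": 0, "c": 0}, 10))  # 3

# One key only: a single batch at a time, whatever the settings.
print(max_concurrent_batches({"only-key": 0}, 10))  # 1

# 100 keys spread evenly over 4 shards, factor 10: capped at 4 * 10.
print(max_concurrent_batches({f"server-{i}": i % 4 for i in range(100)}, 10))  # 40
```

With only a handful of distinct partition keys in the stream, the bound stays at a handful no matter how many shards or how high the parallelization factor is set.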

    In conclusion, to run concurrent executions in Lambda, you have to use multiple different partition keys in your incoming data. Similarly, to make use of multiple shards in Kinesis, you need enough distinct partition keys for their hashes to spread across the shards; a single key stays on one shard no matter how much data it carries.

    In my case, I am receiving data from about 100 different servers, so I am going to include something like the server ID in the partition key so that every server has its own partition key. Lambda can then process the events of different servers concurrently, up to the limit of shards × parallelization factor (4 × 10 = 40 with the settings I described in the question).
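A producer along these lines might look like the sketch below. `put_record` with `StreamName`, `Data`, and `PartitionKey` is the real boto3 Kinesis call; the helper `send_event`, the stream name `example-stream`, the server IDs, and the payload are placeholders I made up. `RecordingClient` is only a stand-in so the sketch runs without AWS access; in real use you would pass `boto3.client("kinesis")` instead.

```python
import json


def send_event(client, stream_name: str, server_id: str, payload: dict) -> dict:
    """Write one event, using the server ID as the partition key."""
    return client.put_record(
        StreamName=stream_name,
        Data=json.dumps(payload).encode("utf-8"),
        PartitionKey=server_id,  # one ordered "queue" of events per server
    )


class RecordingClient:
    """Stand-in for boto3.client("kinesis"); it only records the calls."""

    def __init__(self):
        self.calls = []

    def put_record(self, **kwargs):
        self.calls.append(kwargs)
        return {}


client = RecordingClient()
for server_id in ("server-001", "server-002"):
    send_event(client, "example-stream", server_id, {"cpu": 0.5})

print([call["PartitionKey"] for call in client.calls])  # ['server-001', 'server-002']
```

Events from the same server keep their order because they share a partition key, while events from different servers are free to be processed in parallel.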

    The partition key tells Kinesis and Lambda: "every record with this partition key belongs to one ordered queue of events; please keep these records in order by writing them to the same Kinesis shard and never processing them concurrently in Lambda." If you want concurrent processing and multiple shards, give a different partition key to each queue of events.

    The following answers, when synthesized together, provided the necessary information to answer my question:

    What is partition key in AWS Kinesis all about?

    Parallelization factor: AWS Kinesis data streams to Lambda