amazon-web-services, aws-lambda, concurrency, amazon-sqs

Lambda with SQSEvent & large batch size invokes multiple instances each handling few items


A bit of background: I'm using Serverless and .NET to create a Lambda with an SQS trigger. The event trigger is set with a batch size of 10k and a wait time (batch window, i.e. MaximumBatchingWindowInSeconds) of 30 seconds. The queue's visibility timeout is set to almost 16 minutes.

I set the Lambda's reserved concurrency to 1 and ran a test where I sent 100 items to the queue, hoping to see a single Lambda invocation containing exactly those 100 items.

The problem was that it split the items in the queue and invoked the Lambda five times instead, so five packages were created instead of the one package I wanted. (FYI, the Lambda writes packages of the messages to S3. I want fewer, larger packages.)

Now the question: is this the expected behavior? And if so, why did it settle for 15 items when I've configured it to accumulate up to 10k?

According to the AWS docs, Lambda can grab fewer messages than the batch size if the payload is larger than 256 KB, but my messages are very small and 100 of them are nowhere near 256 KB, so that can't be the cause.

Suggestions for alternative ways of dealing with this are also welcome. Right now I'm thinking of running an EventBridge scheduler that invokes a Lambda, which calls the SQS ReceiveMessage API itself and creates a single package, but then I also have to make sure to properly delete the messages from the queue afterwards.
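A rough sketch of that scheduled-drain idea, in Python for brevity rather than .NET. It assumes `sqs` is a boto3-style SQS client (something like `boto3.client("sqs")`) exposing `receive_message` and `delete_message_batch`; `drain_queue`, `delete_messages`, `build_single_package` and the `write_package` callback are hypothetical names, and the caps of 10,000 messages and 30 seconds simply mirror the trigger settings above. It is an illustration under those assumptions, not a hardened implementation.

```python
import time


def drain_queue(sqs, queue_url, max_messages=10_000, max_seconds=30.0, wait_seconds=1):
    """Receive messages until a long poll comes back empty, a cap is hit, or time runs out.

    `sqs` is assumed to be a boto3-style SQS client.
    """
    messages = []
    deadline = time.monotonic() + max_seconds
    while len(messages) < max_messages and time.monotonic() < deadline:
        response = sqs.receive_message(
            QueueUrl=queue_url,
            # ReceiveMessage returns at most 10 messages per call.
            MaxNumberOfMessages=min(10, max_messages - len(messages)),
            # Long polling, so that an empty response is a reasonable "drained" signal.
            WaitTimeSeconds=wait_seconds,
        )
        batch = response.get("Messages", [])
        if not batch:
            break
        messages.extend(batch)
    return messages


def delete_messages(sqs, queue_url, messages):
    """Delete received messages in chunks of 10; return the MessageIds that failed."""
    failed = []
    for start in range(0, len(messages), 10):
        chunk = messages[start:start + 10]
        response = sqs.delete_message_batch(
            QueueUrl=queue_url,
            Entries=[
                {"Id": str(index), "ReceiptHandle": message["ReceiptHandle"]}
                for index, message in enumerate(chunk)
            ],
        )
        for failure in response.get("Failed", []):
            failed.append(chunk[int(failure["Id"])]["MessageId"])
    return failed


def build_single_package(sqs, queue_url, write_package):
    """Drain the queue, write one package, then delete what was packaged."""
    messages = drain_queue(sqs, queue_url)
    if not messages:
        return 0, []
    # Hypothetical callback: e.g. a single S3 upload of all message bodies.
    write_package([message["Body"] for message in messages])
    # Delete only after the package has been written, so a failure leaves the
    # messages to reappear once the visibility timeout expires.
    failed = delete_messages(sqs, queue_url, messages)
    return len(messages), failed
```

The whole run has to finish within the queue's visibility timeout, otherwise messages that were received but not yet deleted become visible again and may end up in a later package.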

I'm a bit clueless here, I'd appreciate any ideas you guys have. Thanks.


Solution

  • Problem was that it separated the items in the queue and invoked the lambda five times instead, causing five packages to be created as part of the lambda's functionality instead of the one package I wanted.

    I think this is probably because Lambda uses five polling threads to poll SQS. From the AWS blog:

    Lambda service will begin polling the SQS queue using five parallel long-polling connections.

    So even though you had a reserved concurrency of 1, Lambda still uses the five polling threads (you can't control that), and your SQS messages were distributed across them. Each thread then invoked your Lambda function, one after another, resulting in the five invocations you observed.
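To make the arithmetic concrete, here is a toy model of that explanation in Python. It is not how the Lambda service is implemented: the round-robin distribution and the `simulate_pollers` function are assumptions made purely for illustration, and real pollers will pick up uneven shares. The point it shows is that the batch size is an upper bound per poller, not a target for the queue as a whole.

```python
from itertools import cycle


def simulate_pollers(message_count, poller_count=5, batch_size=10_000):
    """Toy model: each poller buffers its share and invokes once per batch.

    Messages are handed out round-robin (an assumption). When the batching
    window ends, every poller that holds messages triggers its own invocation.
    """
    buffers = [[] for _ in range(poller_count)]
    for message_id, buffer in zip(range(message_count), cycle(buffers)):
        buffer.append(message_id)

    invocations = []
    for buffer in buffers:
        # A poller never sends more than batch_size messages per invocation.
        for start in range(0, len(buffer), batch_size):
            invocations.append(buffer[start:start + batch_size])
    return invocations


batches = simulate_pollers(100)
print(len(batches), [len(batch) for batch in batches])
# prints: 5 [20, 20, 20, 20, 20]
```

With a reserved concurrency of 1 those five batches would simply be processed one after another, which is consistent with the five packages seen in the test.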