We're using a Lambda function at viewer request in all relevant behaviours, which cover the majority of requests (all with caching disabled). We've been checking how the Lambda performed during Black Friday, and we discovered that at some point it was invoked about 34K times, versus 9M requests to the CDN.
I checked throttled requests, but they seem to be 0 all the time.
So how does it really work? Does AWS optimise invocations, based on expected responses?
For example, for certain subpaths the lambda just returns the original request.
Can AWS cache that stuff internally and save invocations?
TL;DR: You were right. AWS can "cache that stuff internally and save invocations", especially for cases where you return the original request for the same subpaths.
FULL ANSWER
The difference you're spotting between Lambda invocations and CDN requests is probably down to how Amazon CloudFront works with Lambda@Edge functions.
Even if you’re using a Lambda function at the viewer request stage with caching turned off, it doesn’t guarantee the function will run for every single request. AWS is clever about this and tends to optimise invocations based on what it reckons the response will be, especially with Lambda@Edge.
In your case, where the Lambda function just forwards the original request for certain subpaths, AWS can cache that behaviour behind the scenes. CloudFront is built to cut down on Lambda invocations if it figures out that the function’s output won’t change much for similar requests.
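To make the "just forwards the original request" case concrete, here is a minimal sketch of a viewer-request handler in Python. The path prefixes and the custom header name are hypothetical, since the question doesn't show the actual function; only the event shape (`Records[0].cf.request`) follows the Lambda@Edge event structure.

```python
# Hypothetical prefixes; the original post does not name the subpaths.
PASSTHROUGH_PREFIXES = ("/static/", "/health")


def lambda_handler(event, context):
    """Viewer-request handler: pass some paths through, tag the rest."""
    # CloudFront delivers the viewer request under Records[0].cf.request.
    request = event["Records"][0]["cf"]["request"]

    if request["uri"].startswith(PASSTHROUGH_PREFIXES):
        # Returning the request object unmodified lets CloudFront
        # continue processing it as if the function had done nothing.
        return request

    # Illustrative change only (assumed header name): mark other requests.
    request["headers"]["x-processed-by-lambda"] = [
        {"key": "X-Processed-By-Lambda", "value": "true"}
    ]
    return request
```

For the pass-through prefixes, the function's output is identical to its input, which is the kind of behaviour described above.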
This kind of optimisation works particularly well for functions that either don’t tweak the request at all or only do so based on simple criteria like specific URL paths. CloudFront remembers the result of earlier invocations and applies the same logic without calling the Lambda function every single time.
The fact that you’re not seeing any throttled requests is consistent with this. If CloudFront were invoking the Lambda for every request during busy periods like Black Friday, you’d be far more likely to see some throttling.
It’s worth mentioning that while this optimisation keeps things speedy and cuts costs, it doesn’t mess with your actual CDN content caching. If you’ve disabled caching for the content, that’s still how it’ll work.
This behaviour is baked into AWS and isn’t something you can really tweak, but it’s usually a good thing since it lowers latency and costs while keeping everything running as expected.
If you absolutely need the Lambda to trigger for every single request, no exceptions, you’d have to include something unique like a timestamp or ID in each request. Just be aware this could bump up your Lambda costs and might slow things down when traffic is heavy.
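As a rough sketch of the "something unique in each request" idea, the client could append a per-request query parameter before calling the CDN. The helper name `with_unique_param` and the parameter name `_rid` are made up for illustration, and whether this actually changes the invocation count is something you'd need to confirm against your own metrics.

```python
import time
import uuid
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def with_unique_param(url, param="_rid"):
    """Return url with a per-request unique query parameter appended."""
    parts = urlsplit(url)
    # Keep any existing query parameters, including blank ones.
    query = parse_qsl(parts.query, keep_blank_values=True)
    # Millisecond timestamp plus a random suffix, so collisions between
    # two requests are extremely unlikely.
    unique = f"{int(time.time() * 1000)}-{uuid.uuid4().hex[:8]}"
    query.append((param, unique))
    return urlunsplit(parts._replace(query=urlencode(query)))


# Example usage with a placeholder hostname:
print(with_unique_param("https://cdn.example.com/products?page=2"))
```

Bear in mind that the extra parameter also reaches the origin unless the function or origin strips it, so it's worth testing on a small share of traffic first.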