
How to get consistent access to the AWS Lambda payload across various invocation methods


Depending on the AWS docs that I read, I see different methods to access the payload of AWS Lambda functions, leading to different and incompatible implementations.

Typically, according to the AWS documentation and practical experiments, taking the example of a Lambda that calculates an area from a length and a width, we would need different implementations to support these two invocations:

aws lambda invoke  --function-name $MY_LAMBDA_NAME \
    --payload '{"length": 10, "width": 10}' \
    --cli-binary-format raw-in-base64-out /dev/stdout

And:

curl $MY_LAMBDA_URL -H 'Content-Type: application/json' \
    -d '{"length": 10, "width": 10}'

For instance, in Getting started with Lambda, the event is a JSON document that contains the payload. The documentation specifies that "If your function is invoked by another AWS service, the event object contains information about the event that caused the invocation". The example implementation provided works as documented when invoking the lambda through the console, aws cli, etc. It doesn't work when invoking the lambda through curl, postman, etc. Accessing length and width looks like this:

def lambda_handler(event, context):
    length = event["length"]
    width = event["width"]
    ...

In other AWS documentation pages, the payload is a string accessible through the event body. One such example is this Tutorial: Creating a Lambda function with a function URL. The example implementation provided works when invoking the lambda through curl, postman, etc. It will however fail when invoking the lambda through the console, aws cli, etc. In this case, accessing length and width looks like this (conceptually converting from node to python):

import json

def lambda_handler(event, context):
    # requires more work to check for encoding, etc.
    body = json.loads(event["body"])
    length = body["length"]
    width = body["width"]
    ...

Besides doing nothing, there are two obvious ways that I can think of to address that discrepancy:

  1. document the lambda to tell the user if the lambda is intended to be used via other AWS services or not, and let those who miss that part of the doc figure out the reason why specific keys aren't available in the event and why the behavior varies with the context
  2. add extra code to support both use cases, based on the content of the event (basically checking if it contains a body or not, supporting the various encodings, etc.)
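
For reference, the second option could look roughly like the sketch below. It assumes that HTTP-style events (function URL, API Gateway proxy) can be recognized by a requestContext key and carry the payload as a string in body, possibly base64-encoded as flagged by isBase64Encoded, while direct invocations pass the payload as the event itself. The helper name extract_payload and the returned {"area": ...} shape are hypothetical, chosen only for illustration.

```python
import base64
import json


def extract_payload(event):
    """Return the caller's JSON payload as a dict, however the function was invoked."""
    # Assumption: HTTP-style events are identified by a "requestContext" key
    # and wrap the payload in a "body" string.
    if isinstance(event, dict) and "requestContext" in event:
        body = event.get("body") or "{}"
        if event.get("isBase64Encoded"):
            body = base64.b64decode(body).decode("utf-8")
        return json.loads(body)
    # Direct invocation (console, AWS CLI, SDK): the event is the payload itself.
    return event


def lambda_handler(event, context):
    payload = extract_payload(event)
    # Hypothetical response shape, for illustration only.
    return {"area": payload["length"] * payload["width"]}
```

Keeping the detection in one small helper at least confines the branching to a single place, so the rest of the handler only ever sees length and width.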

The first option is unsatisfactory because it will cause unnecessary maintenance headaches. The second option is clunky and might cause a WTF moment for developers who haven't read that part of the doc.

What are the reasonable options to address this discrepancy? Are there parts of the AWS documentation that provide good recommendations? Are there ways to configure AWS lambdas in order to remove the discrepancy? Are there ways to invoke the lambda that lead to consistent behaviors whether it's invoked through another AWS service or not?


Solution

  • AWS Lambda functions are typically written with explicit knowledge of the data that is being passed to them via the event. It is rare that you would write a function that receives different types of input.

    This is just like a normal 'function' you would have in a Python program -- functions are written to take specific inputs and perform specific actions. It is rare that you'd write a function in Python that has to figure out what inputs it has been given and what it is meant to do with those inputs.

    In your examples, if the function is meant to calculate the area of a rectangle, then it would be expecting length and width as input. It would not know how to handle an event generated by Amazon S3 when a new object is created, nor should it -- the purpose of the function is to calculate the area of a rectangle, not to respond to the creation of new objects.

    Just like a function definition in any programming language, knowing the inputs is very important to be able to write the function.

    If you are confused about what is being passed to the function, you can add a print(event) at the start of the lambda_handler() function. This will output the contents of the event into the log. You can use Amazon CloudWatch Logs to view the log and see what was passed to the function.

    You might also be interested in Announcing AWS Lambda Function URLs: Built-in HTTPS Endpoints for Single-Function Microservices | AWS News Blog. This gives a way of invoking a Lambda function from the Internet without using API Gateway.
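
The print(event) suggestion above can be sketched as follows. This is a minimal illustration, not a recommended production handler: serializing with json.dumps is just one way to get a readable single-line log entry, and the received_keys return value is hypothetical, added only so the effect is visible when invoking the function both ways.

```python
import json


def lambda_handler(event, context):
    # Log the raw event so its exact shape shows up in the function's log.
    # default=str is a fallback for any value that is not JSON-serializable.
    print(json.dumps(event, default=str))
    # Hypothetical return value: the top-level keys that were received.
    return {"received_keys": sorted(event)}
```

Invoking this once with the AWS CLI and once with curl against the function URL should make the difference between the two event shapes visible in the log.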