python amazon-web-services amazon-s3 aws-lambda python-3.8

AWS Lambda Function Not Producing Expected Result


I am working on an AWS Lambda function for an end-to-end data engineering project involving YouTube data analysis. The function is designed to read JSON data from an S3 bucket, process it, and write the results back to another S3 bucket using AWS Glue.

I have set up the environment variables, and the S3 buckets are created. However, when I test the Lambda function using the S3-put option, the execution result is not as expected. The response I receive is:

{
  "statusCode": 200,
  "body": "\"Hello from Lambda!\""
}

This is not the expected result, and I suspect there might be an issue with my Lambda function. I have verified the environment variables, IAM permissions, and the input event JSON. Could someone please review my Lambda function code and help me identify the issue?

import awswrangler as wr
import pandas as pd
import urllib.parse
import os

os_input_s3_cleansed_layer = os.environ['s3_cleansed_layer']
os_input_glue_catalog_db_name = os.environ['glue_catalog_db_name']
os_input_glue_catalog_table_name = os.environ['glue_catalog_table_name']
os_input_write_data_operation = os.environ['write_data_operation']


def lambda_handler(event, context):
    # Get the bucket name and the URL-decoded object key from the S3 event
    bucket = event['Records'][0]['s3']['bucket']['name']
    key = urllib.parse.unquote_plus(event['Records'][0]['s3']['object']['key'], encoding='utf-8')
    try:

        # Creating DF from content
        df_raw = wr.s3.read_json('s3://{}/{}'.format(bucket, key))

        # Extract required columns:
        df_step_1 = pd.json_normalize(df_raw['items'])

        # Write to S3
        wr_response = wr.s3.to_parquet(
            df=df_step_1,
            path=os_input_s3_cleansed_layer,
            dataset=True,
            database=os_input_glue_catalog_db_name,
            table=os_input_glue_catalog_table_name,
            mode=os_input_write_data_operation
        )

        return wr_response
    except Exception as e:
        print(e)
        print('Error getting object {} from bucket {}. Make sure they exist and your bucket is in the same region as this function.'.format(key, bucket))
        raise e

I have also updated the except block to provide more detailed error messages, and the logs show:

Function Logs
START RequestId: 0589a447-4f50-4dc9-b58e-9d0c7ff1b2de Version: $LATEST
END RequestId: 0589a447-4f50-4dc9-b58e-9d0c7ff1b2de
REPORT RequestId: 0589a447-4f50-4dc9-b58e-9d0c7ff1b2de  Duration: 1.24 ms   Billed Duration: 2 ms   Memory Size: 128 MB Max Memory Used: 39 MB

Solution

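  • The response you are getting, status code 200 with the body "Hello from Lambda!", is what the default function template in the Lambda console returns, so the test is still running that template rather than your code. The logs point the same way: the run took 1.24 ms and printed nothing. If you write or edit your function code in the AWS Lambda console, you need to deploy your changes before you can run them.

    When there are undeployed changes, the Deploy button is available to press. Deploy, then run the S3-put test again.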

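One way to confirm which code a test actually ran is to print a marker at the top of the handler and look for it in Function Logs. The sketch below is only an illustration: CODE_VERSION, the bucket name, and the object key are hypothetical placeholders, and the event is trimmed down to the fields the handler in the question reads. The awswrangler calls are left out so the snippet also runs locally.

```python
import urllib.parse

# Hypothetical marker; change it on each edit so the logs show which version ran.
CODE_VERSION = "cleansing-v1"


def lambda_handler(event, context):
    # This line shows up in Function Logs only if this code was actually deployed.
    print(f"lambda_handler version={CODE_VERSION}")

    # Same lookups as in the question: bucket name and URL-decoded object key.
    s3_info = event["Records"][0]["s3"]
    bucket = s3_info["bucket"]["name"]
    key = urllib.parse.unquote_plus(s3_info["object"]["key"], encoding="utf-8")
    print(f"bucket={bucket} key={key}")

    return {"version": CODE_VERSION, "bucket": bucket, "key": key}


# Trimmed-down S3 put event with placeholder names, for a quick local check.
sample_event = {
    "Records": [
        {
            "s3": {
                "bucket": {"name": "example-raw-bucket"},
                "object": {
                    "key": "youtube/raw_statistics_reference_data/region%3Dca/CA_category_id.json"
                },
            }
        }
    ]
}

result = lambda_handler(sample_event, None)
```

If the marker line is missing from the logs after a test, as in the log output in the question, the function is most likely still running the previously deployed code.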