amazon-web-servicesamazon-s3aws-lambdaamazon-vpc

Adding AWS Lambda with VPC configuration causes timeout when accessing S3


I am trying to access S3 and resources on my VPC from AWS Lambda but since I configured my AWS Lambda to access VPC it's timing out when accessing S3. Here's the code

from __future__ import print_function

import boto3
import logging
import json

print('Loading function')

s3 = boto3.resource('s3')

import urllib

def lambda_handler(event, context):
    logging.getLogger().setLevel(logging.INFO)
    # Get the object from the event and show its content type
    bucket = event['Records'][0]['s3']['bucket']['name']
    key = urllib.unquote_plus(event['Records'][0]['s3']['object']['key']).decode('utf8')
    print('Processing object {} from bucket {}. '.format(key, bucket))
    try:
        response = s3.Object(bucket, key)
        content = json.loads(response.get()['Body'].read())
        # with table.batch_writer() as batch:
        for c in content:
            print('     Processing Item : ID' + str(c['id']))
            # ##################
            # Do custom processing here using VPC resources
            # ##################
    except Exception as e:
        print('Error while processing object {} from bucket {}. '.format(key, bucket))
        print(e)
        raise e

I've set my subnets and security groups with appropriate Outbound rules to access internet as shown below but my Lambda simply times out when accessing S3.

enter image description here

enter image description here

Here's a sample of test input as well

# Test Event Configuration
{
  "Records": [
    {
      "awsRegion": "us-east-1",
      "eventName": "ObjectCreated:Put",
      "eventSource": "aws:s3",
      "eventTime": "2016-02-11T19:11:46.058Z",
      "eventVersion": "2.0",
      "requestParameters": {
        "sourceIPAddress": "54.88.229.196"
      },
      "responseElements": {
        "x-amz-id-2": "ljEg+Y/InHDO8xA9c+iz6DTKKenmTaGE9UzHOAabarRmpDF1z0eUJBdpGi37Z2BU9nbTh4p7oZg=",
        "x-amz-request-id": "3D98A2325EC127C6"
      },
      "s3": {
        "bucket": {
          "arn": "arn:aws:s3:::social-gauge-data",
          "name": "social-gauge-data",
          "ownerIdentity": {
            "principalId": "A1NCXDU7DLYS07"
          }
        },
        "configurationId": "b5540417-a0ac-4ed0-9619-8f27ba949694",
        "object": {
          "eTag": "9c5116c70e8b3628380299e39e0e9d33",
          "key": "posts/test/testdata",
          "sequencer": "0056BCDCF1F544BD71",
          "size": 72120
        },
        "s3SchemaVersion": "1.0"
      },
      "userIdentity": {
        "principalId": "AWS:AROAIUFL6WAMNRLUBLL3K:AWSFirehoseDelivery"
      }
    }
  ]
}

Solution

  • Once you enable VPC support in Lambda your function no longer has access to anything outside your VPC, which includes S3. With S3 specifically you can use VPC Endpoints to solve this. For pretty much anything else outside your VPC, you would need to create a NAT instance or a managed NAT gateway in your VPC to route traffic from your Lambda functions to endpoints outside of your VPC.

    I would read the Lambda VPC support announcement, and pay special attention to the "Things to Know" section at the end.