amazon-web-services  lambda  amazon-emr  livy

AWS - Lambda cannot access Livy endpoint for EMR


For reference, Livy is a REST endpoint used to pull data from a cluster. My Lambda function, which is in the same account as the cluster, always times out when it tries to reach the Livy endpoint used by EMR.

Endpoint: http://ip-xxxx-xx-xxx-xx.ec2.internal:8998/

Error:

     "errorMessage": "HTTPConnectionPool(host='ip-xxx-xx-xxxx-xx.ec2.internal', port=8998): Max retries exceeded with url: /batches (Caused by NewConnectionError('<urllib3.connection.HTTPConnection object at 0x7fa6613150d0>: Failed to establish a new connection: [Errno 110] Connection timed out'))",

I added AmazonEMRFullAccessPolicy_v2 to the Lambda function, but it did not help. What am I missing? Is this the right endpoint to use for internal access?


Solution

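  • First, your HTTP endpoint is <ip>.ec2.internal, so I'm assuming it sits inside a VPC and is not public. Your Lambda must therefore be attached to that same VPC (or to one with a network route to it); a Lambda that is not attached to a VPC cannot reach private addresses.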

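    Second, Livy is reached over a TCP connection on port 8998, so the EMR master node's security group must allow inbound traffic on that port from your Lambda's security group. A connection that times out, rather than being refused, is typical of traffic that is dropped by a security group or has no route.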

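    Lastly, you don't need AmazonEMRFullAccessPolicy_v2, because you are calling Livy's HTTP endpoint directly, not the AWS API. IAM policies govern AWS API calls; they do not open network paths.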
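
For illustration, the two network changes above can be sketched with boto3. This is a minimal sketch under assumptions, not a definitive setup: the helper names (`lambda_vpc_config`, `livy_ingress_rule`, `apply_fix`) and all IDs you pass in are hypothetical, and it assumes the Lambda has its own security group and that you are allowed to add a rule to the security group on the EMR master node.

```python
"""Sketch: attach a Lambda to the EMR cluster's VPC and open Livy's port to it."""

LIVY_PORT = 8998  # port taken from the endpoint in the question


def lambda_vpc_config(subnet_ids, lambda_sg_id):
    """Build the VpcConfig payload that attaches the function to subnets
    in the EMR cluster's VPC, using the Lambda's own security group."""
    return {"SubnetIds": list(subnet_ids), "SecurityGroupIds": [lambda_sg_id]}


def livy_ingress_rule(lambda_sg_id, port=LIVY_PORT):
    """Build an inbound rule for the EMR master security group:
    TCP on the Livy port, allowed only from the Lambda's security group."""
    return {
        "IpProtocol": "tcp",
        "FromPort": port,
        "ToPort": port,
        "UserIdGroupPairs": [{"GroupId": lambda_sg_id}],
    }


def apply_fix(function_name, subnet_ids, lambda_sg_id, emr_master_sg_id):
    """Apply both changes. Needs boto3 and AWS credentials; never run on import."""
    import boto3  # imported here so the helpers above work without boto3

    # 1. Attach the Lambda to the VPC that hosts the EMR cluster.
    boto3.client("lambda").update_function_configuration(
        FunctionName=function_name,
        VpcConfig=lambda_vpc_config(subnet_ids, lambda_sg_id),
    )
    # 2. Let the Lambda's security group reach Livy on the master node.
    boto3.client("ec2").authorize_security_group_ingress(
        GroupId=emr_master_sg_id,
        IpPermissions=[livy_ingress_rule(lambda_sg_id)],
    )
```

Note that once a Lambda is attached to a VPC, its execution role needs permission to manage network interfaces (for example via the AWS managed policy AWSLambdaVPCAccessExecutionRole). That is the IAM piece that matters here, not the EMR policy.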