For reference, Livy is a REST endpoint used to pull data from a cluster. My Lambda function always times out when it tries to reach the Livy endpoint used by EMR, even though both are in the same account.
Endpoint: http://ip-xxxx-xx-xxx-xx.ec2.internal:8998/
Error:
"errorMessage": "HTTPConnectionPool(host='ip-xxx-xx-xxxx-xx.ec2.internal', port=8998):
Max retries exceeded with url: /batches (Caused by NewConnectionError('<urllib3.connection.HTTPConnection object at 0x7fa6613150d0>: Failed to establish a new connection: [Errno 110] Connection timed out'))",
I attached AmazonEMRFullAccessPolicy_v2 to the Lambda, but it did not help. What am I missing? Is this the right endpoint to use for internal access?
First, your HTTP endpoint is <ip>.ec2.internal, which is a private DNS name, so I'm assuming it sits inside a VPC (not public). Your Lambda must therefore be attached to the same VPC.
Second, an HTTP request runs over a TCP connection, so the EMR master node's security group must allow inbound traffic from your Lambda's security group on the Livy port (TCP 8998).
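To tell a network block apart from a Livy problem, a plain TCP probe run from inside the Lambda can help. The sketch below is only an illustration: the function name `check_tcp` and the `host`/`port` event keys are made up for this example. A "timeout" result (like the Errno 110 above) usually points to the VPC or security group setup, while "refused" means the host was reached but nothing is listening on that port.

```python
import socket


def check_tcp(host: str, port: int, timeout: float = 3.0) -> str:
    """Try a bare TCP connect and classify the outcome.

    Returns "open", "timeout", "refused", or "error: <details>".
    """
    try:
        # create_connection resolves the name and performs the TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return "open"
    except (socket.timeout, TimeoutError):
        # No answer at all: packets are most likely being dropped.
        return "timeout"
    except ConnectionRefusedError:
        # The host answered, but no process accepts connections on the port.
        return "refused"
    except OSError as exc:
        # DNS failures, unreachable networks, and similar errors.
        return f"error: {exc}"


def lambda_handler(event, context):
    # "host" and "port" are hypothetical event keys chosen for this sketch.
    host = event["host"]
    port = int(event.get("port", 8998))
    return {"host": host, "port": port, "result": check_tcp(host, port)}
```

Invoking the function with a test event such as `{"host": "<your master node private DNS name>", "port": 8998}` should return "open" once the Lambda is in the VPC and the security group rule is in place.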
Lastly, you don't need AmazonEMRFullAccessPolicy_v2, because you are not calling the AWS API.
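To illustrate why IAM is not involved: a Livy batch submission is an ordinary HTTP POST to `/batches` with a JSON body, with no AWS credentials or SigV4 signing. Here is a minimal sketch using only the standard library. The helper names, the example host, and the S3 path are placeholders I made up, and a real job will probably need more fields than `file`.

```python
import json
import urllib.request


def build_batch_request(livy_url: str, file_path: str) -> urllib.request.Request:
    """Build a POST request for Livy's /batches endpoint."""
    body = json.dumps({"file": file_path}).encode("utf-8")
    return urllib.request.Request(
        url=livy_url.rstrip("/") + "/batches",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def submit_batch(livy_url: str, file_path: str, timeout: float = 5.0) -> dict:
    """Send the request and return Livy's JSON response as a dict."""
    req = build_batch_request(livy_url, file_path)
    # An explicit timeout makes a blocked network path fail fast instead of
    # running until the Lambda's own timeout is hit.
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))


# Hypothetical usage from inside the VPC:
# submit_batch("http://ip-xxxx-xx-xxx-xx.ec2.internal:8998/", "s3://my-bucket/job.py")
```

Nothing in this request is checked against an IAM policy, so whether it succeeds depends only on network reachability.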