The AWS documentation at the link below asks you to allow full outbound internet access on the EMR master security group for a cluster in a private subnet.
However, full outbound access poses risks. What is the rationale behind requiring it?
Below is what I gathered from a conversation with AWS support:
Outbound rules on a security group apply only when your cluster nodes initiate new connections to external IPs (that is, any IP other than localhost or the node's own private IP). This is why we allow unrestricted outbound connections: they are initiated by the node itself.
It is important to understand that while your cluster is launching, it needs connectivity to S3 to download the necessary repositories, upload and download logs, exchange cluster information, and so on. Moreover, the application provisioning phase in EMR depends on the successful configuration of many internal services and components (such as the Resource Manager, NameNode, Node Manager, and DataNode). These operate on different random ports within the cluster itself, so all TCP communication between the master and slave/core node security groups must be allowed. The master node also communicates over SSL for most instance controller traffic and with other Cluster Manager components to configure the necessary software and exchange heartbeat signals, so ports 443 and 80 need to be open.
In addition, Hadoop talks to different applications, each of which runs on its own ports and on different private IP addresses as the cluster adds or removes nodes. We therefore cannot provide a list of specific ports to open for cluster operations, because the port and protocol requirements vary with the applications configured on the EMR cluster. Tasks on the cluster might fail if nodes cannot communicate with each other, or with any external dependency, on the required ports, which include the ephemeral port range.
Therefore, please note that the recommended configuration for managed security group egress rules is 0.0.0.0/0, especially during cluster launch, because restricting it could leave EMR unable to download the required applications and cause cluster provisioning to fail.
However, I understand that you are looking for the minimum recommended outbound rules for the "Amazon EMR–Managed Security Groups" instead of 0.0.0.0/0 (All traffic), as this may pose a security risk.
It is highly advisable not to make any changes during cluster launch. Even after launch, incorrectly configured outbound security group rules can cause problems. You may update the security group rules once the cluster has launched successfully, keeping in mind the considerations below.
Hence, once the cluster has launched successfully, you can try to restrict the outbound rules according to your use case. They could look like this:
Outbound rules for ElasticMapReduce-master configuration:

Type        Protocol  Port Range  Destination
HTTP        TCP       80          0.0.0.0/0
HTTPS       TCP       443         0.0.0.0/0
AllTraffic  TCP       0 - 65535   ElasticMapReduce-master security group ID
AllTraffic  TCP       0 - 65535   ElasticMapReduce-slave security group ID

Outbound rules for ElasticMapReduce-slave configuration:

Type        Protocol  Port Range  Destination
HTTP        TCP       80          0.0.0.0/0
HTTPS       TCP       443         0.0.0.0/0
AllTraffic  TCP       0 - 65535   ElasticMapReduce-master security group ID
AllTraffic  TCP       0 - 65535   ElasticMapReduce-slave security group ID
Note: AllTraffic covers all TCP, UDP, and ICMPv4 traffic to the slave node and master node security groups. For any other application-specific requirement, you may add further ports as needed.
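For anyone who wants to script this, here is a minimal sketch of how the rules above might be expressed with boto3. It is my own illustration, not something AWS support provided. The function names and the security group IDs passed in are hypothetical, and I am assuming "AllTraffic" means all protocols (`IpProtocol` of `-1`), as the note says, rather than TCP only. The rule builder is plain Python; only `apply_restricted_egress` touches AWS, and it assumes the group still has the default allow-all egress rule to revoke.

```python
def build_egress_permissions(master_sg_id, slave_sg_id):
    """Build an IpPermissions list matching the restricted outbound rules."""
    permissions = []
    # HTTP (80) and HTTPS (443) to any destination.
    for port in (80, 443):
        permissions.append({
            "IpProtocol": "tcp",
            "FromPort": port,
            "ToPort": port,
            "IpRanges": [{"CidrIp": "0.0.0.0/0"}],
        })
    # All traffic to both EMR-managed groups.
    # Assumption: "-1" (all protocols) is used, following the note above.
    for sg_id in (master_sg_id, slave_sg_id):
        permissions.append({
            "IpProtocol": "-1",
            "UserIdGroupPairs": [{"GroupId": sg_id}],
        })
    return permissions


def apply_restricted_egress(group_id, master_sg_id, slave_sg_id):
    """Add the restricted rules to group_id, then drop the allow-all rule."""
    import boto3  # imported here so the rule builder works without boto3

    ec2 = boto3.client("ec2")
    # Add the narrower rules first so the nodes never lose connectivity.
    ec2.authorize_security_group_egress(
        GroupId=group_id,
        IpPermissions=build_egress_permissions(master_sg_id, slave_sg_id),
    )
    # Assumption: the default 0.0.0.0/0 all-traffic egress rule is present.
    ec2.revoke_security_group_egress(
        GroupId=group_id,
        IpPermissions=[{"IpProtocol": "-1",
                        "IpRanges": [{"CidrIp": "0.0.0.0/0"}]}],
    )
```

You would call `apply_restricted_egress` once for the master group and once for the slave group, and only after the cluster has finished launching, per the advice above.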
Also, please note that pinpointing exactly which routes an EMR cluster requires is very difficult because there are so many moving parts, which is why it is not recommended. We cannot outline exactly which rules your specific cluster requires, because every cluster differs depending on the applications and integrations used. If you absolutely must do this, you will need to enable VPC flow logs on all ENIs in your EMR subnet and analyze them using CloudWatch Logs Insights or Athena (if you are pushing the logs to S3).
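As a rough illustration of that flow log analysis (my own sketch, not from AWS support), the snippet below tallies accepted traffic leaving the cluster nodes by destination address, port, and protocol. It assumes the records are in the default (version 2) flow log format, where the space-separated fields are version, account-id, interface-id, srcaddr, dstaddr, srcport, dstport, protocol, packets, bytes, start, end, action, and log-status. The function name and `node_ips` parameter are hypothetical. Keep in mind that flow logs also record return traffic for inbound connections, so entries with high destination ports may be replies rather than connections the node initiated.

```python
from collections import Counter


def summarize_outbound(lines, node_ips):
    """Count ACCEPT records sent from node_ips, keyed by
    (destination address, destination port, protocol number)."""
    counts = Counter()
    for line in lines:
        fields = line.split()
        # Skip header lines and anything not in the 14-field default format.
        if len(fields) != 14 or fields[0] == "version":
            continue
        srcaddr, dstaddr = fields[3], fields[4]
        dstport, protocol, action = fields[6], fields[7], fields[12]
        # NODATA/SKIPDATA records carry "-" here and are filtered out
        # because their action is not ACCEPT.
        if action == "ACCEPT" and srcaddr in node_ips:
            counts[(dstaddr, int(dstport), int(protocol))] += 1
    return counts
```

Sorting the result with `counts.most_common()` gives a starting list of destinations and ports to consider when writing narrower egress rules.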
I strongly recommend testing security group changes in a development environment before applying them in production.