amazon-web-services amazon-ec2 amazon-ecs

Cannot reach ECS instance from different ECS instance


I have two ECS clusters, let's call them A and B. All services in these clusters have public endpoints accessible via an ALB. I am able to hit services in cluster A from the internet with no issues. However, I get 504 timeouts whenever I attempt to reach services in cluster A from cluster B.

I have verified that all services in cluster A have no inbound rules that would block traffic. All services in both clusters are in the same VPC.

If I SSH into one of the instances in cluster B, I am able to hit services in cluster A using curl.

However, if I open a shell in a cluster B service container and make the same request via curl, I get a 504 again.

The endpoint I am trying to hit uses HTTPS (port 443), which should be allowed in the configuration of the ECS task.

Additionally, if I run the container locally I am able to curl services in cluster A.


Solution

  • If I'm understanding your question correctly, you can curl successfully from the EC2 host, but not from within the ECS task running on the EC2 host? If that's the case, then I'm guessing you are using awsvpc network mode for the ECS tasks.

    Please note that when you use awsvpc network mode, each ECS task gets its own Elastic Network Interface (ENI), which connects it directly to the VPC's network without going through the EC2 host's network connection. This means that even if the EC2 instance has a public IP assigned and can access the Internet, the ECS tasks running on that instance cannot access the Internet unless they have their own route out, such as a public IP of their own or a NAT Gateway.

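    By default, ECS tasks do not get public IPs assigned to them, even if the subnet has its "auto-assign public IP" setting enabled. The "assign public IP" option is set in the network configuration of the ECS service (or of the run-task call), not in the task definition. As far as I know, it only takes effect for Fargate tasks; tasks that use awsvpc mode on EC2 instances do not receive public IPs, so they have to run in a private subnet that routes through a NAT Gateway.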


    Note that since Service A is a public service, with a DNS name that resolves to a public IP outside of the VPC's CIDR block, any request from Service B to Service A will exit the VPC to reach Service A's public endpoint. This means that even though they are in the same VPC, they do not have a direct connection to each other. You will have to give Service B Internet access (via a NAT Gateway, or via public IP assignment where the launch type supports it) in order for it to reach Service A.


    Instead of giving Service B's tasks their own Internet access, you could alternatively change the network mode of Service B's ECS tasks to bridge or host mode, both of which use the EC2 host's network connection, so the tasks should then be able to reach Service A the same way the host does.

    Another alternative would be to give Service A a private, VPC-only endpoint by adding an internal load balancer to Service A in addition to the public load balancer it currently has, and then configuring Service B to connect to the internal load balancer's endpoint instead of Service A's public endpoint.
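
To see which of the two situations above applies to a given endpoint, one option is to resolve its DNS name from inside the task and check whether the addresses fall inside the VPC's CIDR block. The following is only a minimal sketch using the Python standard library; the CIDR `10.0.0.0/16` and the sample addresses are made-up placeholders, and the helper names `resolve_ips` and `outside_vpc` are hypothetical, not part of any AWS tooling.

```python
import ipaddress
import socket


def resolve_ips(hostname, port=443):
    """Return the distinct IPv4 addresses that DNS gives for hostname."""
    infos = socket.getaddrinfo(
        hostname, port, family=socket.AF_INET, proto=socket.IPPROTO_TCP
    )
    # Each entry is (family, type, proto, canonname, sockaddr);
    # sockaddr[0] is the IP address as a string.
    return sorted({info[4][0] for info in infos})


def outside_vpc(ips, vpc_cidr):
    """Return the addresses that do NOT fall inside the VPC's CIDR block."""
    network = ipaddress.ip_network(vpc_cidr)
    return [ip for ip in ips if ipaddress.ip_address(ip) not in network]


if __name__ == "__main__":
    VPC_CIDR = "10.0.0.0/16"  # assumption: replace with your VPC's CIDR
    # Placeholder addresses; in practice use resolve_ips("<Service A hostname>")
    sample = ["10.0.12.34", "203.0.113.10"]
    print(outside_vpc(sample, VPC_CIDR))  # prints ['203.0.113.10']
```

If the list comes back non-empty for Service A's hostname, requests to it leave the VPC and Service B's tasks need a route to the Internet. An internal load balancer's DNS name should resolve to private addresses inside the VPC, in which case the list would be empty.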