amazon-web-services containers amazon-ecs aws-alb

Problem with setting up a service in AWS ECS


I was trying to set up an ECS service running a container image on a cluster, but could not get the setup working.

I have basically followed the guide on https://docs.aws.amazon.com/AmazonECS/latest/developerguide/create-blue-green.html, except that I was trying to host the containers on EC2 instances.

I wonder if the issue is related to the network mode (I used "awsvpc").

Expectation

The contents of index.html should be shown when I access the ALB link.

Observation

When I tried to access the load balancer link, it returned HTTP 503, and the health check also reported the target as unhealthy.

ALB_Link_HTTP_503

It also seems that ECS keeps re-creating the containers. (Forgive me, as I am still not familiar with ECS.)

Containers_keep_re-creating

I tried to access the container instance directly, but could not reach it either.

Container_instance_link

Container_instance_could_not_reach

I had a look at the ECS agent log (/var/logs/ecs-agent.log) on the container instance; the image appears to have been pulled successfully.

Image_pulled_successfully

The task also appears to have started.


ECS service events

It seems the service kept registering and deregistering the target.

ECS_service_events

The security groups have been set to accept HTTP traffic.

Setup

The Tomcat server in the container starts on port 80.


ECS task definition creation

{
"family": "TestTaskDefinition",
"networkMode": "awsvpc",
"containerDefinitions": [
    {
        "name": "TestContainer",
        "image": "<Image URI>",
        "portMappings": [
            {
                "containerPort": 80,
                "hostPort": 80,
                "protocol": "tcp"
            }
        ],
        "essential": true
    }
],
"requiresCompatibilities": [
    "EC2"
],
"cpu": "256",
"memory": "512",
"executionRoleArn": "<ECS execution role ARN>"
}

ECS service creation

{
"cluster": "TestCluster",
"serviceName": "TestService",
"taskDefinition": "TestTaskDefinition",
"loadBalancers": [
    {
        "targetGroupArn": "<target group ARN>",
        "containerName": "TestContainer",
        "containerPort": 80
    }
],
"launchType": "EC2",
"schedulingStrategy": "REPLICA",
"deploymentController": {
    "type": "CODE_DEPLOY"
},
"networkConfiguration": {
   "awsvpcConfiguration": {
      "assignPublicIp": "DISABLED",
      "securityGroups": [ "sg-0f9b629686ca3bd08" ],
      "subnets": [ "subnet-05f47b367df4f50d4", "subnet-0fd76fc8e47ea3be7" ]
   }
},
"desiredCount": 1
}

Solution

  • Based on the comments.

    To investigate the issue, it was recommended to test the ECS service without the ALB. That test showed that the ALB was treating the ECS service as unhealthy because the application took a long time to start.

    The issue was solved by increasing the ALB health-check grace period (e.g. to 300s).

    Not sure if the EC2 launch type must use "bridge".

    You can use awsvpc on EC2 instances as well, but bridge is easier to use in this case.
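
As a rough illustration of the fix above, here is a minimal Python sketch (standard library only). It assumes you can reach the task over HTTP from inside the VPC, and it assumes the grace period is set through the ECS service field healthCheckGracePeriodSeconds. The function names and the example values are hypothetical and not taken from the original setup. The idea is to measure how long the application takes to answer with HTTP 200, then write a grace period at least that long into the service definition.

```python
import copy
import json
import time
import urllib.request


def measure_startup_seconds(url, timeout=600.0, interval=5.0):
    """Poll `url` until it answers with HTTP 200 and return the elapsed seconds.

    Raises TimeoutError if no HTTP 200 is seen within `timeout` seconds.
    """
    start = time.monotonic()
    while time.monotonic() - start < timeout:
        try:
            with urllib.request.urlopen(url, timeout=interval) as resp:
                if resp.status == 200:
                    return time.monotonic() - start
        except OSError:
            # Covers connection errors, socket timeouts and HTTP error
            # responses such as 503 (urllib's errors derive from OSError).
            pass
        time.sleep(interval)
    raise TimeoutError(f"{url} did not return HTTP 200 within {timeout}s")


def with_grace_period(service_def, seconds):
    """Return a copy of a service definition with the grace period set."""
    patched = copy.deepcopy(service_def)
    patched["healthCheckGracePeriodSeconds"] = int(seconds)
    return patched


if __name__ == "__main__":
    # Hypothetical, trimmed-down service definition for illustration only.
    service = {
        "cluster": "TestCluster",
        "serviceName": "TestService",
        "desiredCount": 1,
    }
    print(json.dumps(with_grace_period(service, 300), indent=4))
```

The printed JSON is only a fragment; in practice the extra field would be merged into the full service definition shown in the question. If the measured startup time is close to the chosen value, it is probably safer to leave some margin.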