amazon-web-services docker amazon-ecs aws-fargate

`aws ecs execute-command` results in `TargetNotConnectedException` `The execute command failed due to an internal error`


I am running a Docker image on an ECS cluster so that I can shell into it and run some simple tests. However, when I run this:

aws ecs execute-command  \
  --cluster MyEcsCluster \
  --task $ECS_TASK_ARN \
  --container MainContainer \
  --command "/bin/bash" \
  --interactive

I get the error:

The Session Manager plugin was installed successfully. Use the AWS CLI to start a session.


An error occurred (TargetNotConnectedException) when calling the ExecuteCommand operation: The execute command failed due to an internal error. Try again later.

I can confirm the task + container + agent are all running:

aws ecs describe-tasks \
  --cluster MyEcsCluster \
  --tasks $ECS_TASK_ARN \
  | jq '.'
      "containers": [
        {
          "containerArn": "<redacted>",
          "taskArn": "<redacted>",
          "name": "MainContainer",
          "image": "confluentinc/cp-kafkacat",
          "runtimeId": "<redacted>",
          "lastStatus": "RUNNING",
          "networkBindings": [],
          "networkInterfaces": [
            {
              "attachmentId": "<redacted>",
              "privateIpv4Address": "<redacted>"
            }
          ],
          "healthStatus": "UNKNOWN",
          "managedAgents": [
            {
              "lastStartedAt": "2021-09-20T16:26:44.540000-05:00",
              "name": "ExecuteCommandAgent",
              "lastStatus": "RUNNING"
            }
          ],
          "cpu": "0",
          "memory": "4096"
        }
      ],

I'm defining the ECS cluster and task definition with the following CDK TypeScript code:

    new Cluster(stack, `MyEcsCluster`, {
        vpc,
        clusterName: `MyEcsCluster`,
    })

    const taskDefinition = new FargateTaskDefinition(stack, `TestTaskDefinition`, {
        family: `TestTaskDefinition`,
        cpu: 512,
        memoryLimitMiB: 4096,
    })
    taskDefinition.addContainer("MainContainer", {
        image: ContainerImage.fromRegistry("confluentinc/cp-kafkacat"),
        command: ["tail", "-F", "/dev/null"],
        memoryLimitMiB: 4096,
        // Some internet searches suggested setting this flag. This didn't seem to help.
        readonlyRootFilesystem: false,
    })

Solution

  • ECS Exec Checker should be able to figure out what's wrong with your setup. Can you give it a try?

    The check-ecs-exec.sh script allows you to check and validate both your CLI environment and ECS cluster/task are ready for ECS Exec, by calling various AWS APIs on behalf of you.
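
As a rough sketch of how the checker could be run against the task from the question: the script takes a cluster name and a task ID, so the helper below strips the ID off the end of a task ARN first. The function names and the `CHECKER_PATH` variable are hypothetical, introduced only for this example, and the sketch assumes you have already downloaded `check-ecs-exec.sh` from the amazon-ecs-exec-checker repository and have the AWS CLI and `jq` installed.

```shell
#!/usr/bin/env bash
# Sketch only: helper names and CHECKER_PATH are illustrative assumptions,
# not part of the checker itself.

# Print the task ID, i.e. the last "/"-separated segment of a task ARN.
# A bare task ID is passed through unchanged.
task_id_from_arn() {
  printf '%s\n' "${1##*/}"
}

# Run the checker for a given cluster and task (ARN or ID).
# CHECKER_PATH (hypothetical) points at your local copy of check-ecs-exec.sh.
run_exec_checker() {
  local cluster="$1" task="$2"
  bash "${CHECKER_PATH:-./check-ecs-exec.sh}" "$cluster" "$(task_id_from_arn "$task")"
}
```

With the names from the question, the call would be `run_exec_checker MyEcsCluster "$ECS_TASK_ARN"`. The checker's output should then show which of its checks your setup passes and which it fails.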