kubernetes, amazon-eks, nlb, aws-nlb

Kubernetes/EKS rolling update causes downtime


We have the following configuration for our service deployed to EKS, but every deployment causes roughly 120s of downtime.

I can successfully make requests to the new pod when I port-forward to it directly, so the pod itself seems fine. It appears to be either the AWS NLB not routing the traffic or something network-related, but I'm not sure, and I don't know where to debug further.

I tried a few things to no avail: added a readinessProbe, increased initialDelaySeconds to 120, switched from an instance target type to an IP target type on the NLB, and reduced the NLB's health check interval (though the change isn't actually applied and it remains at 30s).

Any help would be greatly appreciated!

---
# Autoscaler for the frontend

apiVersion: autoscaling/v1
kind: HorizontalPodAutoscaler
metadata:
  name: my-frontend
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: my-frontend
  minReplicas: 3
  maxReplicas: 8
  targetCPUUtilizationPercentage: 60

---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: my-frontend
  labels:
    app: my-frontend
spec:
  replicas: 3
  strategy:
    rollingUpdate:
      maxSurge: 1
      maxUnavailable: 0
    type: RollingUpdate
  selector:
    matchLabels:
      app: my-frontend
  template:
    metadata:
      labels:
        app: my-frontend
    spec:
      containers:
        - name: my-frontend
          image: ${DOCKER_IMAGE}
          ports:
            - containerPort: 3001
              name: web
          resources:
            requests:
              cpu: "300m"
              memory: "256Mi"
          livenessProbe:
            httpGet:
              scheme: HTTP
              path: /v1/ping
              port: 3001
            initialDelaySeconds: 5
            timeoutSeconds: 1
            periodSeconds: 10
          readinessProbe:
            httpGet:
              scheme: HTTP
              path: /v1/ping
              port: 3001
            initialDelaySeconds: 5
            timeoutSeconds: 1
            periodSeconds: 10
      restartPolicy: Always

---
apiVersion: v1
kind: Service
metadata:
  annotations:
    service.beta.kubernetes.io/aws-load-balancer-type: nlb
    service.beta.kubernetes.io/aws-load-balancer-backend-protocol: http
    service.beta.kubernetes.io/aws-load-balancer-ssl-cert: ${SSL_CERTIFICATE_ARN}
    service.beta.kubernetes.io/aws-load-balancer-ssl-ports: "https"
    service.beta.kubernetes.io/aws-load-balancer-cross-zone-load-balancing-enabled: "true"
    service.beta.kubernetes.io/aws-load-balancer-healthcheck-interval: "10"
    service.beta.kubernetes.io/aws-load-balancer-connection-draining-enabled: "true"
    service.beta.kubernetes.io/aws-load-balancer-connection-draining-timeout: "60"
  name: my-frontend
  labels:
    service: my-frontend
spec:
  ports:
    - name: http
      port: 80
      targetPort: 3001
    - name: https
      port: 443
      targetPort: 3001
  externalTrafficPolicy: Local
  selector:
    app: my-frontend
  type: LoadBalancer

Solution

  • This is most likely caused by the NLB not reacting quickly enough to target changes, which is directly related to your externalTrafficPolicy setting.

    If your application does not make use of the client IP, you can set externalTrafficPolicy to Cluster, or simply remove the field, since Cluster is the default.

    If your application does need to preserve the client IP, you can use the approach discussed in this GitHub issue, which in short requires a blue-green deployment.
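    As a sketch, this is your Service above with the policy switched to Cluster (annotations omitted for brevity; everything else stays as in your manifest):

    ```yaml
    apiVersion: v1
    kind: Service
    metadata:
      name: my-frontend
    spec:
      type: LoadBalancer
      # Cluster (the default) routes through kube-proxy on every node, so the
      # NLB targets keep passing health checks while pods are replaced during
      # a rolling update. The trade-off is losing the client source IP.
      externalTrafficPolicy: Cluster
      selector:
        app: my-frontend
      ports:
        - name: http
          port: 80
          targetPort: 3001
        - name: https
          port: 443
          targetPort: 3001
    ```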