
AWS Load Balancer 502


I have microservices (written in different programming languages) running on an EC2 instance. In production I see occasional 502 Bad Gateway errors when these services call each other. The logs of the called service show no sign that the API was hit at all.

For example, service A calls service B, but nothing in service B's logs indicates that a call came from service A.

Could this be an AWS load balancer issue? Any help would be appreciated. Thanks in advance.

Solution tried: We created HTTP/HTTPS connection agents in each service, but the issue persists.

Update: In the LB logs the API call is logged, but the target response code shows "-" while the LB response code shows 502 or 504. Does that mean the LB cannot handle the traffic, or that my application cannot?

Also, what could be a possible solution?


Solution

  • We had the same problem.

    In our setup, an AWS Application ELB has a target group of 4 EC2 instances. On each of the EC2 instances, there is an Apache2 which forwards to a Tomcat.

    The ELB has a default connection KeepAlive (idle timeout) of 60 seconds. Apache2 has a default connection KeepAlive timeout of 5 seconds. Once a connection has been idle for 5 seconds, Apache2 closes it, while the ELB still considers it usable. If a request comes in at precisely that moment, the ELB accepts it, decides which host to forward it to, and sends it over a connection that Apache is closing at the same instant. The request never reaches the application, and the ELB returns the 502 error code.

    The solution is: when you have cascading proxies/LBs, either align their KeepAlive timeouts or, preferably, make them a little longer the further down the line you get.

    We set the ELB timeout to 60 seconds and the Apache2 timeout to 120 seconds. Problem gone.
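
The rule above can be written down as a small check. This is only a sketch: `find_timeout_mismatches` is a made-up helper, not part of any AWS or Apache tooling, and the hop names and timeout values have to be entered by hand from your own configuration. It walks the chain from the client-facing hop to the backend and reports every hop whose KeepAlive timeout is shorter than that of the hop in front of it.

```python
from typing import List, Tuple

# Hypothetical helper for illustration; it does not query AWS or Apache.
# Each entry is (hop name, idle/KeepAlive timeout in seconds), listed from
# the client-facing hop down to the backend.
Hop = Tuple[str, float]


def find_timeout_mismatches(chain: List[Hop]) -> List[str]:
    """Return one message per hop that times out before its upstream hop."""
    problems = []
    # Compare every hop with the one directly behind it.
    for (up_name, up_timeout), (down_name, down_timeout) in zip(chain, chain[1:]):
        if down_timeout < up_timeout:
            problems.append(
                f"{down_name} ({down_timeout}s) closes idle connections "
                f"before {up_name} ({up_timeout}s) does"
            )
    return problems


if __name__ == "__main__":
    # Values from the answer: the defaults, then the adjusted setup.
    before = [("ELB", 60), ("Apache2", 5)]
    after = [("ELB", 60), ("Apache2", 120)]

    for label, chain in (("before", before), ("after", after)):
        issues = find_timeout_mismatches(chain)
        print(label, "->", issues if issues else "no mismatches")
```

Equal timeouts pass this check because the answer lists aligning them as an option; if you want to enforce the preferred "a little longer" variant, change the comparison to `<=`.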