spring spring-boot spring-cloud netflix-eureka spring-boot-admin

Services sometimes register with Eureka as localhost. Eureka replication


Introduction

In general everything works fine, but I have noticed that sometimes, after a restart, some of our services register with Eureka as localhost.

This makes Spring Boot Admin go crazy, and it starts spamming us with notifications that the services are down.

So we start receiving the following e-mails:

SEQUENCE-SERVICE (52c98f2235a2) is OFFLINE 
Instance 52c98f2235a2 changed status from OFFLINE to OFFLINE 
Status Details
exception
io.netty.channel.AbstractChannel$AnnotatedConnectException
message
Connection refused: no further information: localhost/127.0.0.1:8007
Registration
Service Url http://localhost:8007/ 

Health Url  http://localhost:8007/sequence-service/v1/actuator/health 

Management Url  http://localhost:8007/actuator 

Infrastructure

We have three servers. Two of them each run one Eureka instance along with our microservices.

The third server runs metrics and Spring Boot Admin.

Our Eureka config is basically:

Eureka-0

server.port=7995
eureka.instance.hostname=prod
eureka.instance.appname= discovery-service
eureka.client.registerWithEureka=true
eureka.client.fetchRegistry=true
eureka.client.serviceUrl.defaultZone=http://admin:admin@prod1:7995/eureka/


spring.application.name = discovery-service
security.user.name=admin
security.user.password=admin
spring.security.user.name=admin
spring.security.user.password=admin

endpoints.health.sensitive=false
management.endpoints.web.exposure.include=info, health


spring.profiles.active=prod,mbakTest

eureka.instance.metadata-map.user.name=${security.user.name}
eureka.instance.metadata-map.user.password=${security.user.password}

Eureka-1

server.port=7995
eureka.instance.hostname=prod1
eureka.instance.appname= discovery-service
eureka.client.registerWithEureka=true
eureka.client.fetchRegistry=true
eureka.client.serviceUrl.defaultZone=http://admin:admin@prod:7995/eureka/


spring.application.name = discovery-service
security.user.name=admin
security.user.password=admin
spring.security.user.name=admin
spring.security.user.password=admin

eureka.instance.metadata-map.user.name=${security.user.name}
eureka.instance.metadata-map.user.password=${security.user.password}

Sequence-Service

#Eureka configuration
eureka.client.enabled=true
eureka.client.healthcheck.enabled=true
eureka.client.registerWithEureka=false
eureka.client.fetchRegistry=true
eureka.instance.leaseRenewalIntervalInSeconds=15
eureka.instance.leaseExpirationDurationInSeconds=30
eureka.client.serviceUrl.defaultZone=${EUREKA_SERVICE_URL:http://admin:admin@prod:7995/eureka/,http://admin:admin@prod1:7995}/eureka/

Questions

I have two questions:

1) What I don't understand is why everything is fine most of the time, yet sometimes we get e-mails from Spring Boot Admin telling us a service is down. The service is not actually down; it has registered itself as localhost. This usually happens after a restart, and restarting the service again fixes it.

2) Is this configuration correct and robust? My thinking is that if one Eureka instance or server goes down, the other will take its place.


Solution

  • Spring Cloud relies on a class called InetUtils to determine the hostname. It does so by checking the hostname of the first/highest priority non-loopback network interface.

    If no non-loopback network interface can be found, it uses the hostname configured in spring.cloud.inetutils.default-hostname. The default value of this property is localhost.

    If a non-loopback interface is found but determining its hostname takes longer than 1 second, it falls back to localhost. Beware: in this case you can't override the hostname, because it uses a hardcoded localhost value. You can, however, change the 1-second timeout by configuring the spring.cloud.inetutils.timeout-seconds property.

    In our case, the timeout was the problem. We solved it by increasing the timeout to 5 seconds:

    spring.cloud.inetutils.timeout-seconds=5
    

    To find out the culprit, you can enable trace logging for the InetUtils class:

    logging.level.org.springframework.cloud.commons.util.InetUtils=trace
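
To check outside of Spring whether hostname resolution on a server is slow, the behaviour described above can be approximated with a small standalone program. This is only a rough sketch, not the actual InetUtils implementation: the class name HostnameProbe and its methods are made up for illustration, and the interface selection is simplified to "first address on an interface that is up and not loopback". It runs the lookup on a worker thread, waits up to a timeout, and falls back to a fixed value otherwise.

```java
import java.net.InetAddress;
import java.net.NetworkInterface;
import java.net.SocketException;
import java.util.Collections;
import java.util.Enumeration;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;

// Hypothetical helper for illustration; not part of Spring Cloud.
class HostnameProbe {

    /** Runs the lookup on a worker thread; returns the fallback on timeout or failure. */
    static String resolveWithTimeout(Callable<String> lookup, long timeoutMillis, String fallback) {
        ExecutorService executor = Executors.newSingleThreadExecutor(r -> {
            Thread t = new Thread(r, "hostname-probe");
            t.setDaemon(true); // a stuck lookup must not keep the JVM alive
            return t;
        });
        Future<String> future = executor.submit(lookup);
        try {
            return future.get(timeoutMillis, TimeUnit.MILLISECONDS);
        } catch (Exception e) {
            // Timeout, interruption, or a failed lookup: give up and use the fallback
            future.cancel(true);
            return fallback;
        } finally {
            executor.shutdownNow();
        }
    }

    /** First address on an interface that is up and not loopback, or null if there is none. */
    static InetAddress firstNonLoopbackAddress() throws SocketException {
        Enumeration<NetworkInterface> nics = NetworkInterface.getNetworkInterfaces();
        if (nics == null) {
            return null;
        }
        for (NetworkInterface nic : Collections.list(nics)) {
            if (!nic.isUp() || nic.isLoopback()) {
                continue;
            }
            for (InetAddress address : Collections.list(nic.getInetAddresses())) {
                if (!address.isLoopbackAddress()) {
                    return address;
                }
            }
        }
        return null;
    }

    public static void main(String[] args) throws Exception {
        InetAddress address = firstNonLoopbackAddress();
        if (address == null) {
            System.out.println("No non-loopback interface found");
            return;
        }
        long start = System.nanoTime();
        // 1000 ms mirrors the 1-second default mentioned above
        String hostname = resolveWithTimeout(address::getHostName, 1000, "localhost");
        long elapsedMs = (System.nanoTime() - start) / 1_000_000;
        System.out.println(address.getHostAddress() + " -> " + hostname + " (" + elapsedMs + " ms)");
    }
}
```

If the program prints localhost with an elapsed time close to the timeout on the affected server, slow reverse DNS resolution is a likely cause, and raising spring.cloud.inetutils.timeout-seconds as shown above should help.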