How to improve the eviction policy in the Eureka Service Registry

If you use Ribbon and Eureka in your Spring Boot application, you’ll notice that the default configuration is not optimal. Eureka takes a long time to figure out that a service instance went down unexpectedly and, in the meantime, Ribbon, your load balancer, will keep trying to connect to the dead instance. In other words, the eviction policy does not perform well by default.

On the other hand, the official Eureka documentation discourages changing the leaseRenewalIntervalInSeconds parameter. So what can we do here? This post answers that question.

Netflix Eureka does not deregister instances

Eureka is a great tool for Service Discovery and integrates very well with Spring Boot. We can create a Service Registry just by adding some dependencies and an annotation, and we can connect our clients to register on the server with minimal configuration too. It comes with Ribbon as its Load Balancer, so we can get everything working in a few minutes. Our services will ask the registry for the instances of a given service and will decide by themselves, using Ribbon, which one to connect to.

The problem arises when one of the multiple instances of a service goes down unexpectedly or loses connectivity, without having time to notify the Eureka server. The service registry is based on leases, and every client should renew its lease every N seconds. When the lease expires, Eureka also takes some time to decide that the instance is no longer valid. It’s not a very straightforward mechanism, as you’ll see from the many Internet threads that try to explain it. With the default configuration, my experience is that it can take up to four or five minutes to deregister a dead instance.
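To get a feeling for why the delay is counted in minutes, here is a rough back-of-the-envelope sum written as plain Java. This is only a sketch: the class name is made up, and the individual delays are the defaults I'm assuming for the lease expiration, the server's eviction task, the server's response cache, the client's registry fetch, and Ribbon's server list refresh. They are not taken from the documentation quoted in this post, so check them against the versions you run.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Back-of-the-envelope estimate, not Eureka code.
// Every value below is an ASSUMED default; verify it for your versions.
final class EvictionDelayEstimate {

    // Delays (in seconds) that can stack up before a client stops seeing a dead instance.
    static Map<String, Integer> assumedDelays() {
        Map<String, Integer> delays = new LinkedHashMap<>();
        delays.put("lease expiration on the server", 90);
        delays.put("server eviction task interval", 60);
        delays.put("server response cache refresh", 30);
        delays.put("client registry fetch interval", 30);
        delays.put("Ribbon server list refresh", 30);
        return delays;
    }

    // Worst case: every timer has just fired when the previous step completes.
    static int worstCaseSeconds() {
        return assumedDelays().values().stream().mapToInt(Integer::intValue).sum();
    }

    public static void main(String[] args) {
        assumedDelays().forEach((step, seconds) -> System.out.println(seconds + "s  " + step));
        System.out.println("Worst case: about " + worstCaseSeconds() + "s");
    }
}
```

Under those assumptions the timers add up to about four minutes in the worst case, which is in line with what I observed.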

On the other hand, the official documentation tells us that we shouldn’t change this configuration. My guess is that, since the mechanism to deregister is not easy to understand, you can mess the entire configuration up if you make mistakes.

How to solve it

The approach to solving this problem is based on Ribbon and not Eureka. We can leave the Service Registry alone and let it work as usual, with the optimal configuration, but we can make our clients smarter and let them find out which services are really healthy.

Spring Boot configures Ribbon by default with a Round-Robin strategy for load balancing. It also sets the status-check mechanism to none (NoOpPing), which means that the load balancer will not verify whether the services are still alive. That makes sense, since the Service Registry, Eureka, should be the one that registers and deregisters instances. But we just concluded that Eureka takes too long to do that for our needs.

Adding Ribbon Configuration

We can use the Ribbon functionality to ping services from the Service Registry and apply load balancing depending on the result. To get that working, we need to configure two Spring beans: an IPing to establish the check-status mechanism and an IRule to change the default load balancing strategy.

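import com.netflix.client.config.IClientConfig;
import com.netflix.loadbalancer.AvailabilityFilteringRule;
import com.netflix.loadbalancer.IPing;
import com.netflix.loadbalancer.IRule;
import com.netflix.loadbalancer.PingUrl;
import org.springframework.context.annotation.Bean;

// Deliberately NOT annotated with @Configuration (see the notes below)
public class RibbonConfiguration {

    // Ping each instance at /health; false = the endpoint is not secured
    @Bean
    public IPing ribbonPing(final IClientConfig config) {
        return new PingUrl(false, "/health");
    }

    // Replace the default round-robin rule with one that filters by availability
    @Bean
    public IRule ribbonRule(final IClientConfig config) {
        return new AvailabilityFilteringRule();
    }

}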
  • The PingUrl implementation checks if services are alive. We want to change the default URL and point it to /health since we want to avoid requests to unmapped root contexts. The false flag just indicates that the endpoint is not secured, so the ping uses plain HTTP rather than HTTPS.
  • The AvailabilityFilteringRule is an alternative to the default RoundRobinRule that takes into account the availability being checked by our new pings.
  • One thing that's very important to note (since it's tricky) is that this class is not annotated with @Configuration. It's injected in a different way: we need to reference it from a new annotation added to the main application class: @RibbonClients.
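To see what the ping and the rule give us together, here is a minimal sketch in plain Java. It is not Ribbon code and does not reproduce how Ribbon works internally: PingAwareBalancer, runPings, choose, and the host names are all made up for illustration, and the ping is a stand-in predicate instead of a real HTTP call to /health. The first balancer behaves like the default setup (every server is reported as alive), the second one drops the servers that fail the ping before doing round-robin.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Predicate;

// Simplified illustration only; none of these types are Ribbon classes.
final class PingAwareBalancer {
    private final List<String> allServers;      // what the registry told us
    private final Predicate<String> ping;       // true if the server answered the ping
    private final AtomicInteger counter = new AtomicInteger();
    private volatile List<String> aliveServers;

    PingAwareBalancer(List<String> allServers, Predicate<String> ping) {
        this.allServers = List.copyOf(allServers);
        this.ping = ping;
        this.aliveServers = this.allServers;    // until the first ping run, trust the registry
    }

    // A real load balancer would run this on a timer; here we trigger it by hand.
    void runPings() {
        List<String> alive = new ArrayList<>();
        for (String server : allServers) {
            if (ping.test(server)) {
                alive.add(server);
            }
        }
        aliveServers = alive;
    }

    // Round-robin over the servers that passed the last ping run.
    String choose() {
        List<String> candidates = aliveServers;
        if (candidates.isEmpty()) {
            throw new IllegalStateException("No live servers");
        }
        int index = Math.floorMod(counter.getAndIncrement(), candidates.size());
        return candidates.get(index);
    }

    public static void main(String[] args) {
        List<String> registry = List.of("host-a:8080", "host-b:8080", "host-c:8080");
        String dead = "host-b:8080";            // crashed, but still listed in the registry

        // Like a no-op ping: every server is reported as alive.
        PingAwareBalancer blind = new PingAwareBalancer(registry, server -> true);
        blind.runPings();

        // Stand-in for a real health check against each instance.
        PingAwareBalancer checked = new PingAwareBalancer(registry, server -> !server.equals(dead));
        checked.runPings();

        for (int i = 0; i < 3; i++) {
            System.out.println("blind -> " + blind.choose() + ", checked -> " + checked.choose());
        }
    }
}
```

The blind balancer keeps handing out the dead instance until the registry finally evicts it, while the checked one stops using it right after the next ping run. That is the behavior we're after with the two beans above.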

Application’s main class

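import org.springframework.boot.SpringApplication;
import org.springframework.boot.autoconfigure.SpringBootApplication;
import org.springframework.cloud.netflix.eureka.EnableEurekaClient;
import org.springframework.cloud.netflix.ribbon.RibbonClients;
import org.springframework.cloud.netflix.zuul.EnableZuulProxy;

@EnableZuulProxy
@EnableEurekaClient
// Applies RibbonConfiguration as the default for all Ribbon clients
@RibbonClients(defaultConfiguration = RibbonConfiguration.class)
@SpringBootApplication
public class GatewayApplication {

    public static void main(String[] args) {
        SpringApplication.run(GatewayApplication.class, args);
    }
}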

If we now test again the scenario where multiple instances are registered and one of them goes down, we’ll notice that it takes much less time to detect the unavailable instance.

You can find some other options for load balancing strategy on the official repository. There are implementations that allow us to balance load depending on response time, geographical affinity, etc. The best idea is to design your plan, test it (with some requests to load your system) and then adjust it based on the results.

Code samples

These code samples are taken from one of the versions of the project that is developed throughout the book Learn Microservices with Spring Boot.

Microservices: Logical View

You can find all the source code on GitHub: Microservices V8 - API Gateway

In that example, the client is actually a Zuul gateway. It acts as a load balancer for the services registered in Eureka. You can run the entire system and then try out the different strategies. You can also remove the line with the @RibbonClients annotation to reproduce the error: for a while, your routing will fail because of Eureka’s default way of deregistering instances.

I hope you find it useful! If you have any questions or suggestions, please let me know in the comments.

My book can help you understand all these Microservices Patterns: Service Discovery, Load Balancing, and Routing. Check it out on Amazon.
About Moisés Macero

Software Developer, Architect, and Author.

Amsterdam, The Netherlands https://thepracticaldeveloper.com
