java apache-httpclient-4.x

Apache HttpClient Keep-Alive Strategy for active connections


In an Apache HttpClient with a PoolingHttpClientConnectionManager, does the Keep-Alive strategy change the amount of time that an active connection will stay alive until it will be removed from the connection pool? Or will it only close out idle connections?

For example, if I set my Keep-Alive strategy to return 5 seconds for every request, and I use the same connection to hit a single URL/route once every 2 seconds, will my keep-alive strategy cause this connection to leave the pool? Or will it stay in the pool, because the connection is not idle?


Solution

  • I tested this and confirmed that the Keep-Alive strategy only affects idle connections: a connection is removed from the HttpClient's connection pool only once the Keep-Alive duration has elapsed since its last use. In fact, the Keep-Alive duration is what defines "idle". If the Keep-Alive strategy says to keep connections alive for 10 seconds and we receive a response from the server every 2 seconds, the connection is kept alive until 10 seconds after the last successful response.

    The test that I ran was as follows:

    1. I set up an Apache HttpClient (using a PoolingHttpClientConnectionManager) with the following ConnectionKeepAliveStrategy:

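          import org.apache.http.HeaderElement;
          import org.apache.http.HeaderElementIterator;
          import org.apache.http.conn.ConnectionKeepAliveStrategy;
          import org.apache.http.message.BasicHeaderElementIterator;
          import org.apache.http.protocol.HTTP;

          // Method name is illustrative; keepAliveDuration is a field of the
          // enclosing class holding the fallback keep-alive in seconds.
          private ConnectionKeepAliveStrategy keepAliveStrategy() {
              return (httpResponse, httpContext) -> {
                  // Honor the server's 'Keep-Alive: timeout=N' header if present
                  HeaderElementIterator it = new BasicHeaderElementIterator(
                          httpResponse.headerIterator(HTTP.CONN_KEEP_ALIVE));
                  while (it.hasNext()) {
                      HeaderElement he = it.nextElement();
                      String param = he.getName();
                      String value = he.getValue();
                      if (value != null && param.equalsIgnoreCase("timeout")) {
                          try {
                              return Long.parseLong(value) * 1000; // seconds to milliseconds
                          } catch (NumberFormatException ignore) {
                              // Malformed timeout: fall through to the configured default
                          }
                      }
                  }
                  if (keepAliveDuration <= 0) {
                      return -1; // the connection will stay alive indefinitely.
                  }
                  return keepAliveDuration * 1000;
              };
          }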
      
    2. I created an endpoint in my application which used the HttpClient to make a GET request to a URL whose hostname is resolved through DNS.

    3. I wrote a program to hit that endpoint every 1 second.

    4. I changed my local DNS entry (by editing my /etc/hosts file) so that the hostname the HttpClient was sending GET requests to resolved to a dummy address that would not respond to requests.
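The decision logic of the strategy in step 1 can be exercised without a live server. The sketch below re-implements it over a raw Keep-Alive header string using only the JDK; the class name `KeepAliveHeaderSketch` and the method `keepAliveMillis` are hypothetical and not part of HttpClient, and the simple comma/equals splitting is an assumption that ignores quoted values.

```java
public class KeepAliveHeaderSketch {

    /**
     * Returns the keep-alive duration in milliseconds, or -1 for "indefinitely".
     * keepAliveHeader is the raw header value (may be null if the header is absent);
     * defaultSeconds plays the role of keepAliveDuration in the strategy above.
     */
    public static long keepAliveMillis(String keepAliveHeader, long defaultSeconds) {
        if (keepAliveHeader != null) {
            for (String element : keepAliveHeader.split(",")) {
                String[] pair = element.trim().split("=", 2);
                if (pair.length == 2 && pair[0].trim().equalsIgnoreCase("timeout")) {
                    try {
                        return Long.parseLong(pair[1].trim()) * 1000;
                    } catch (NumberFormatException ignore) {
                        // Malformed timeout: fall through to the default below.
                    }
                }
            }
        }
        // No usable header: use the configured default, or -1 if none is set.
        return defaultSeconds <= 0 ? -1 : defaultSeconds * 1000;
    }

    public static void main(String[] args) {
        System.out.println(keepAliveMillis("timeout=5, max=100", 10));
        System.out.println(keepAliveMillis(null, 10));
        System.out.println(keepAliveMillis(null, -1));
    }
}
```

A server-supplied timeout takes precedence over the configured default, which is why the test relied on a server that sent no Keep-Alive header.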

    When I set keepAliveDuration to -1, the HttpClient kept sending requests to the old IP address even after the DNS change. I kept this test running for 1 hour, and it continued to send requests to the old IP address from the stale DNS entry the whole time. This would have continued indefinitely, since my ConnectionKeepAliveStrategy was configured to keep the connection to the old address alive indefinitely.

    When I set keepAliveDuration to 10 and then changed my DNS, requests kept succeeding continuously for about an hour, because they were still going over the existing connection to the old IP address. Only after I stopped my load test and waited 10 seconds did the client open a new connection. This means that the connection was removed from the HttpClient's connection pool 10 seconds after the last successful response from the server.
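Both observations are consistent with an expiry deadline that is reset every time the connection is returned to the pool. The following is a simplified model of that behavior, not HttpClient's actual code: `KeepAliveExpiryModel` and `PooledConnection` are hypothetical names, and time is passed in explicitly so the example runs without waiting.

```java
import java.util.concurrent.TimeUnit;

public class KeepAliveExpiryModel {

    /** Hypothetical stand-in for a pooled connection; not an HttpClient class. */
    public static final class PooledConnection {
        private long expiryMillis = Long.MAX_VALUE;

        /** Called when a response completes and the connection goes back to the pool. */
        public void release(long nowMillis, long keepAliveMillis) {
            // A non-positive keep-alive means "no limit", like returning -1 from the strategy.
            expiryMillis = keepAliveMillis > 0 ? nowMillis + keepAliveMillis : Long.MAX_VALUE;
        }

        public boolean isExpired(long nowMillis) {
            return nowMillis >= expiryMillis;
        }
    }

    public static void main(String[] args) {
        long keepAlive = TimeUnit.SECONDS.toMillis(10);
        PooledConnection conn = new PooledConnection();
        conn.release(0, keepAlive);

        // One request per second for a minute: each release pushes the deadline out again.
        long now = 0;
        for (int second = 1; second <= 60; second++) {
            now = TimeUnit.SECONDS.toMillis(second);
            if (conn.isExpired(now)) {
                throw new IllegalStateException("unexpected expiry under load");
            }
            conn.release(now, keepAlive);
        }

        // After the load stops, the connection expires 10 seconds after its last use.
        System.out.println("expired 9s after last use:  " + conn.isExpired(now + 9_000));
        System.out.println("expired 10s after last use: " + conn.isExpired(now + 10_000));
    }
}
```

In this model a connection used once per second never reaches its deadline, which matches the hour of successful requests over the old connection.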

    Conclusion

    By default, if an HttpClient does not receive a Keep-Alive header in a response from a server, it assumes that its connection to that server can be kept alive indefinitely, and it keeps that connection in its PoolingHttpClientConnectionManager indefinitely.

    If you set a ConnectionKeepAliveStrategy like I did, the client applies the strategy's duration to the connection even when the server's response carries no Keep-Alive header. The connection then leaves the connection pool once the Keep-Alive duration has passed since the last successful response from the server. This means that only idle connections are affected by the Keep-Alive duration, where an "idle connection" is one that has not been used for longer than the Keep-Alive duration.
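For completeness, here is a minimal sketch of wiring such a strategy into a client, assuming HttpClient 4.4 or later on the classpath. The class name `KeepAliveClientSketch` and the `build` method are hypothetical. The call to `evictExpiredConnections()` is an optional addition that was not part of the test above: it asks the client to close expired connections from a background thread, rather than leaving them until the pool next inspects them.

```java
import org.apache.http.conn.ConnectionKeepAliveStrategy;
import org.apache.http.impl.client.CloseableHttpClient;
import org.apache.http.impl.client.HttpClients;
import org.apache.http.impl.conn.PoolingHttpClientConnectionManager;

public class KeepAliveClientSketch {

    /** Builds a pooling client that applies the given keep-alive strategy. */
    public static CloseableHttpClient build(ConnectionKeepAliveStrategy strategy) {
        PoolingHttpClientConnectionManager connectionManager =
                new PoolingHttpClientConnectionManager();
        return HttpClients.custom()
                .setConnectionManager(connectionManager)
                .setKeepAliveStrategy(strategy)
                // Optional (assumption, not in the original test): proactively
                // close connections whose keep-alive deadline has passed.
                .evictExpiredConnections()
                .build();
    }
}
```

Passing a strategy that returns 10000 for every response would reproduce the 10-second setup from the test, under the assumptions stated above.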