Tags: .net, apache, dotnet-httpclient, keep-alive, servicepoint

Why do .NET keep-alive HTTP connections fail on the second request with "A connection that was expected to be kept alive was closed by the server."?


I have a .NET Framework application that makes POST API requests to a remote server running Apache. It intermittently fails with the error:

The underlying connection was closed: A connection that was expected to be kept alive was closed by the server. 

It occurs on the second request sent to the server over a keep-alive TLS connection, so it shows up more often in production systems under heavy load and rarely or never in development environments.

We have tried disabling HTTP keep-alive (HttpWebRequest.KeepAlive = false), which appears to work around the issue.

Is there a way to solve this without disabling HTTP keep-alive?


Solution

  • The Apache setting KeepAliveTimeout defaults to 5s: an idle keep-alive connection is closed after that much inactivity. (https://httpd.apache.org/docs/2.4/mod/core.html#keepalivetimeout)

    This leads to a condition where:

    1. .NET opens a connection to Apache and issues a POST.
    2. Apache returns a 200 OK.
    3. The connection is "idle", waiting for another request.
    4. After 2s, .NET creates a new HttpWebRequest and calls GetRequestStream() on it, ready to write the request. Since there is an idle connection in the pool, that connection is used.
    5. After 5s (KeepAliveTimeout), Apache sends a FIN packet to close the underlying connection.
    6. After (say) 30s, .NET attempts to write to the stream, which uses the now-defunct socket and immediately fails with "The underlying connection was closed: A connection that was expected to be kept alive was closed by the server."

    This is particularly a problem in large POST calls (say, calling a SOAP API) where forming the payload may take a nontrivial amount of time.
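
    The sequence above can be sketched in C# roughly as follows. This is only an illustration of the ordering that triggers the failure, not code from the affected application: the class name ProblematicPost, the url parameter and the buildPayload delegate are hypothetical stand-ins for whatever slow payload construction the real application does.

    ```csharp
    using System;
    using System.IO;
    using System.Net;

    public static class ProblematicPost
    {
        // Illustrative anti-pattern: the request stream is opened BEFORE the
        // (slow) payload is built. All names here are hypothetical.
        public static void Send(string url, Func<byte[]> buildPayload)
        {
            var request = (HttpWebRequest)WebRequest.Create(url);
            request.Method = "POST";

            // Step 4: per the sequence above, an idle pooled connection is
            // picked up for this request at this point.
            using (Stream body = request.GetRequestStream())
            {
                // Step 5 happens while this runs: if building the payload takes
                // longer than KeepAliveTimeout, the server closes the connection.
                byte[] payload = buildPayload();

                // Step 6: the write goes to the connection the server has
                // already closed, and the request fails.
                body.Write(payload, 0, payload.Length);
            }

            using (request.GetResponse()) { }
        }
    }
    ```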

    Possible solutions are:

    1. Do not call HttpWebRequest.GetRequestStream() until immediately before starting to send data.
    2. Disable keep-alive (HttpWebRequest.KeepAlive = false). Note, however, that if any other thread in your application is still using keep-alive, the problem will occur (the two requests above can be in entirely different threads).
    3. The most robust solution appears to be implementing application-level retry.
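
    Options 1 and 3 can be combined. The following is a minimal sketch, not a definitive implementation. It assumes the failure surfaces as a WebException whose Status is WebExceptionStatus.KeepAliveFailure (adjust the check if it surfaces differently in your setup), and that it is safe for your API to receive the same POST twice. The names KeepAliveRetry, Execute, Post, buildPayload and maxAttempts are hypothetical.

    ```csharp
    using System;
    using System.IO;
    using System.Net;

    public static class KeepAliveRetry
    {
        // Runs action, retrying only when it fails because a keep-alive
        // connection was closed. Any other failure is rethrown immediately.
        public static T Execute<T>(Func<T> action, int maxAttempts)
        {
            if (maxAttempts < 1) throw new ArgumentOutOfRangeException("maxAttempts");

            for (int attempt = 1; ; attempt++)
            {
                try
                {
                    return action();
                }
                catch (WebException ex)
                {
                    bool keepAliveClosed = ex.Status == WebExceptionStatus.KeepAliveFailure;
                    if (!keepAliveClosed || attempt >= maxAttempts)
                    {
                        throw;
                    }
                    // Otherwise loop around and try again with a fresh request.
                }
            }
        }

        public static string Post(string url, string contentType,
                                  Func<byte[]> buildPayload, int maxAttempts)
        {
            // Option 1: do the slow work before touching the connection.
            byte[] payload = buildPayload();

            // Option 3: retry the whole request if the connection was dropped.
            return Execute(() =>
            {
                // An HttpWebRequest cannot be resubmitted, so each attempt
                // creates a new one.
                var request = (HttpWebRequest)WebRequest.Create(url);
                request.Method = "POST";
                request.ContentType = contentType;
                request.ContentLength = payload.Length;

                // The stream is opened only when the bytes are ready to send.
                using (Stream body = request.GetRequestStream())
                {
                    body.Write(payload, 0, payload.Length);
                }

                using (var response = (HttpWebResponse)request.GetResponse())
                using (var reader = new StreamReader(response.GetResponseStream()))
                {
                    return reader.ReadToEnd();
                }
            }, maxAttempts);
        }
    }
    ```

    Building the payload first narrows the window between GetRequestStream() and the write, but does not remove it, which is why the retry is kept as well.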

    Importantly, this behaviour (of "locking a stream to a socket") only seems to occur in .NET Framework, not in .NET Core / .NET 5.