go go-http resty

Go HTTP idle connection pool and http trace


I have a long-running Go program (version 1.18) that sends hundreds of simultaneous GET requests per second to the remote server https://api.abcd.com using Resty. Each GET request runs in its own goroutine, and all goroutines share the same Resty client.

The remote server https://api.abcd.com runs nginx/1.19.2 (HTTP/2) and has two IP addresses, 11.11.11.11 and 22.22.22.22. Yes, this remote server has multiple IP addresses.

I use the hostname when configuring the Resty client: SetBaseURL("https://api.abcd.com")

The transport configuration is Resty's default.

TraceInfo() is enabled on the Resty client. The trace info has an "IsConnReused" field, which comes from the GotConnInfo struct in Go's httptrace package:

type GotConnInfo struct {
    Conn net.Conn

    // Reused is whether this connection has been previously used for another HTTP request.
    Reused bool

    // WasIdle is whether this connection was obtained from an idle pool.
    WasIdle bool

    // IdleTime reports how long the connection was previously idle, if WasIdle is true.
    IdleTime time.Duration
}

Question 1: Does Go's httptrace determine "connection reused" based on the hostname (api.abcd.com) or the IP address?

Question 2: The idle connection pool in Go's http package is actually a map whose key is the struct type connectMethodKey. Is the addr field in this struct a hostname or an IP address?

type connectMethodKey struct {
    proxy, scheme, addr string
    onlyH1              bool
}
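For illustration, here is a simplified sketch of how a key of this shape can be derived from a request URL when no proxy is configured. `poolKey` and `keyFor` are hypothetical names; the real type and its construction are unexported in net/http, and this only approximates that logic rather than copying it.

```go
package main

import (
	"fmt"
	"log"
	"net"
	"net/url"
)

// poolKey is a hypothetical stand-in for net/http's unexported connectMethodKey.
type poolKey struct {
	proxy, scheme, addr string
	onlyH1              bool
}

// keyFor builds a pool key from a request URL, assuming no proxy.
// The addr field is "host:port" taken from the URL as written,
// before any DNS resolution takes place.
func keyFor(rawURL string) (poolKey, error) {
	u, err := url.Parse(rawURL)
	if err != nil {
		return poolKey{}, err
	}
	port := u.Port()
	if port == "" {
		// Fall back to the default port for the scheme.
		if u.Scheme == "https" {
			port = "443"
		} else {
			port = "80"
		}
	}
	return poolKey{
		scheme: u.Scheme,
		addr:   net.JoinHostPort(u.Hostname(), port),
	}, nil
}

func main() {
	for _, raw := range []string{
		"https://api.abcd.com/v1/items",
		"http://127.0.0.1/",
	} {
		key, err := keyFor(raw)
		if err != nil {
			log.Fatal(err)
		}
		fmt.Printf("%s -> addr=%s\n", raw, key.addr)
	}
}
```

Because the key is computed from the URL text, two requests to the same hostname share a pool entry no matter which IP address each underlying connection was dialed to.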

This is what I found in TraceInfo(). When the program starts, all requests are sent to 11.11.11.11:443. A few minutes later, all requests are sent to 22.22.22.22 and none to 11.11.11.11. Then, a few minutes after that, all requests go to 11.11.11.11 again and none to 22.22.22.22.

Question 3: When requests start going to 22.22.22.22, the socket connections to 11.11.11.11 are idle. Why does Go's http package stop using those idle connections? I don't think they have timed out yet.


Solution

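  • Question 1: Does Go's httptrace determine "connection reused" based on the hostname (api.abcd.com) or the IP address?

    httptrace.GotConnInfo.Reused reports whether the TCP connection was previously used for another HTTP request. It is a property of the individual connection, and each connection is tied to a single IP address.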

    Question 2: The idle connection pool in Go's http package is actually a map whose key is the struct type connectMethodKey. Is the addr field in this struct a hostname or an IP address?

    addr is the hostname.

    It can be an IP address, though, if you send the request to something like http://127.0.0.1/.

    Question 3: When requests start going to 22.22.22.22, the socket connections to 11.11.11.11 are idle. Why does Go's http package stop using those idle connections? I don't think they have timed out yet.

    It could work differently if you used HTTP/1.1. There, every in-flight request needs its own TCP connection. Subsequent requests may reuse a TCP connection, but to run requests in parallel you need to establish multiple TCP connections. Each new connection can be dialed to a different IP address, so you would likely see the traffic spread across both addresses.

    With HTTP/2, a single TCP connection can carry many parallel requests. That connection uses a single IP address, so all requests go to whichever address it was dialed to.

    This is how Go decides whether a new request can use an open connection:

    https://cs.opensource.google/go/x/net/+/69896b71:http2/transport.go;l=881;drc=69896b714898bee1e3403560cd2e1870bcc8bd35;bpv=1;bpt=1

    Experiment with these parameters to distribute the traffic across multiple TCP connections.