
NGINX: Exceeding the 65535-connection limit


Unlike plain HTTP, a WebSocket keeps a long-lived connection open after it is upgraded from HTTP.

Even if the OS is tuned to use all ports, there are still only 65536 ports in total. Is it possible for NGINX to exceed this limit?

A potential solution is SO_REUSEPORT, but it is poorly documented; the only description I can find is the following paragraph:

NGINX release 1.9.1 introduces a new feature that enables use of the SO_REUSEPORT socket option, which is available in newer versions of many operating systems, including DragonFly BSD and Linux (kernel version 3.9 and later). This socket option allows multiple sockets to listen on the same IP address and port combination. The kernel then load balances incoming connections across the sockets.

So, NGINX calls accept to accept an inbound connection.

The accept() system call is used with connection-based socket types (SOCK_STREAM, SOCK_SEQPACKET). It extracts the first connection request on the queue of pending connections for the listening socket, sockfd, creates a new connected socket, and returns a new file descriptor referring to that socket. The newly created socket is not in the listening state. The original socket sockfd is unaffected by this call.

Will the new socket consume a port? If so, how can NGINX exceed the 65535-connection limit?


Solution

  • The comment you've received is correct:

    TCP connections are defined by the 4-tuple (src_addr, src_port, dst_addr, dst_port). You can have a server connected to more than 65536 clients all on the same port if the clients are using different IP addresses and/or source ports. Example: server IP is 0.0.0.1 listening on port 80. All the 4-tuples could then be (*, *, 0.0.0.1, 80). So long as no 4-tuples are the same, the server can have as many connections on port 80 as its memory will allow. – Cornstalks Dec 4 '15 at 2:36
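The point of the comment can be checked with a small experiment. The following is a minimal Python sketch, not anything taken from nginx: the helper name open_connections and the use of the loopback address are assumptions made for illustration. It opens several connections to one listening socket and shows that every accepted socket reports the same local port, while the clients differ only in their source ports.

```python
import socket

def open_connections(n):
    """Open n loopback connections to a single listening socket.

    Returns (listening_port, accepted_sockets, client_sockets).
    Hypothetical helper, for illustration only.
    """
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))  # port 0: let the kernel pick a free port
    listener.listen(n)
    port = listener.getsockname()[1]

    accepted, clients = [], []
    for _ in range(n):
        client = socket.create_connection(("127.0.0.1", port))
        conn, _peer = listener.accept()  # new socket, same local port
        clients.append(client)
        accepted.append(conn)
    listener.close()
    return port, accepted, clients

port, accepted, clients = open_connections(5)
# Local port as seen by the server for each accepted connection.
local_ports = {conn.getsockname()[1] for conn in accepted}
# Source ports chosen by the kernel on the client side.
remote_ports = {conn.getpeername()[1] for conn in accepted}
print("listening port:", port)
print("server-side local ports:", local_ports)
print("distinct client source ports:", len(remote_ports))
for s in accepted + clients:
    s.close()
```

On the server side only one port is ever in use; what distinguishes the connections is the client half of the 4-tuple.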

    However, when evaluating whether or not you'll go over the limits, you also have to consider that nginx is not just a server (having ngx_connection.c#ngx_open_listening_sockets() call socket(2), bind(2) and listen(2) system calls to take over ports like 80, and subsequently calling accept(2) in an infinite loop), but it is also potentially a client of an upstream server (calling socket(2) and connect(2) to connect to upstreams on ports like 8080 as needed).

    Note that running out of TCP ports is not possible in its server role, because the server uses a single local port (e.g., port 80) for all of its connections. Running out of TCP ports on the client side, however, is a real possibility, depending on configuration. You also have to consider that after the client does a close(2) on the connection, the connection stays in the TIME_WAIT state for some 60s or so (to ensure that the system knows what to do with any late-arriving packets), and its 4-tuple, including the local port, stays reserved during that time.
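To get a feel for the client-side limit, here is a rough back-of-the-envelope calculation in Python. It is only a sketch under stated assumptions: a single local address, a single upstream address and port, nginx being the side that closes each connection, no keepalive reuse, the 60s TIME_WAIT mentioned above, and an ephemeral port range of 32768 to 60999 (a common Linux default, assumed here; check your own system). The function name is made up for the example.

```python
def max_new_connections_per_second(port_low, port_high, time_wait_seconds):
    """Upper bound on the sustained rate of new outgoing connections.

    Each closed connection keeps its local port reserved for
    time_wait_seconds, so the ports can be recycled at most this fast.
    Hypothetical helper, for illustration only.
    """
    usable_ports = port_high - port_low + 1
    return usable_ports / time_wait_seconds

# Assumed values: ephemeral range 32768-60999, TIME_WAIT of 60 seconds.
usable = 60999 - 32768 + 1
rate = max_new_connections_per_second(32768, 60999, 60)
print(f"usable ports: {usable}")
print(f"about {rate:.0f} new upstream connections per second")
```

The exact numbers depend on the operating system and its tuning; the takeaway is only that the bound is a few hundred connections per second per upstream endpoint, far lower than what the server side can accept.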

    However, with that said, note that the SO_REUSEPORT socket option (set with setsockopt(2)), at least in the sharding context presented in the referenced release notes and reuseport announcement of nginx 1.9.1, is entirely unrelated to the 65535 dilemma. It is merely a building block for scalable multiprocessor support between the kernel and the applications running on it:

    I ran a wrk benchmark with 4 NGINX workers on a 36-core AWS instance. To eliminate network effects, I ran both client and NGINX on localhost, and also had NGINX return the string OK instead of a file. I compared three NGINX configurations: the default (equivalent to accept_mutex on), with accept_mutex off, and with reuseport. As shown in the figure, reuseport increases requests per second by 2 to 3 times, and reduces both latency and the standard deviation for latency.

    [Figure: Benchmarking reuseport in nginx 1.9.1]
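To make concrete what the option does, and why it has no bearing on port limits, here is a minimal Python sketch. It assumes a platform where socket.SO_REUSEPORT is available (for example Linux 3.9 or later, or a BSD); the helper name open_reuseport_listeners is invented for the example and this is not how nginx itself is written.

```python
import socket

def open_reuseport_listeners(count, host="127.0.0.1"):
    """Create several listening sockets bound to the same address and port.

    Every socket must set SO_REUSEPORT before bind(); otherwise the second
    bind() would fail with "address already in use".
    Hypothetical helper, for illustration only.
    """
    listeners = []
    port = 0  # first socket: let the kernel pick a free port
    for _ in range(count):
        s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
        s.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEPORT, 1)
        s.bind((host, port))
        s.listen(16)
        port = s.getsockname()[1]  # later sockets reuse this exact port
        listeners.append(s)
    return listeners

listeners = open_reuseport_listeners(2)
addresses = [s.getsockname() for s in listeners]
print(addresses)  # both entries show the same (host, port) pair
for s in listeners:
    s.close()
```

Each listener would typically belong to a different worker process, and the kernel spreads incoming connections across them. The number of ports in use is still one.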

    As to your underlying question, the solution to the uint16_t limit on outgoing TCP ports would probably be to not connect to the backends over TCP when this is a concern, and/or to use extra local addresses through proxy_bind and related directives (and/or to limit the number of TCP connections that can be established with the backends).
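The mechanism behind using extra local addresses is simply binding the outgoing socket to a chosen source address before connecting. The following Python sketch shows that step in isolation. It is an illustration, not nginx code: the helper connect_from is hypothetical, and 127.0.0.1 stands in for whatever additional local addresses a real host would have configured.

```python
import socket

def connect_from(source_ip, server_addr):
    """Connect to server_addr using source_ip as the local address.

    Source port 0 lets the kernel choose an ephemeral port on source_ip.
    Hypothetical helper, for illustration only.
    """
    return socket.create_connection(server_addr, source_address=(source_ip, 0))

# Stand-in for an upstream server, listening on a kernel-chosen port.
listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
listener.bind(("127.0.0.1", 0))
listener.listen(8)
server_addr = listener.getsockname()

a = connect_from("127.0.0.1", server_addr)
b = connect_from("127.0.0.1", server_addr)
local_endpoints = [a.getsockname(), b.getsockname()]
print(local_endpoints)  # same source address, different source ports

for s in (a, b, listener):
    s.close()
```

Every distinct source address brings its own pool of source ports for the same upstream endpoint, which is why adding local addresses raises the ceiling on outgoing connections.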