
robust continuous TCP connection (python socket)


My goal is to establish a continuous and robust TCP connection between one server and exactly one client. If one side fails, the other one should wait until it recovers.

I wrote the following code based on this question (that only asks for continuous, but not robust TCP connections and does not handle keepalive issues), this post and my own experience.

I have two questions:

  1. How can I make the keepalive work? If the server dies, the client only recognizes it after trying to send() - which worked also without the KEEPALIVE option as this results in a connection reset. Is there some way that the socket sends an interrupt for a connection that is dead or some keepalive function that I can check on a regular basis?

  2. Is this a robust way of handling a continuous TCP connection? Having a stable, continuous TCP connection seems to be a standard problem; however, I couldn't find tutorials covering this in detail. There must be some best practice.

Note, I could handle keep alive messages on my own at the application level. However, as TCP already implements this at transport level, it is better to rely on this service provided by the lower level.

The server:

from socket import *
serverPort = 12000

while True:
    # 1. Configure server socket
    serverSocket = socket(AF_INET, SOCK_STREAM)
    serverSocket.setsockopt(SOL_SOCKET, SO_REUSEADDR, 1)
    serverSocket.bind(('127.0.0.1', serverPort))
    serverSocket.listen(1)
    print("waiting for client connecting...")
    connectionSocket, addr = serverSocket.accept()
    connectionSocket.setsockopt(SOL_SOCKET, SO_KEEPALIVE,1)
    print(connectionSocket.getsockopt(SOL_SOCKET,SO_KEEPALIVE))
    print("...connected.")
    serverSocket.close() # Destroy the server socket; we don't need it anymore since we are not accepting any connections beyond this point.

    # 2. communication routine
    while True:
        try:
            sentence = connectionSocket.recv(512).decode()
        except ConnectionResetError as e:
            print("Client connection closed")
            break
        if(len(sentence)==0): # close if client closed connection
            break 
        else:
            print("recv: "+str(sentence))

    # 3. proper closure
    connectionSocket.shutdown(SHUT_RDWR)
    connectionSocket.close()
    print("connection closed.")

The client:

from socket import *
import time

while True:
    # 1. configure socket dest.
    serverName = '127.0.0.1'
    serverPort = 12000
    clientSocket = socket(AF_INET, SOCK_STREAM)
    try:
        clientSocket.setsockopt(SOL_SOCKET, SO_KEEPALIVE,1)
        clientSocket.connect((serverName, serverPort))
        print(clientSocket.getsockopt(SOL_SOCKET,SO_KEEPALIVE))
    except ConnectionRefusedError as e:
        print("Server refused connection. retrying")
        time.sleep(1)
        continue

    # 2. communication routine
    while(1):
        sentence = input('input sentence: ')
        if(sentence == "close"):
            break
        try:
            clientSocket.send(sentence.encode())
        except ConnectionResetError as e:
            print("Server connection closed")
            break

    # 3. proper closure
    clientSocket.shutdown(SHUT_RDWR)
    clientSocket.close()

I tried to keep this example as minimal as possible, but given the robustness requirement it is relatively long.

I also tried some socket options such as TCP_KEEPIDLE, TCP_KEEPINTVL and TCP_KEEPCNT.

Thank you!


Solution

  • I will try to answer both questions.

    1. ... Is there some way that the socket sends an interrupt for a connection that is dead ...

      I know of none. TCP keepalive (SO_KEEPALIVE) only tries to maintain the connection. It is very useful if any equipment along the network path has an idle timeout, because it keeps that timeout from aborting the connection. But if the connection drops for any reason other than that timeout, keepalive cannot do anything about it. The rationale is that there is no need to restore a dropped inactive connection before something has to be exchanged.
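
For reference, the keepalive options named in the question can be set as in the sketch below. It is only a minimal sketch: the helper name `enable_keepalive` and the timer values are illustrative assumptions, and the `TCP_KEEP*` constants are platform-dependent (present on Linux, possibly missing elsewhere), so each one is looked up defensively. Even with these set, nothing is raised asynchronously. On Linux, once the probes go unanswered, a pending or later `recv()`/`send()` typically fails with an error such as a timeout, so the application still has to call into the socket to notice.

```python
import socket

def enable_keepalive(sock, idle=10, interval=5, count=3):
    """Turn on SO_KEEPALIVE and tune the timers where the platform allows it.

    idle, interval and count are illustrative values (seconds, seconds, probes).
    """
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1)
    # These constants are not available on every platform, so skip missing ones.
    for name, value in (("TCP_KEEPIDLE", idle),
                        ("TCP_KEEPINTVL", interval),
                        ("TCP_KEEPCNT", count)):
        option = getattr(socket, name, None)
        if option is not None:
            sock.setsockopt(socket.IPPROTO_TCP, option, value)
```

It would be called once per connection, for example `enable_keepalive(clientSocket)` before `connect()` on the client and `enable_keepalive(connectionSocket)` right after `accept()` on the server.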

    2. Is this a robust way of handling a continuous TCP connection?

      Not really.

      The robust way is to be prepared that the connection fails for any reason at any moment. So you should be prepared to face an error when sending a message (your code is) and if that happens try to re-open the connection and send the message again (your current code does not). Something like:

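      import time
      from socket import socket, AF_INET, SOCK_STREAM

      def connect(host, port, retry_delay=1.0):
          # establish and return a connection, retrying until the server is reachable
          while True:
              clientSocket = socket(AF_INET, SOCK_STREAM)
              try:
                  clientSocket.connect((host, port))
                  return clientSocket
              except OSError:
                  clientSocket.close()
                  time.sleep(retry_delay)

      # serverName and serverPort are the values from the question's client
      clientSocket = connect(serverName, serverPort)
      while True:
          message = input('input sentence: ').encode()
          while True:
              try:
                  clientSocket.sendall(message)
                  break
              except OSError:
                  # connection lost: re-open it, then retry the same message
                  clientSocket.close()
                  clientSocket = connect(serverName, serverPort)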
      

    Unrelated: your graceful shutdown is incorrect. The initiator (the side calling shutdown) should not close the socket immediately, but start a read loop and close only when everything has been received and processed.
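
The shutdown sequence described above could look roughly like the sketch below. It is one possible reading, not the only correct one: the helper name `graceful_close`, the buffer size and the timeout are illustrative assumptions, and it uses `SHUT_WR` instead of the question's `SHUT_RDWR`, since the read side has to stay open for the read loop to receive anything.

```python
import socket

def graceful_close(sock, bufsize=4096, timeout=5.0):
    """Half-close sock, drain what the peer still sends, then close it.

    Returns the bytes received during the drain so the caller can process them.
    """
    received = bytearray()
    sock.settimeout(timeout)  # do not wait forever for a silent peer
    try:
        sock.shutdown(socket.SHUT_WR)  # tell the peer we are done sending
    except OSError:
        pass  # the connection was already gone
    try:
        while True:
            chunk = sock.recv(bufsize)
            if not chunk:  # empty read: the peer closed its side as well
                break
            received.extend(chunk)
    except OSError:
        pass  # timeout or reset: stop draining
    finally:
        sock.close()
    return bytes(received)
```

In the question's code this would replace the `shutdown(SHUT_RDWR)` / `close()` pair, for example `leftover = graceful_close(clientSocket)`.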