python multithreading sockets

Killable socket in python


My goal is to emit an interface to listen on a socket forever ... until someone up the decision chain decides it's enough.

This is my implementation; it does not work. Mixing threads, sockets, object lifetime, default params and a language I do not speak too well is confusing.

I tested the different aspects of this code individually, and everything behaved as expected except the line marked with the comment BUG. There I attempt to force the main thread to block until the server hears the child screaming or a timeout passes; instead, recv() simply doesn't see the change in alive.

#!/usr/bin/env python3


import socket
import threading
import time


MAX_MSG_BYTES=1024
TEST_PORT=42668


def recv( s: socket.socket, alive: bool=True ) -> bytes:
    '''
    Accepts packets on a socket until terminated.
    '''
    s.settimeout(1)  # 1 second
    while alive:
        print("'alive' is still", alive)
        try:
            data = s.recv(MAX_MSG_BYTES)
            assert data  # Empty packets were a problem.
            yield data
        except TimeoutError:
            pass  # expected error, any other is propagated up


def test_nonblocking_recv() -> None:
    # Create 3 sockets - server administrative, server content and client content.
    # Bind the latter and forget about the former.
    server_s = socket.create_server(('', TEST_PORT))
    server_s.listen()
    client_s = socket.create_connection(('localhost', TEST_PORT))
    content_s = next(iter(server_s.accept()))  # Accept 1 connection.

    # client_s.sendall('If this is commented out, the server hangs.'.encode('utf8'))

    alive = True
    def read_one_message():
        data = recv(content_s, alive)
        print(next(iter(data)))  # BUG this causes outside alive to not be seen

    content_th = threading.Thread(target=read_one_message)
    content_th.start()
    time.sleep(3)
    alive = False
    print("But main thread 'alive' is", alive)

    content_th.join()
    assert threading.active_count() == 1


if __name__ == '__main__':
    test_nonblocking_recv()

Solution

  • I'm scared of globals. What I am attempting to do is pass a reference to "something somewhere that can be evaluated to bool".

    Global variables can be problematic - but sometimes they are the correct thing to use.

    bool values are immutable in Python. When you pass alive as an argument, the function gets its own local name bound to the same True object, and nothing the main thread does to its own name can change that: assigning alive = False there rebinds the main thread's name to False, while the name in the other thread keeps pointing to True. (We don't usually say "pointing to" in Python; I am using the term because I think it will be familiar to you.)
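
    A quick way to convince yourself of this, without any threads or sockets, is the sketch below. The generators watch and watch_list and the two demo functions are made-up names of mine, not part of your code. The first generator receives a plain bool, the second a one-element list.

```python
from typing import Iterator


def watch(alive: bool) -> Iterator[bool]:
    # 'alive' is a local name, bound once when the generator is created.
    while True:
        yield alive


def watch_list(alive: list[bool]) -> Iterator[bool]:
    # The local name refers to the caller's list object itself.
    while True:
        yield alive[0]


def demo_rebinding() -> tuple[bool, bool]:
    alive = True
    watcher = watch(alive)
    before = next(watcher)
    alive = False  # rebinds only the caller's name
    after = next(watcher)  # the generator still holds True
    return before, after


def demo_mutation() -> tuple[bool, bool]:
    alive = [True]
    watcher = watch_list(alive)
    before = next(watcher)
    alive[0] = False  # mutates the object both sides share
    after = next(watcher)
    return before, after


if __name__ == "__main__":
    print(demo_rebinding())  # (True, True)
    print(demo_mutation())   # (True, False)
```

    Rebinding a name is invisible to whoever received the old object; mutating a shared object is visible to everyone holding a reference to it.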

    Just change alive to be a global variable there and it will work. If you want to constrain the variable's scope, you could group all your functions in a class and make alive an instance attribute. That way, other instances of the same class could, for example, listen on other ports.
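
    A rough sketch of the class idea, under a few assumptions of mine: the class name Listener, its stop() method and the poll_seconds parameter are hypothetical, and socket.socketpair() stands in for your server/client setup just to keep the example short. It also treats an empty read as "peer closed" instead of asserting.

```python
import socket
import threading
import time
from typing import Iterator

MAX_MSG_BYTES = 1024


class Listener:
    """Hypothetical wrapper: one socket plus its own 'alive' flag."""

    def __init__(self, sock: socket.socket, poll_seconds: float = 0.1) -> None:
        self.sock = sock
        self.sock.settimeout(poll_seconds)
        self.alive = True  # instance attribute instead of a global

    def stop(self) -> None:
        self.alive = False  # recv() notices on its next loop iteration

    def recv(self) -> Iterator[bytes]:
        while self.alive:
            try:
                data = self.sock.recv(MAX_MSG_BYTES)
            except socket.timeout:  # an alias of TimeoutError since Python 3.10
                continue
            if not data:  # empty read: the peer closed the connection
                return
            yield data


def run_listener_demo() -> bytes:
    server_side, client_side = socket.socketpair()
    listener = Listener(server_side)
    chunks: list[bytes] = []

    def consume() -> None:
        for chunk in listener.recv():
            chunks.append(chunk)

    worker = threading.Thread(target=consume)
    worker.start()
    client_side.sendall(b"hello")

    # Give the worker up to 2 seconds to pick the message up.
    deadline = time.monotonic() + 2.0
    while not chunks and time.monotonic() < deadline:
        time.sleep(0.01)

    listener.stop()
    worker.join()
    server_side.close()
    client_side.close()
    return b"".join(chunks)


if __name__ == "__main__":
    print(run_listener_demo())
```

    Each Listener carries its own flag, so stopping one instance leaves any others running.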

    Anyway, it won't help saying you are "scared" of the correct, simplest thing to do there.

    In Python, only functions that assign to module-level (i.e. global) variables have to declare them with global; a name that is only read inside a function is looked up as a global automatically:

    #!/usr/bin/env python3
    
    
    import socket
    import threading
    import time
    
    
    MAX_MSG_BYTES=1024
    TEST_PORT=42668
    
    alive: bool   # declaration not needed, but helps with readability
    
    
    def recv( s: socket.socket) -> bytes:
        '''
        Accepts packets on a socket until terminated.
        '''
        s.settimeout(1)  # 1 second
        while alive:
            print("'alive' is still", alive)
            try:
                data = s.recv(MAX_MSG_BYTES)
                assert data  # Empty packets were a problem.
                yield data
            except TimeoutError:
                pass  # expected error, any other is propagated up
    
    
    def test_nonblocking_recv() -> None:
        global alive   # whenever a value is assigned to "alive" here, it goes into
                       # the top-level variable.
        # Create 3 sockets - server administrative, server content and client content.
        # Bind the latter and forget about the former.
        server_s = socket.create_server(('', TEST_PORT))
        server_s.listen()
        client_s = socket.create_connection(('localhost', TEST_PORT))
        content_s = next(iter(server_s.accept()))  # Accept 1 connection.
    
        # client_s.sendall('If this is commented out, the server hangs.'.encode('utf8'))
    
        alive = True
        def read_one_message():
            data = recv(content_s)
            print(next(data, None))  # None if recv() stops before any data arrives
    
        content_th = threading.Thread(target=read_one_message)
        content_th.start()
        time.sleep(3)
        alive = False
        print("But main thread 'alive' is", alive)
    
        content_th.join()
        assert threading.active_count() == 1
    
    
    if __name__ == '__main__':
        test_nonblocking_recv()
    
    

    But yes, you can use a mutable object instead of an immutable value or a global variable.

    Without declaring a new class, a trick to do that is to use a container object, like a list or dict: both your controller function and the worker will hold a reference to the same object. You could have a 1-element list, for example [True], and changing that element to False would be visible in the worker:

    ...
    
    
    # A mutable object must never be used as a default value in a function
    # declaration, so no default is set here.
    def recv(s: socket.socket, alive: list[bool]) -> bytes:
        '''
        Accepts packets on a socket until terminated.
        '''
        s.settimeout(1)  # 1 second
        while alive[0]:
            print("'alive' is still", alive)
            ...
            
    def test_nonblocking_recv() -> None:
        ...
    
        alive = [True]   # a new list, with a single element
        def read_one_message():
            data = recv(content_s, alive)  # we pass the list itself as argument
            print(next(data, None))  # None if recv() stops before any data arrives
        
        ...
        alive[0] = False  # we change the first element of the list. Doing `alive = [False]` would simply
                          # rebind the name here, while the worker would keep its reference to the initial list.
                          
        print("But main thread 'alive' is", alive)
        ...
    
    

    And if you don't want to use a container, you can create a small class whose bool evaluation you control. The truthiness of any object in Python is determined by the special method __bool__. If that is not present, Python checks whether the object has a length (__len__), in which case it is falsy if len(obj) == 0 and truthy otherwise. An object that defines neither is truthy. Numbers equal to 0 and the special value None are falsy.
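
    To see those rules in action, here is a small sketch. The classes Plain, Sized and Both and the function truthiness_table are illustrative names of mine, chosen only to exercise each case.

```python
class Plain:
    """Defines neither __bool__ nor __len__: instances are always truthy."""


class Sized:
    """Only __len__ is defined: truthiness follows the length."""

    def __init__(self, n: int) -> None:
        self.n = n

    def __len__(self) -> int:
        return self.n


class Both:
    """__bool__ takes precedence over __len__ when both exist."""

    def __len__(self) -> int:
        return 5

    def __bool__(self) -> bool:
        return False


def truthiness_table() -> dict[str, bool]:
    return {
        "Plain()": bool(Plain()),
        "Sized(0)": bool(Sized(0)),
        "Sized(3)": bool(Sized(3)),
        "Both()": bool(Both()),
        "None": bool(None),
        "0": bool(0),
    }


if __name__ == "__main__":
    for name, value in truthiness_table().items():
        print(f"{name:10} -> {value}")
```

    Only Plain() and Sized(3) come out truthy; everything else in the table is falsy.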

    TL;DR: create a small class with internal state that can be changed to modify its truthiness:

    ...
    
    class Switch:
        def __init__(self, initial_state=True):
            self.state = initial_state
            
        def turn_off(self):
            self.state = False
            
        def __bool__(self):
            return self.state
            
    
    def recv( s: socket.socket, alive: Switch) -> bytes:
        ...
        s.settimeout(1)  # 1 second
        while alive:
            print("'alive' is still", bool(alive))
            ...
            
    
    def test_nonblocking_recv() -> None:
        ...
    
        alive = Switch()
        def read_one_message():
            data = recv(content_s, alive) 
            print(next(data, None))  # None if recv() stops before any data arrives
        
        ...
        alive.turn_off()
                          
        print("But main thread 'alive' is", bool(alive))
        ...
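
    Not part of the approaches above, but closely related: the standard library's threading.Event is a ready-made shared flag that plays much the same role as the Switch class. A minimal sketch, assuming a stop event that is set to end the loop (the inverse of alive); the function run_event_demo and the use of socket.socketpair() are my own shortcuts for illustration.

```python
import socket
import threading
from typing import Iterator

MAX_MSG_BYTES = 1024


def recv(s: socket.socket, stop: threading.Event) -> Iterator[bytes]:
    """Yield data from the socket until 'stop' is set or the peer closes."""
    s.settimeout(0.1)
    while not stop.is_set():
        try:
            data = s.recv(MAX_MSG_BYTES)
        except socket.timeout:  # an alias of TimeoutError since Python 3.10
            continue
        if not data:  # empty read: the peer closed the connection
            return
        yield data


def run_event_demo() -> bool:
    server_side, client_side = socket.socketpair()
    stop = threading.Event()

    def drain() -> None:
        for _ in recv(server_side, stop):
            pass

    worker = threading.Thread(target=drain)
    worker.start()
    stop.set()  # ask the worker to finish
    worker.join(timeout=2)
    finished = not worker.is_alive()
    server_side.close()
    client_side.close()
    return finished


if __name__ == "__main__":
    print("worker finished:", run_event_demo())
```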
    

    Also, you could group the test_nonblocking_recv and recv functions in a class and use self.alive, as I stated earlier. Or simply nest recv inside test_nonblocking_recv, alongside read_one_message: the two nested functions would then see alive as a nonlocal variable, and everything would simply work (read_one_message already uses alive as a nonlocal variable in your code).
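
    A sketch of the nested-function variant, with some liberties of mine: the outer function is called run_nested_demo, it uses socket.socketpair() instead of your server/client setup, and it returns what happened rather than printing. Nothing is ever sent, so the worker stops only because it sees alive change.

```python
import socket
import threading
import time
from typing import Iterator

MAX_MSG_BYTES = 1024


def run_nested_demo(run_seconds: float = 0.3) -> tuple[bool, list[bytes]]:
    server_side, client_side = socket.socketpair()
    server_side.settimeout(0.1)
    received: list[bytes] = []
    alive = True

    def recv() -> Iterator[bytes]:
        # 'alive' is looked up in the enclosing function on every iteration.
        while alive:
            try:
                data = server_side.recv(MAX_MSG_BYTES)
            except socket.timeout:  # an alias of TimeoutError since Python 3.10
                continue
            if not data:  # empty read: the peer closed the connection
                return
            yield data

    def read_messages() -> None:
        for chunk in recv():
            received.append(chunk)

    worker = threading.Thread(target=read_messages)
    worker.start()
    time.sleep(run_seconds)  # nothing is sent, so the worker just polls
    alive = False  # plain assignment: this function owns the variable
    worker.join(timeout=2)
    stopped = not worker.is_alive()
    server_side.close()
    client_side.close()
    return stopped, received


if __name__ == "__main__":
    print(run_nested_demo())
```

    No nonlocal declaration is needed here, because the nested functions only read alive; the assignment happens in the function that owns the variable.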