python multithreading gunicorn

Why does gunicorn use the same thread?


A simple Python file named myapp.py:


import threading
import os

def app(environ, start_response):
    tid = threading.get_ident()
    pid = os.getpid()
    ppid = os.getppid()
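
    # Log the thread and process identifiers on every request.
    print('tid ================ ', tid)  # why is the tid the same in every worker?
    print('pid', pid)    # worker process ID
    print('ppid', ppid)  # parent (gunicorn master) process ID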

    data = b"Hello, World!\n"
    start_response("200 OK", [
        ("Content-Type", "text/plain"),
        ("Content-Length", str(len(data)))
    ])
    return iter([data])

I start it with gunicorn: gunicorn -w 4 myapp:app

[2022-03-28 21:59:57 +0800] [55107] [INFO] Starting gunicorn 20.1.0
[2022-03-28 21:59:57 +0800] [55107] [INFO] Listening at: http://127.0.0.1:8000 (55107)
[2022-03-28 21:59:57 +0800] [55107] [INFO] Using worker: sync
[2022-03-28 21:59:57 +0800] [55110] [INFO] Booting worker with pid: 55110
[2022-03-28 21:59:57 +0800] [55111] [INFO] Booting worker with pid: 55111
[2022-03-28 21:59:57 +0800] [55112] [INFO] Booting worker with pid: 55112
[2022-03-28 21:59:57 +0800] [55113] [INFO] Booting worker with pid: 55113

Then I curl http://127.0.0.1:8000/ (or use a browser). The logs are below:

tid ================  4455738816
pid 55112
ppid 55107
tid ================  4455738816
pid 55111
ppid 55107
tid ================  4455738816
pid 55113
ppid 55107

The question is: why is the tid the same in every worker while the pid is different?

P.S. The code is from the https://gunicorn.org/ homepage.


Solution

  • With -w 4, Gunicorn starts four worker processes (not threads), which sidesteps the Python GIL. Each process has a unique PID.


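    Regarding the threads, threading.get_ident() returns a Python-specific thread identifier. Its value has no direct meaning and is relevant only within the local process, so it can be identical across processes. Here every worker is forked from the same master, so each worker's main thread typically ends up with the same value.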

    Instead, you should use threading.get_native_id() (available since Python 3.8), which returns the thread ID assigned by the kernel and is unique system-wide.

    Keep in mind that a native thread ID may be recycled and reused once its thread exits.
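
The difference between the two identifiers can be reproduced without gunicorn. The following is a minimal sketch, assuming a Unix system (it relies on os.fork) and Python 3.8 or later. The helper names describe_thread and collect_from_forked_children are hypothetical and exist only for this illustration. The forking loop only imitates a pre-fork server and is not gunicorn's actual code.

```python
import json
import os
import threading


def describe_thread():
    """Return the identifiers of the calling thread and its process."""
    return {
        "pid": os.getpid(),
        "ident": threading.get_ident(),          # Python-level identifier
        "native_id": threading.get_native_id(),  # kernel-assigned thread ID
    }


def collect_from_forked_children(count):
    """Fork `count` children and return what each one reports (Unix only)."""
    pending = []
    for _ in range(count):
        read_fd, write_fd = os.pipe()
        child_pid = os.fork()
        if child_pid == 0:
            # Child process: send our identifiers to the parent, then exit
            # immediately so we never run the parent's remaining code.
            try:
                os.close(read_fd)
                os.write(write_fd, json.dumps(describe_thread()).encode())
                os.close(write_fd)
            finally:
                os._exit(0)
        # Parent process: keep only the read end of the pipe.
        os.close(write_fd)
        pending.append((child_pid, read_fd))

    reports = []
    for child_pid, read_fd in pending:
        with os.fdopen(read_fd, "rb") as pipe:
            reports.append(json.loads(pipe.read()))
        os.waitpid(child_pid, 0)  # reap the child only after reading its report
    return reports


if __name__ == "__main__":
    for report in collect_from_forked_children(3):
        print(report)
```

On a typical Linux or macOS run, the three reports should show the same ident but different pid and native_id values, which matches the gunicorn logs in the question. Replacing threading.get_ident() with threading.get_native_id() in myapp.py should likewise give each worker a distinct tid.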