python, python-multiprocessing, gunicorn, falconframework

Gunicorn and shared dictionary in a REST API: "Ran out of input" error under high load


I am using a Manager.dict() to synchronize some data between multiple workers of an API served with Gunicorn (with Meinheld workers). While this works fine for a few concurrent queries, it breaks when I fire about 100 queries simultaneously at the API, and the following stack trace is displayed:

2020-07-16 12:35:38,972-app.api.my_resource-ERROR-140298393573184-on_post-175-Ran out of input
Traceback (most recent call last):
  File "/app/api/my_resource.py", line 163, in on_post
    results = self.do_something(a, b, c, **d)
  File "/app/user_data/data_lookup.py", line 39, in lookup_something
    return (a in self._shared_dict
  File "<string>", line 2, in __contains__
  File "/usr/local/lib/python3.6/multiprocessing/managers.py", line 757, in _callmethod
    kind, result = conn.recv()
  File "/usr/local/lib/python3.6/multiprocessing/connection.py", line 251, in recv
    return _ForkingPickler.loads(buf.getbuffer())
EOFError: Ran out of input
2020-07-16 12:35:38,972-app.api.my_resource-ERROR-140298393573184-on_post-175-unpickling stack underflow
Traceback (most recent call last):
  File "/app/api/my_resource.py", line 163, in on_post
    results = self.do_something(a, b, c, **d)
  File "/app/user_data/data_lookup.py", line 39, in lookup_something
    return (a in self._shared_dict
  File "<string>", line 2, in __contains__
  File "/usr/local/lib/python3.6/multiprocessing/managers.py", line 757, in _callmethod
    kind, result = conn.recv()
  File "/usr/local/lib/python3.6/multiprocessing/connection.py", line 251, in recv
    return _ForkingPickler.loads(buf.getbuffer())
_pickle.UnpicklingError: unpickling stack underflow

My API framework is Falcon. I have a dictionary containing user data that can be updated via POST requests. The architecture should stay simple, so I chose Manager.dict() from the multiprocessing package to store the data. During other queries, some input is checked against the contents of this dictionary (if a in self._shared_dict: ...). This is where the above-mentioned errors occur.
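For context, here is a minimal sketch of this kind of setup (the resource, route, and payload names are illustrative, not my actual code), assuming Falcon 2.x and Gunicorn started with --preload so that all workers inherit a proxy to the same manager process:

    import multiprocessing

    import falcon

    # Created at import time: with Gunicorn's --preload flag this runs once
    # in the master, and each forked worker inherits a proxy to the same
    # manager process instead of spawning its own.
    _manager = multiprocessing.Manager()
    _shared_dict = _manager.dict()


    class UserDataResource:
        def __init__(self, shared_dict):
            self._shared_dict = shared_dict

        def on_post(self, req, resp):
            # Store user data sent as JSON, e.g. {"key": ..., "value": ...}.
            payload = req.media
            self._shared_dict[payload["key"]] = payload["value"]
            resp.media = {"stored": True}

        def on_get(self, req, resp):
            key = req.get_param("key", required=True)
            # The membership test is forwarded over the manager's IPC
            # connection -- this is the call that raises EOFError under load.
            resp.media = {"known": key in self._shared_dict}


    app = falcon.API()  # falcon.App() in Falcon 3+
    app.add_route("/data", UserDataResource(_shared_dict))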

Why is this problem happening? It seems to be tied to the Manager.dict(). Additionally, when I debug in PyCharm, the debugger often fails to evaluate any variables and just hangs indefinitely somewhere in the multiprocessing code, waiting for data.


Solution

  • It seems to have something to do with the Meinheld workers. When I configure Gunicorn to use the default sync worker class, the error no longer occurs, so Python multiprocessing and the Meinheld package do not seem to work well together in my setting. A plausible explanation: the manager proxy keeps one IPC connection per thread, and Meinheld serves concurrent requests as greenlets within a single thread, so simultaneous requests can interleave their reads and writes on that shared connection and corrupt the pickled messages.
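For reference, a minimal sketch of the worker-class switch in a standard Gunicorn configuration file (the file name, bind address, and module path are illustrative):

    # gunicorn.conf.py
    bind = "0.0.0.0:8000"
    workers = 4

    # Previous setting, which triggered the errors under load:
    # worker_class = "egg:meinheld#gunicorn_worker"

    # Default sync worker class -- the errors no longer occur:
    worker_class = "sync"

Started with, e.g., gunicorn -c gunicorn.conf.py app.main:app.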