I am using a manager.dict to synchronize some data between multiple workers of an API served with Gunicorn (with Meinheld workers). While this works fine for a few concurrent queries, it breaks when I fire about 100 queries simultaneously at the API, and I get the following stack traces:
2020-07-16 12:35:38,972-app.api.my_resource-ERROR-140298393573184-on_post-175-Ran out of input
Traceback (most recent call last):
  File "/app/api/my_resource.py", line 163, in on_post
    results = self.do_something(a, b, c, **d)
  File "/app/user_data/data_lookup.py", line 39, in lookup_something
    return (a in self._shared_dict
  File "<string>", line 2, in __contains__
  File "/usr/local/lib/python3.6/multiprocessing/managers.py", line 757, in _callmethod
    kind, result = conn.recv()
  File "/usr/local/lib/python3.6/multiprocessing/connection.py", line 251, in recv
    return _ForkingPickler.loads(buf.getbuffer())
EOFError: Ran out of input
2020-07-16 12:35:38,972-app.api.my_resource-ERROR-140298393573184-on_post-175-unpickling stack underflow
Traceback (most recent call last):
  File "/app/api/my_resource.py", line 163, in on_post
    results = self.do_something(a, b, c, **d)
  File "/app/user_data/data_lookup.py", line 39, in lookup_something
    return (a in self._shared_dict
  File "<string>", line 2, in __contains__
  File "/usr/local/lib/python3.6/multiprocessing/managers.py", line 757, in _callmethod
    kind, result = conn.recv()
  File "/usr/local/lib/python3.6/multiprocessing/connection.py", line 251, in recv
    return _ForkingPickler.loads(buf.getbuffer())
_pickle.UnpicklingError: unpickling stack underflow
My API framework is Falcon. I have a dictionary containing user data that can be updated via POST requests. The architecture should be simple, so I chose Manager.dict() from the multiprocessing package to store the data. When handling other queries, some input is checked against the contents of this dictionary (if a in self._shared_dict: ...). This is where the above-mentioned errors occur.
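To make the setup concrete, here is a minimal sketch of the pattern described above, outside of Falcon and Gunicorn. The class name DataLookup, the method update_user_data, the worker function, and the sample keys are hypothetical stand-ins; only Manager.dict(), the _shared_dict attribute, and the membership check come from the actual code. It is not a reproduction of the failure, just the access pattern.

```python
import multiprocessing


class DataLookup:
    """Hypothetical stand-in for the lookup class in data_lookup.py."""

    def __init__(self, shared_dict):
        # In the real application this is a Manager().dict() proxy;
        # any mapping works for this sketch.
        self._shared_dict = shared_dict

    def update_user_data(self, key, value):
        # Hypothetical write path (the POST handler): with a proxy, the
        # assignment is forwarded to the manager process.
        self._shared_dict[key] = value

    def lookup_something(self, a):
        # With a proxy, each `in` check is a request/response round trip
        # to the manager process, and the reply is unpickled on receipt.
        return a in self._shared_dict


def worker(lookup, key):
    # Simulates another worker process writing to the shared dictionary.
    lookup.update_user_data(key, {"name": key})


if __name__ == "__main__":
    with multiprocessing.Manager() as manager:
        lookup = DataLookup(manager.dict())
        p = multiprocessing.Process(target=worker, args=(lookup, "alice"))
        p.start()
        p.join()
        print(lookup.lookup_something("alice"))
        print(lookup.lookup_something("bob"))
```

The point of the sketch is that every access to the proxy goes through conn.recv() in multiprocessing/managers.py, which is the call that fails in the tracebacks.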
Why is this problem happening? It seems to be tied to the manager.dict. Besides, when I debug in PyCharm, the debugger often does not evaluate any variables and just hangs indefinitely somewhere in multiprocessing code, waiting for data.
It seems to have something to do with the Meinheld workers. When I configure Gunicorn to use the default sync worker class, the error no longer occurs. Hence, Python multiprocessing and the Meinheld package do not seem to work well together in my setting.
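For reference, the two worker configurations compared above could be written in a Gunicorn configuration file roughly as follows. The file name, bind address, and worker count are assumptions for illustration; worker_class is the Gunicorn setting in question, and "egg:meinheld#gunicorn_worker" is the worker class string given in the Meinheld documentation.

```python
# gunicorn.conf.py (hypothetical file name; bind and workers are assumed values)
bind = "0.0.0.0:8000"
workers = 4

# Meinheld worker class: the errors appear under about 100 concurrent requests.
# worker_class = "egg:meinheld#gunicorn_worker"

# Gunicorn's default synchronous worker class: the errors do not occur.
worker_class = "sync"
```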