I have recently started to use ZeroMQ and have observed that sometimes initialization of the ZeroMQ context can take as much as 40 seconds. Usually it's much faster than that.
Eg:
import zmq
# the next line will sometimes take 40 seconds
# or so but is usually near instantaneous
with zmq.Context.instance() as context:
# connect / bind to socket, do stuff
pass
This behaviour seems to be independent of whether or not I use a context manager, eg the following behaves the same as the above:
import zmq
# the next line will sometimes take 40 seconds
# or so but is usually near instantaneous
context = zmq.Context.instance()
# connect / bind to socket, do stuff
context.term()
This is running in Python 3.6.8 on Ubuntu 18.04.2
I haven't been able to establish any pattern to the slowdowns.
What are the possible causes of this behaviour?
Per @user3666197 suggestion, I took some timings and managed to get the slowdown to manifest on the second run. Things returned to "normal" after that.
The slowdown kicked in on the second run of the test script:
= zmq.Context()] took + 704 [us] to execute
1.~ 704 us
2.~ 131 766 816 us
3.~ 145 us
4.~ 137 us
Watching memory on htop
at the time did not show memory filling up at all but swap was already active.
Using a pruned-down version of @user3666197's suggested script:
import sys
import zmq
clock = zmq.Stopwatch()
clock.start()
foo = zmq.Context()
bar = clock.stop()
print("{0:>2s}: {1:>9d}".format(sys.argv[1], bar))
foo.term()
And calling it in a loop, eg:
for x in $(seq 1 20);
do python3 test.zmq.py $x;
done
I can repeat the issue on the original server but have yet to hit it on another server, which suggests an environmental issue (or something to do with the specific distro version and/or python version as the "good" server is older).
This is sufficient for me for now but I will come back to it later and gather enough info to submit something to the maintainers.
What are the possible causes of this ( only "sometimes" present ) behaviour? "
I haven't been able to establish any pattern to the slowdowns.
a ) those 40 seconds,
mentioned above, could make sense but with a RAM to swap-I/O operations. I cannot imagine any other reason for such block of time ( I've never noticed anything close to the 3/4 of a minute delay in ZeroMQ during my last ~10 years with ZeroMQ, never )
b ) the reality check
always measure the respective actual times of execution, before any claim get raised :
let's start to be systematic and fair on repeatable and documented test-cases, ok?
Categories like "sometimes" are not worth, data are. As a start, we may use the following templated approach, to discriminate the root cause of any actual delay :
M = "[{0:_>40s}] took +{1:_>9d} [us] to execute"
N = 0
from zmq import Stopwatch; aClk = Stopwatch()
#------------------------------
pass; aClk.start()
import zmq
pass; _ = aClk.stop(); print( M.format( "IMPORT", _ ) )
#------------------------------
pass; aClk.start()
aCTX = zmq.Context()
pass; _ = aClk.stop(); print( M.format( "= zmq.Context()", _ ) )
#------------------------------
pass; aClk.start()
aCTX.term()
pass; _ = aClk.stop(); print( M.format( ".term()", _ ) )
#------------------------------
pass; aClk.start()
context = zmq.Context.instance()
pass; _ = aClk.stop(); print( M.format( "= zmq.Context.instance()", _ ) )
#------------------------------
pass; aClk.start()
context.term()
pass; _ = aClk.stop(); print( M.format( ".term()", _ ) )
#------------------------------
pass; aClk.start()
with zmq.Context.instance() as aCtxInCtxMGR:
N +=1
pass; _ = aClk.stop(); print( M.format( "with … as aCtxInCtxMGR:", _ ) )
#------------------------------
pass; aClk.start()
aCTX = zmq.Context( 0 )
pass; _ = aClk.stop(); print( M.format( "= zmq.Context( 0 )", _ ) )
#------------------------------
pass; aClk.start()
aCTX.term()
pass; _ = aClk.stop(); print( M.format( ".term()", _ ) )
#------------------------------
pass; aClk.start()
aCTX = zmq.Context( 10 )
pass; _ = aClk.stop(); print( M.format( "= zmq.Context( 10 )", _ ) )
#------------------------------
pass; aClk.start()
aCTX.term()
pass; _ = aClk.stop(); print( M.format( ".term()", _ ) )
#------------------------------
pass; aClk.start()
aCTX = zmq.Context( 20 )
pass; _ = aClk.stop(); print( M.format( "= zmq.Context( 20 )", _ ) )
#------------------------------
pass; aClk.start()
aCTX.term()
pass; _ = aClk.stop(); print( M.format( ".term()", _ ) )
#------------------------------
pass; aClk.start()
aCTX = zmq.Context( 50 )
pass; _ = aClk.stop(); print( M.format( "= zmq.Context( 50 )", _ ) )
#------------------------------
pass; aClk.start()
aCTX.term()
pass; _ = aClk.stop(); print( M.format( ".term()", _ ) )
#------------------------------
pass; aClk.start()
aCTX = zmq.Context( 99 )
pass; _ = aClk.stop(); print( M.format( "= zmq.Context( 99 )", _ ) )
#------------------------------
pass; aClk.start()
aCTX.term()
pass; _ = aClk.stop(); print( M.format( ".term()", _ ) )
#------------------------------
et cetera
ad plausible
[__________________________________IMPORT] took +_____1198 [us] to execute
[_________________________= zmq.Context()] took +_____1523 [us] to execute
[________________= zmq.Context.instance()] took +_____1348 [us] to execute
[_________________with … as aCtxInCtxMGR:] took +_____7909 [us] to execute
[_____________________= zmq.Context( 0 )] took +______966 [us] to execute
[_____________________= zmq.Context( 10 )] took +______944 [us] to execute
[_____________________= zmq.Context( 20 )] took +______936 [us] to execute
[_____________________= zmq.Context( 50 )] took +______962 [us] to execute
[_____________________= zmq.Context( 99 )] took +_____1015 [us] to execute
[__________________________________IMPORT] took +______879 [us] to execute
[_________________________= zmq.Context()] took +_____1076 [us] to execute
[_________________________________.term()] took +______881 [us] to execute
[________________= zmq.Context.instance()] took +_____1105 [us] to execute
[_________________________________.term()] took +______896 [us] to execute
[_________________with … as aCtxInCtxMGR:] took +_____1474 [us] to execute
[_____________________= zmq.Context( 0 )] took +_____1163 [us] to execute
[_________________________________.term()] took +______982 [us] to execute
[_____________________= zmq.Context( 10 )] took +_____2646 [us] to execute
[_________________________________.term()] took +______873 [us] to execute
[_____________________= zmq.Context( 20 )] took +_____2689 [us] to execute
[_________________________________.term()] took +______981 [us] to execute
[_____________________= zmq.Context( 50 )] took +_____2772 [us] to execute
[_________________________________.term()] took +______910 [us] to execute
[_____________________= zmq.Context( 99 )] took +_____1086 [us] to execute
[_________________________________.term()] took +_____1010 [us] to execute