I've been reading the ZMQ documentation on heartbeats and read that one should use the ping-pong approach instead of the one used for the Paranoid Pirate pattern:
For Paranoid Pirate, we chose the second approach. It might not have been the simplest option: if designing this today, I'd probably try a ping-pong approach instead.
However, I find little to no documentation about the ping-pong pattern anywhere (and why is it preferred anyway?). The only possible code examples are ping.py and pong.py in the pyzmq examples.
Are these adequate examples that demonstrate a two-way heartbeat? If so, how is "pong" detecting that "ping" is not alive any more? There's also this claim about no payload, but isn't the ping message also considered a payload?
One peer sends a ping command to the other, which replies with a pong command. Neither command has any payload
Again, these examples may not constitute a full implementation of this approach. If anyone can share some experience, descriptions or code examples, I'd appreciate it.
My aim is to add heartbeat functionality to a broker and worker (router-dealer). Both worker and broker should detect that the partner isn't available any more and (a) deregister the worker (in case the broker detects that the worker has gone), or (b) try to reconnect later (in case the worker lost its connection to the broker). The worker doesn't need to heartbeat while it is busy, because it wouldn't be in the broker's idle workers queue for new jobs anyway.
ZeroMQ does not provide any mechanism that tells you whether the socket on the other side is still alive. Therefore the usual way to implement heartbeating (and, I think, the most convenient one) is a heartbeat with a timeout.
You need a socket on both the client and the server, each serviced in its own thread, plus a poller.
Poller example:
p = zmq.Poller()
p.register(socket, zmq.POLLIN)
The client sends a message to the server and then polls the socket with a timeout. Choose a timeout long enough that, when it expires, you can be confident the server is really unavailable.
Polling example:
msg = dict(p.poll(timeout))  # timeout is in milliseconds
if socket in msg and msg[socket] == zmq.POLLIN:
    # we got a heartbeat from the server; read it off the socket
    socket.recv()
else:
    # timeout: treat the server as unavailable
    pass
The server does the same.
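Since both sides run the same timeout logic, the bookkeeping can live in one small helper that the broker and the worker each keep per peer. The following is only a sketch under my own assumptions: `PeerLiveness`, its method names, and the default interval and timeout values are hypothetical and not part of pyzmq. It deliberately contains no socket code, so it only decides when a ping is due and when the peer should be considered gone.

```python
import time


class PeerLiveness:
    """Tracks when a peer was last heard from and when to ping it next.

    Hypothetical helper, not part of pyzmq. All times are in seconds.
    """

    def __init__(self, interval=1.0, timeout=3.0, now=None):
        now = time.monotonic() if now is None else now
        self.interval = interval          # how often to send a ping
        self.timeout = timeout            # longer silence means "gone"
        self.last_seen = now              # last time any message arrived
        self.next_ping = now + interval   # when the next ping is due

    def heard(self, now=None):
        """Call for every message received from the peer (ping, pong or data)."""
        self.last_seen = time.monotonic() if now is None else now

    def ping_due(self, now=None):
        """Return True, and schedule the next ping, when a ping should be sent."""
        now = time.monotonic() if now is None else now
        if now >= self.next_ping:
            self.next_ping = now + self.interval
            return True
        return False

    def alive(self, now=None):
        """Return False once the peer has been silent for longer than timeout."""
        now = time.monotonic() if now is None else now
        return (now - self.last_seen) <= self.timeout
```

In the poll loop you would call `heard()` whenever the socket is readable, send a ping whenever `ping_due()` returns True, and check `alive()` after each poll. When it returns False, the broker deregisters the worker, and the worker closes its socket and tries to reconnect later.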
I think this could help.