I'm working with Aeron Cluster and was wondering what the best way is to handle the MAX_POSITION_EXCEEDED publication result. The Javadoc states: "If this happens then the publication should be closed and a new one added. To make it less likely to happen then increase the term buffer length."
io.aeron.cluster.client.AeronCluster#offer(org.agrona.DirectBuffer, int, int)
io.aeron.cluster.client.AeronCluster#tryClaim
The above Aeron Cluster client methods can return MAX_POSITION_EXCEEDED. In that case I assume it is valid to call io.aeron.cluster.client.AeronCluster#close, establish a new AeronCluster client instance, and attempt to connect again.
What about an Aeron clustered service? When the following methods return MAX_POSITION_EXCEEDED, do I just call io.aeron.cluster.service.ClientSession#close and wait for the client to establish a new connection?
io.aeron.cluster.service.ClientSession#offer
io.aeron.cluster.service.ClientSession#tryClaim
Thanks.
MAX_POSITION_EXCEEDED can occur once the position has reached 2^31 * term length. Unless very small term lengths are used, this is unlikely to happen in practice. You need to capacity plan your streams so that the term length is practical. This can be done by factoring in how much data will be transferred over the lifetime of a stream. The trade-offs to consider include the windows for flow control and the memory footprint when buffers are not sparse and/or are pre-touched.
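To make the capacity planning concrete, here is a rough sketch of the arithmetic. It assumes a stream can span at most 2^31 terms and that term lengths range from 64 KiB to 1 GiB; the class name, method names, and the 100 MB/s rate are made up for illustration and are not part of Aeron.

```java
import java.time.Duration;

public class TermLengthPlanner {
    // Assumption: a stream can span at most 2^31 terms before the position limit is hit.
    static final long MAX_TERM_COUNT = 1L << 31;

    // Highest position a stream can reach for a given term length in bytes.
    static long maxPosition(int termLength) {
        return MAX_TERM_COUNT * termLength;
    }

    // How long a stream lasts at a constant byte rate before reaching the limit.
    static Duration timeToExhaust(int termLength, long bytesPerSecond) {
        return Duration.ofSeconds(maxPosition(termLength) / bytesPerSecond);
    }

    public static void main(String[] args) {
        long bytesPerSecond = 100_000_000L; // hypothetical sustained rate of 100 MB/s
        int[] termLengths = {64 * 1024, 16 * 1024 * 1024, 1024 * 1024 * 1024};

        for (int termLength : termLengths) {
            System.out.printf(
                "termLength=%d maxPosition=%d daysToExhaust=%d%n",
                termLength,
                maxPosition(termLength),
                timeToExhaust(termLength, bytesPerSecond).toDays());
        }
    }
}
```

Under these assumptions, a 64 KiB term length at a sustained 100 MB/s runs out in about 16 days, while a 1 GiB term length lasts for centuries at the same rate. Plug in your own expected rate and stream lifetime to see whether a given term length is practical.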
If the condition does occur, you need to close the stream and open a new one. In the case of the cluster log, you need to re-baseline the log from a snapshot with the new configuration. As I said above, with the correct configuration this condition is practically not going to happen.
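For a plain stream, the close-and-reopen step could look roughly like the sketch below. It does not use Aeron types so that it runs standalone: Stream, FakeStream, and RecreatingSender are hypothetical stand-ins, and the constant is redefined with the same value as io.aeron.Publication.MAX_POSITION_EXCEEDED. It does not cover the cluster log case, which needs the re-baseline from a snapshot.

```java
import java.util.function.Supplier;

public class RecreateOnMaxPosition {
    // Same value as io.aeron.Publication.MAX_POSITION_EXCEEDED, redefined to avoid a dependency.
    static final long MAX_POSITION_EXCEEDED = -5L;

    // Hypothetical stand-in for a stream handle; not an Aeron type.
    interface Stream {
        long offer(byte[] message);
        void close();
    }

    // Toy stream with a tiny position limit so the condition can be triggered quickly.
    static final class FakeStream implements Stream {
        private final long maxPosition;
        private long position;
        boolean closed;

        FakeStream(long maxPosition) {
            this.maxPosition = maxPosition;
        }

        public long offer(byte[] message) {
            if (position + message.length > maxPosition) {
                return MAX_POSITION_EXCEEDED;
            }
            position += message.length;
            return position;
        }

        public void close() {
            closed = true;
        }
    }

    // Closes the current stream and opens a fresh one when the position limit is reported.
    static final class RecreatingSender {
        private final Supplier<Stream> factory;
        private Stream current;
        int streamsOpened;

        RecreatingSender(Supplier<Stream> factory) {
            this.factory = factory;
            this.current = open();
        }

        private Stream open() {
            streamsOpened++;
            return factory.get();
        }

        long offer(byte[] message) {
            long result = current.offer(message);
            if (result == MAX_POSITION_EXCEEDED) {
                current.close();
                current = open();
                result = current.offer(message); // single retry on the new stream
            }
            return result;
        }
    }

    public static void main(String[] args) {
        RecreatingSender sender = new RecreatingSender(() -> new FakeStream(16));
        byte[] message = new byte[8];
        for (int i = 0; i < 3; i++) {
            System.out.println("offer -> " + sender.offer(message));
        }
        System.out.println("streams opened: " + sender.streamsOpened);
    }
}
```

With a real publication the new stream will not be connected immediately, so the retry would also have to handle results such as NOT_CONNECTED and BACK_PRESSURED rather than assuming the first offer on the new stream succeeds.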