high-availability activemq-artemis

ActiveMQ Artemis replication - feasible with just two nodes?


We are planning to use ActiveMQ Artemis HA replication with a single primary/backup pair using the new pluggable quorum voting feature.

This post (in response to my question earlier) suggests it's possible:

Generally speaking, a single primary/backup pair with replication is only recommended with the new pluggable quorum voting since the risk of split brain is so high otherwise.

However the documentation states:

network isolation protection requires configuring >=3 Zookeeper nodes

Since there is only a pair of nodes in our configuration, what are the implications of this statement?

Also, this link seems to indicate it's not really doable with just two nodes:

If you're using replication then the minimum number of recommended nodes to mitigate split-brain is 6

but also:

The network pinger can be used to mitigate split-brain, but it can be tricky to configure. You'll need to perform your own testing to ensure it works the way you want.

My question is: will this work at all in a two-node architecture? If so, does it require tricky configuration?

Most importantly, if we add a third node, will it be doable, relatively easily?

I think six nodes is out of the question.


Solution

  • The pluggable quorum voting requires integration with an external system to arbitrate the voting. The default implementation integrates with Apache ZooKeeper.

    Therefore, you'll need the primary broker, the backup broker, and at least one ZooKeeper node. However, a single ZooKeeper node is an architectural weakness: if that node fails, high availability semantics may be impaired. You therefore probably want to provide some redundancy for ZooKeeper, which is why the documentation recommends 3 or more ZooKeeper nodes.

    To be clear, ZooKeeper is a fairly common solution for problems of this type, so many organizations already have a ZooKeeper cluster deployed. Leveraging this infrastructure for ActiveMQ Artemis high availability makes good sense.

    The answer regarding minimum cluster deployment which you linked was written in 2020, before the pluggable quorum voting was implemented, so it is out of date.

    The network pinger was never a "recommended" solution, per se. It was a kind of stop-gap measure for folks who just couldn't deploy the recommended 3 primary/backup pairs and also couldn't use shared storage, but they still wanted some mitigation for split brain. Again, this was implemented long before the new pluggable quorum voting.
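
To make the "3 or more ZooKeeper nodes" recommendation concrete, here is a minimal sketch of the arithmetic behind it. It assumes ZooKeeper's usual rule that the ensemble stays available only while a strict majority of its nodes is up. The function name `tolerated_failures` is made up for illustration and is not part of Artemis or ZooKeeper.

```python
def tolerated_failures(ensemble_size: int) -> int:
    """Return how many nodes can fail while a strict majority remains.

    Assumption: the ensemble is available only while more than half
    of its nodes are up (majority quorum).
    """
    if ensemble_size < 1:
        raise ValueError("ensemble_size must be at least 1")
    majority = ensemble_size // 2 + 1   # smallest strict majority
    return ensemble_size - majority     # nodes that may be lost


if __name__ == "__main__":
    for n in (1, 2, 3, 5):
        print(f"{n} ZooKeeper node(s): tolerates {tolerated_failures(n)} failure(s)")
```

Under that assumption, one or two ZooKeeper nodes tolerate no failures at all, while three tolerate one. So the two brokers can stay a single primary/backup pair; the redundancy is needed in the arbiter, not in the number of brokers.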