rabbitmq rabbitmqctl

RabbitMQ: joining an up-to-date cluster


I have a cluster of two nodes, A (master) and B (slave). Node A was the master, and node B joined it successfully. Then the node A instance went down, and node B now has more messages in its queue than node A. I restarted the node A instance and am now trying to join it to node B as a slave, because B is more up to date. However, I get the following error when trying to join node B:

sudo rabbitmqctl join_cluster rabbit@bnode

{:badrpc_multi, {:EXIT, {{:function_clause, [{:gen, :do_for_proc, [{:rex, {:error, {:node_name, :short}}}, #Function<0.9801092/1 in :gen.call/4>], [file: 'gen.erl', line: 220]}, {:gen_server, :call, 3, [file: 'gen_server.erl', line: 219]}, {:rpc, :do_call, 3, [file: 'rpc.erl', line: 327]}, {:lists, :foldl, 3, [file: 'lists.erl', line: 1263]}, {:rabbit_mnesia, :discover_cluster, 1, [file: 'src/rabbit_mnesia.erl', line: 779]}, {:rabbit_mnesia, :join_cluster, 2, [file: 'src/rabbit_mnesia.erl', line: 212]}, {:rpc, :"-handle_call_call/6-fun-0-", 5, [file: 'rpc.erl', line: 197]}]}, {:gen_server, :call, [{:rex, {:error, {:node_name, :short}}}, {:call, :rabbit_mnesia, :cluster_status_from_mnesia, [], #PID<0.62.0>}, :infinity]}}}, [error: {:node_name, :short}]}

Is that the correct approach to follow?

As suggested in some other posts, I tried removing the existing Mnesia data from node A with sudo rm -rf /var/lib/rabbitmq/mnesia/* and even tried the reset command (although that's not what I wanted): sudo rabbitmqctl reset

I still cannot join node B.


Solution

  • I found the answer. Instead of trying to rescue an out-of-sync node that throws this error, it is better to launch a new instance (for example, through an AWS Auto Scaling group) and have it join node B, the up-to-date node.

    This is based on another answer:

    How to set up autoscaling RabbitMQ Cluster AWS
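
For reference, the join step on the freshly launched instance might look roughly like the sketch below. It uses the usual rabbitmqctl sequence (stop_app, join_cluster, start_app). The function name join_new_node and the RABBITMQCTL override variable are my own illustrative names, not part of RabbitMQ, and the sketch assumes the new instance already shares the same Erlang cookie as node B and can resolve its hostname. Run it as root (or adapt it to use sudo).

```shell
#!/bin/sh
# Sketch only: join this (fresh) node to the cluster of an up-to-date node.
# RABBITMQCTL is a hypothetical override, handy for dry runs (e.g. RABBITMQCTL=echo).
RABBITMQCTL="${RABBITMQCTL:-rabbitmqctl}"

# join_new_node is an illustrative helper name, not a RabbitMQ command.
join_new_node() {
    target="$1"    # e.g. rabbit@bnode, the node that holds the current data
    if [ -z "$target" ]; then
        echo "usage: join_new_node rabbit@<host>" >&2
        return 2
    fi
    # Stop the RabbitMQ application (the Erlang VM keeps running),
    # join the target node's cluster, then start the application again.
    "$RABBITMQCTL" stop_app &&
    "$RABBITMQCTL" join_cluster "$target" &&
    "$RABBITMQCTL" start_app
}
```

On the new instance you would then call join_new_node rabbit@bnode and check the result with rabbitmqctl cluster_status. Setting RABBITMQCTL=echo first prints the three subcommands instead of running them, which is a cheap way to check the script before baking it into launch configuration user data.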