linux rabbitmq openstack

RabbitMQ cluster hostname mismatch issue


I have deployed openstack-ansible with a 3-node RabbitMQ cluster, where RabbitMQ runs inside LXC containers. I am seeing a very strange error when I run the rabbitmqctl status command. Notice that it is talking to the wrong node: ostack-controller-01 is the host node, not an actual RabbitMQ node.

[root@ostack-controller-01-rabbit-mq-container-1bf6ede2 ~]# rabbitmqctl status
Status of node 'rabbit@ostack-controller-01' ...
Error: unable to connect to node 'rabbit@ostack-controller-01': nodedown

DIAGNOSTICS
===========

attempted to contact: ['rabbit@ostack-controller-01']

rabbit@ostack-controller-01:
  * unable to connect to epmd (port 4369) on ostack-controller-01: address (cannot connect to host/port)

current node details:
- node name: 'rabbitmq-cli-06@ostack-controller-01-rabbit-mq-container-1bf6ede2'
- home dir: /var/lib/rabbitmq
- cookie hash: SssFdXBI7wTevePuCt5d9w==

How do I fix this behavior and tell RabbitMQ to talk to the correct host, which is ostack-controller-01-rabbit-mq-container-1bf6ede2?

I have tried forget_cluster_node, but no luck; it still throws the same error.

[root@ostack-controller-01-rabbit-mq-container-1bf6ede2 ~]# rabbitmqctl forget_cluster_node rabbit@ostack-controller-01
Removing node 'rabbit@ostack-controller-01' from cluster ...
Error: unable to connect to node 'rabbit@ostack-controller-01': nodedown

DIAGNOSTICS
===========

attempted to contact: ['rabbit@ostack-controller-01']

rabbit@ostack-controller-01:
  * unable to connect to epmd (port 4369) on ostack-controller-01: address (cannot connect to host/port)


current node details:
- node name: 'rabbitmq-cli-39@ostack-controller-01-rabbit-mq-container-1bf6ede2'
- home dir: /var/lib/rabbitmq
- cookie hash: SssFdXBI7wTevePuCt5d9w==

UPDATE - 1

[root@ostack-controller-01-rabbit-mq-container-1bf6ede2 rabbitmq]# rabbitmqctl -n rabbit@ostack-controller-01-rabbit-mq-container-1bf6ede2 status
Status of node 'rabbit@ostack-controller-01-rabbit-mq-container-1bf6ede2' ...
[{pid,8720},
 {running_applications,
     [{rabbitmq_management,"RabbitMQ Management Console","3.6.9"},
      {amqp_client,"RabbitMQ AMQP Client","3.6.9"},
      {rabbitmq_management_agent,"RabbitMQ Management Agent","3.6.9"},
      {rabbitmq_web_dispatch,"RabbitMQ Web Dispatcher","3.6.9"},
      {rabbit,"RabbitMQ","3.6.9"},
      {rabbit_common,
          "Modules shared by rabbitmq-server and rabbitmq-erlang-client",
          "3.6.9"},
      {xmerl,"XML parser","1.3.14"},
      {os_mon,"CPO  CXC 138 46","2.4.2"},
      {cowboy,"Small, fast, modular HTTP server.","1.0.4"},
      {ranch,"Socket acceptor pool for TCP protocols.","1.3.0"},
      {ssl,"Erlang/OTP SSL application","8.1.3.1.1"},
      {public_key,"Public key infrastructure","1.4"},
      {cowlib,"Support library for manipulating Web protocols.","1.0.2"},
      {crypto,"CRYPTO","3.7.4"},
      {inets,"INETS  CXC 138 49","6.3.9"},
      {compiler,"ERTS  CXC 138 10","7.0.4.1"},
      {asn1,"The Erlang ASN1 compiler version 4.0.4","4.0.4"},
      {syntax_tools,"Syntax tools","2.1.1"},
      {mnesia,"MNESIA  CXC 138 12","4.14.3.1"},
      {sasl,"SASL  CXC 138 11","3.0.3"},
      {stdlib,"ERTS  CXC 138 10","3.3"},
      {kernel,"ERTS  CXC 138 10","5.2.0.1"}]},
 {os,{unix,linux}},
 {erlang_version,
     "Erlang/OTP 19 [erts-8.3.5.4] [source] [64-bit] [smp:6:6] [async-threads:128] [hipe] [kernel-poll:true]\n"},
 {memory,
     [{total,64189296},
      {connection_readers,179280},
      {connection_writers,26568},
      {connection_channels,124504},
      {connection_other,127440},
      {queue_procs,2832},
      {queue_slave_procs,0},
      {plugins,406280},
      {other_proc,21056136},
      {mnesia,500680},
      {metrics,205984},
      {mgmt_db,127256},
      {msg_index,47416},
      {other_ets,2692192},
      {binary,1591656},
      {code,24765630},
      {atom,1033401},
      {other_system,11505193}]},
 {alarms,[]},
 {listeners,
     [{clustering,25672,"::"},
      {amqp,5672,"::"},
      {'amqp/ssl',5671,"::"},
      {http,15672,"::"}]},
 {vm_memory_high_watermark,0.4},
 {vm_memory_limit,6662953369},
 {disk_free_limit,50000000},
 {disk_free,82822516736},
 {file_descriptors,
     [{total_limit,65436},
      {total_used,5},
      {sockets_limit,58890},
      {sockets_used,3}]},
 {processes,[{limit,1048576},{used,376}]},
 {run_queue,0},
 {uptime,14},
 {kernel,{net_ticktime,60}}]

UPDATE - 2

This is interesting... why does the following command work, but not plain rabbitmqctl cluster_status?

[root@ostack-controller-01-rabbit-mq-container-1bf6ede2 rabbitmq]# rabbitmqctl -n rabbit@ostack-controller-01-rabbit-mq-container-1bf6ede2 cluster_status
Cluster status of node 'rabbit@ostack-controller-01-rabbit-mq-container-1bf6ede2' ...
[{nodes,
     [{disc,
          ['rabbit@ostack-controller-01-rabbit-mq-container-1bf6ede2',
           'rabbit@ostack-controller-02-rabbit-mq-container-d510bdfc',
           'rabbit@ostack-controller-03-rabbit-mq-container-c482ee13']}]},
 {running_nodes,
     ['rabbit@ostack-controller-02-rabbit-mq-container-d510bdfc',
      'rabbit@ostack-controller-03-rabbit-mq-container-c482ee13',
      'rabbit@ostack-controller-01-rabbit-mq-container-1bf6ede2']},
 {cluster_name,<<"openstack">>},
 {partitions,
     [{'rabbit@ostack-controller-02-rabbit-mq-container-d510bdfc',
          ['rabbit@ostack-controller-01-rabbit-mq-container-1bf6ede2',
           'rabbit@ostack-controller-03-rabbit-mq-container-c482ee13']},
      {'rabbit@ostack-controller-03-rabbit-mq-container-c482ee13',
          ['rabbit@ostack-controller-02-rabbit-mq-container-d510bdfc']}]},
 {alarms,
     [{'rabbit@ostack-controller-02-rabbit-mq-container-d510bdfc',[]},
      {'rabbit@ostack-controller-03-rabbit-mq-container-c482ee13',[]},
      {'rabbit@ostack-controller-01-rabbit-mq-container-1bf6ede2',[]}]}]

Solution

  • First things first: RabbitMQ 3.6.9 is old, and you should upgrade to the latest version.

    Having said that, that's not the issue. The output of echo $HOSTNAME is this:

    ostack-controller-01.foo.example.com
    

    So, when rabbitmqctl status runs, it has to determine the name of the node to connect to. Since the HOSTNAME variable is set, its short form (the part before the first dot) is used to build the node name, so rabbitmqctl tries rabbit@ostack-controller-01, which fails.

    You can continue to use the -n rabbit@ostack-controller-01-rabbit-mq-container-1bf6ede2 argument to rabbitmqctl to work around this. Or, you can create the /etc/rabbitmq/rabbitmq-env.conf file with this content:

    NODENAME=rabbit@ostack-controller-01-rabbit-mq-container-1bf6ede2
    

    Then, rabbitmqctl status and other rabbitmqctl commands should work. Repeat this process on every node, using that node's own name in /etc/rabbitmq/rabbitmq-env.conf.
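
To check for the mismatch before editing anything, the following is a minimal sketch rather than RabbitMQ's actual script. It assumes the default node name is rabbit@ followed by the short host name, which matches the behavior shown in the question. The function name default_nodename is hypothetical, and uname -n is assumed to report the container's own host name.

```shell
#!/bin/sh
# Sketch only: approximates how a default node name is built from a host name.
# The short name is everything before the first dot.
default_nodename() {
    printf 'rabbit@%s\n' "${1%%.*}"
}

# Node name derived from the HOSTNAME variable, which may have been
# inherited from the host. Falls back to `uname -n` if HOSTNAME is unset
# (an assumption made for this sketch).
env_name=$(default_nodename "${HOSTNAME:-$(uname -n)}")

# Node name derived from what the container itself reports.
real_name=$(default_nodename "$(uname -n)")

echo "default target: $env_name"
echo "container node: $real_name"

if [ "$env_name" != "$real_name" ]; then
    # Candidate line for /etc/rabbitmq/rabbitmq-env.conf on this node.
    echo "NODENAME=$real_name"
fi
```

If the two names differ, the last line printed is a candidate for /etc/rabbitmq/rabbitmq-env.conf. Compare it with the node names listed by cluster_status before writing it to the file.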