connection glassfish openmq

Unregister an imq broker from a cluster properly


I am currently managing an OpenMQ cluster (with GlassFish 3.0.1) and I am seeing some strange behavior.

The cluster has been running for seven months without any problems, with two brokers registered.

I now need to add two more brokers to the cluster temporarily. Here is what I had before:

-------------------------
Host         Primary Port
-------------------------
localhost    7676

Cluster ID                    MyCluster
Cluster is Highly Available   true

-------------------------------------------------------------------------------------------------------------
                                                                          ID of broker       Time since last 
 Broker ID         Address               State         Msgs in store   performing takeover   status timestamp
-------------------------------------------------------------------------------------------------------------
Broker1         192.168.0.1:7676    OPERATING           5                                     6 seconds
Broker2         192.168.0.2:7676    OPERATING           8                                     6 seconds

Then I successfully started two more brokers on two other servers, and I got:

-------------------------
Host         Primary Port
-------------------------
localhost    7676

Cluster ID                    MyCluster
Cluster is Highly Available   true

-------------------------------------------------------------------------------------------------------------
                                                                          ID of broker       Time since last 
 Broker ID         Address               State         Msgs in store   performing takeover   status timestamp
-------------------------------------------------------------------------------------------------------------
Broker1         192.168.0.1:7676    OPERATING           5                                     6 seconds
Broker2         192.168.0.2:7676    OPERATING           8                                     6 seconds
Broker3         192.168.0.3:7676    OPERATING           5                                     6 seconds
Broker4         192.168.0.4:7676    OPERATING           8                                     6 seconds

The application runs well with this configuration and automatically uses the two new brokers. The problem occurs when I stop one of the brokers in the cluster with the following command:

./imqcmd shutdown bkr

on one of the servers. The result of ./imqcmd list bkr is then the following:

-------------------------
Host         Primary Port
-------------------------
localhost    7676

Cluster ID                    MyCluster
Cluster is Highly Available   true

-------------------------------------------------------------------------------------------------------------
                                                                          ID of broker       Time since last 
 Broker ID         Address               State         Msgs in store   performing takeover   status timestamp
-------------------------------------------------------------------------------------------------------------
Broker1         192.168.0.1:7676    OPERATING           5                                     6 seconds
Broker2         192.168.0.2:7676    OPERATING           8                                     6 seconds
Broker3         192.168.0.3:7676    TAKEOVER_COMPLETE   0                 Broker1             6 seconds
Broker4         192.168.0.4:7676    OPERATING           8                                     6 seconds

Everything seems to be OK: the takeover is performed by Broker1. But when I look at the server.log of the GlassFish instances, I find the following line:

[C4003]: Error occurred on connection creation [192.168.0.3:7676]. - cause: java.net.ConnectException: Connection refused|#]

It looks as if GlassFish is still trying to connect to the broker that was shut down.

Is there something I missed?

Thanks for your help.


Solution

  • The missing command was:

    imqdbmgr remove bkr -n Broker3
    
    After that, ./imqcmd list bkr outputs:
    
    -------------------------------------------------------------------------------------------------------------
                                                                              ID of broker       Time since last 
     Broker ID         Address               State         Msgs in store   performing takeover   status timestamp
    -------------------------------------------------------------------------------------------------------------
    Broker1         192.168.0.1:7676    OPERATING           5                                     6 seconds
    Broker2         192.168.0.2:7676    OPERATING           8                                     6 seconds
    Broker4         192.168.0.4:7676    OPERATING           8                                     6 seconds
    

    Broker3 is no longer registered in the HA broker cluster.
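To spot brokers that were shut down but are still registered, the listing can be filtered for the TAKEOVER_COMPLETE state. The following is only a sketch: the function names stale_brokers and print_cleanup_commands are made up for illustration, and it assumes the column layout shown above (broker ID in the first column, state in the third). It is a dry run that prints the imqdbmgr command for each such broker rather than executing it, so the list can be reviewed first.

```shell
#!/bin/sh
# Hypothetical helpers (names are illustrative, not part of OpenMQ).

# Read `imqcmd list bkr` output on stdin and print the IDs of brokers
# whose State column is TAKEOVER_COMPLETE.
# Assumes the layout shown above: broker ID in column 1, state in column 3.
stale_brokers() {
    awk '$3 == "TAKEOVER_COMPLETE" { print $1 }'
}

# Dry run: print the removal command for each stale broker.
# Nothing is executed against the cluster store.
print_cleanup_commands() {
    stale_brokers | while read -r id; do
        echo "imqdbmgr remove bkr -n $id"
    done
}
```

With the functions loaded, a possible invocation is `./imqcmd list bkr | print_cleanup_commands`. For the listing in the question this prints `imqdbmgr remove bkr -n Broker3`. Depending on the setup, imqcmd may prompt for credentials, so the listing may need to be saved to a file first and fed in with `print_cleanup_commands < listing.txt`.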