activemq-artemisrolling-updates

ActiveMQ Artemis rolling upgrade fails with AMQ214013


I've got two EC2 instances running Artemis in the master-slave replication setup and I always perform rolling upgrade by shutting down the slave, upgrading it and starting it again. Then I do the same with the master. This no longer works for me while upgrading from 2.17 to 2.18. After upgrading one of the nodes I always get this message on that node:

AMQ214013: Failed to decode packet: java.lang.IndexOutOfBoundsException: readerIndex(57) + length(1) exceeds writerIndex(57): PooledUnsafeDirectByteBuf(ridx: 57, widx: 57, cap: 57)

No matter what I do. I even tried to setup a fresh new cluster only adding replication related bits of configuration:

Master

<connectors>
    <connector name="broker-master">tcp://10.35.4.16:61616</connector>
    <connector name="broker-slave">tcp://10.35.4.211:61616</connector>
</connectors>

<cluster-connections>
   <cluster-connection name="cluster">
      <connector-ref>broker-master</connector-ref>
      <check-period>5000</check-period>
      <static-connectors>
         <connector-ref>broker-slave</connector-ref>
      </static-connectors>
   </cluster-connection>
</cluster-connections>

<ha-policy>
   <replication>
      <master>
         <cluster-name>cluster</cluster-name>
         <group-name>rs1</group-name>
         <check-for-live-server>true</check-for-live-server>
      </master>
   </replication>
</ha-policy>

Slave

<connectors>
   <connector name="broker-master">tcp://10.35.4.16:61616</connector>
   <connector name="broker-slave">tcp://10.35.4.211:61616</connector>
</connectors>

<cluster-connections>
   <cluster-connection name="cluster">
      <connector-ref>broker-slave</connector-ref>
      <check-period>5000</check-period>
      <static-connectors>
         <connector-ref>broker-master</connector-ref>
      </static-connectors>
   </cluster-connection>
</cluster-connections>

<ha-policy>
   <replication>
      <slave>
         <cluster-name>cluster</cluster-name>
         <group-name>rs1</group-name>
         <allow-failback>true</allow-failback>
      </slave>
   </replication>
</ha-policy>

Is there anything wrong with my configuration, or is it just not possible to from 2.17 to 2.18 like this?

Full log from master

...
2022-04-07 10:00:17,610 INFO  [org.apache.activemq.artemis.core.server] AMQ221012: Using AIO Journal
...
2022-04-07 10:00:18,726 INFO  [org.apache.activemq.artemis.core.server] AMQ221080: Deploying address DLQ supporting [ANYCAST]
2022-04-07 10:00:18,728 INFO  [org.apache.activemq.artemis.core.server] AMQ221003: Deploying ANYCAST queue DLQ on address DLQ
2022-04-07 10:00:18,759 INFO  [org.apache.activemq.artemis.core.server] AMQ221080: Deploying address ExpiryQueue supporting [ANYCAST]
2022-04-07 10:00:18,760 INFO  [org.apache.activemq.artemis.core.server] AMQ221003: Deploying ANYCAST queue ExpiryQueue on address ExpiryQueue
2022-04-07 10:00:19,072 INFO  [org.apache.activemq.artemis.core.server] AMQ221020: Started EPOLL Acceptor at 0.0.0.0:61616 for protocols [CORE,MQTT,AMQP,STOMP,HORNETQ,OPENWIRE]
2022-04-07 10:00:19,075 INFO  [org.apache.activemq.artemis.core.server] AMQ221020: Started EPOLL Acceptor at 0.0.0.0:5445 for protocols [HORNETQ,STOMP]
2022-04-07 10:00:19,079 INFO  [org.apache.activemq.artemis.core.server] AMQ221020: Started EPOLL Acceptor at 0.0.0.0:5672 for protocols [AMQP]
2022-04-07 10:00:19,082 INFO  [org.apache.activemq.artemis.core.server] AMQ221020: Started EPOLL Acceptor at 0.0.0.0:1883 for protocols [MQTT]
2022-04-07 10:00:19,085 INFO  [org.apache.activemq.artemis.core.server] AMQ221020: Started EPOLL Acceptor at 0.0.0.0:61613 for protocols [STOMP]
2022-04-07 10:00:19,086 INFO  [org.apache.activemq.artemis.core.server] AMQ221007: Server is now live
2022-04-07 10:00:19,086 INFO  [org.apache.activemq.artemis.core.server] AMQ221001: Apache ActiveMQ Artemis Message Broker version 2.17.0 [broker-master, nodeID=eb08fe05-b640-11ec-b6e2-06e66cf4b718]
2022-04-07 10:00:19,470 INFO  [org.apache.activemq.hawtio.branding.PluginContextListener] Initialized activemq-branding plugin
2022-04-07 10:00:19,590 INFO  [org.apache.activemq.hawtio.plugin.PluginContextListener] Initialized artemis-plugin plugin
2022-04-07 10:00:20,032 INFO  [io.hawt.HawtioContextListener] Initialising hawtio services
2022-04-07 10:00:20,123 INFO  [io.hawt.system.ConfigManager] Configuration will be discovered via system properties
2022-04-07 10:00:20,126 INFO  [io.hawt.jmx.JmxTreeWatcher] Welcome to Hawtio 2.11.0
2022-04-07 10:00:20,137 INFO  [io.hawt.web.auth.AuthenticationConfiguration] Starting hawtio authentication filter, JAAS realm: "activemq" authorized role(s): "amq" role principal classes: "org.apache.activemq.artemis.spi.core.security.jaas.RolePrincipal"
2022-04-07 10:00:20,171 INFO  [io.hawt.web.proxy.ProxyServlet] Proxy servlet is disabled
2022-04-07 10:00:20,182 INFO  [io.hawt.web.servlets.JolokiaConfiguredAgentServlet] Jolokia overridden property: [key=policyLocation, value=file:/opt/artemis/broker-master/etc/jolokia-access.xml]
2022-04-07 10:00:20,342 INFO  [org.apache.activemq.artemis] AMQ241001: HTTP Server started at http://10.35.4.16:8161
2022-04-07 10:00:20,343 INFO  [org.apache.activemq.artemis] AMQ241002: Artemis Jolokia REST API available at http://10.35.4.16:8161/console/jolokia
2022-04-07 10:00:20,343 INFO  [org.apache.activemq.artemis] AMQ241004: Artemis Console available at http://10.35.4.16:8161/console
2022-04-07 10:00:52,429 INFO  [org.apache.activemq.artemis.core.server] AMQ221025: Replication: sending AIOSequentialFile{activemq-data-49.amq, opened=false, pendingClose=false, pendingCallbacks=org.apache.activemq.artemis.utils.AutomaticLatch@2907ccd6} (size=10,485,760) to replica.
2022-04-07 10:00:52,931 INFO  [org.apache.activemq.artemis.core.server] AMQ221025: Replication: sending NIOSequentialFile /opt/artemis/broker-master/data/bindings/activemq-bindings-34.bindings (size=1,048,576) to replica.
2022-04-07 10:00:52,963 INFO  [org.apache.activemq.artemis.core.server] AMQ221025: Replication: sending NIOSequentialFile /opt/artemis/broker-master/data/bindings/activemq-bindings-35.bindings (size=1,048,576) to replica.
2022-04-07 10:00:53,011 INFO  [org.apache.activemq.artemis.core.server] AMQ221025: Replication: sending NIOSequentialFile /opt/artemis/broker-master/data/bindings/activemq-bindings-1.bindings (size=1,048,576) to replica.
2022-04-07 10:00:53,073 WARN  [org.apache.activemq.artemis.core.server] AMQ222092: Connection to the backup node failed, removing replication now: ActiveMQRemoteDisconnectException[errorType=REMOTE_DISCONNECT message=null]
        at org.apache.activemq.artemis.core.remoting.server.impl.RemotingServiceImpl.connectionDestroyed(RemotingServiceImpl.java:582) [artemis-server-2.17.0.jar:2.17.0]
        at org.apache.activemq.artemis.core.remoting.impl.netty.NettyAcceptor$Listener.connectionDestroyed(NettyAcceptor.java:942) [artemis-server-2.17.0.jar:2.17.0]
        at org.apache.activemq.artemis.core.remoting.impl.netty.ActiveMQChannelHandler.lambda$channelInactive$0(ActiveMQChannelHandler.java:89) [artemis-core-client-2.17.0.jar:2.17.0]
        at org.apache.activemq.artemis.utils.actors.OrderedExecutor.doTask(OrderedExecutor.java:42) [artemis-commons-2.17.0.jar:2.17.0]
        at org.apache.activemq.artemis.utils.actors.OrderedExecutor.doTask(OrderedExecutor.java:31) [artemis-commons-2.17.0.jar:2.17.0]
        at org.apache.activemq.artemis.utils.actors.ProcessorBase.executePendingTasks(ProcessorBase.java:65) [artemis-commons-2.17.0.jar:2.17.0]
        at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128) [java.base:]
        at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628) [java.base:]
        at org.apache.activemq.artemis.utils.ActiveMQThreadFactory$1.run(ActiveMQThreadFactory.java:118) [artemis-commons-2.17.0.jar:2.17.0]

2022-04-07 10:01:23,032 WARN  [org.apache.activemq.artemis.core.server] AMQ222010: Critical IO Error, shutting down the server. file=NULL, message=AMQ229114: Replication synchronization process timed out after waiting 30,000 milliseconds: ActiveMQReplicationTimeooutException[errorType=REPLICATION_TIMEOUT_ERROR message=AMQ229114: Replication synchronization process timed out after waiting 30,000 milliseconds]
        at org.apache.activemq.artemis.core.replication.ReplicationManager.sendSynchronizationDone(ReplicationManager.java:660) [artemis-server-2.17.0.jar:2.17.0]
        at org.apache.activemq.artemis.core.persistence.impl.journal.JournalStorageManager.startReplication(JournalStorageManager.java:717) [artemis-server-2.17.0.jar:2.17.0]
        at org.apache.activemq.artemis.core.server.impl.SharedNothingLiveActivation$2.run(SharedNothingLiveActivation.java:180) [artemis-server-2.17.0.jar:2.17.0]
        at java.base/java.lang.Thread.run(Thread.java:829) [java.base:]

2022-04-07 10:01:23,035 WARN  [org.apache.activemq.artemis.core.server] AMQ222251: Unable to start replication: ActiveMQReplicationTimeooutException[errorType=REPLICATION_TIMEOUT_ERROR message=AMQ229114: Replication synchronization process timed out after waiting 30,000 milliseconds]
        at org.apache.activemq.artemis.core.replication.ReplicationManager.sendSynchronizationDone(ReplicationManager.java:660) [artemis-server-2.17.0.jar:2.17.0]
        at org.apache.activemq.artemis.core.persistence.impl.journal.JournalStorageManager.startReplication(JournalStorageManager.java:717) [artemis-server-2.17.0.jar:2.17.0]
        at org.apache.activemq.artemis.core.server.impl.SharedNothingLiveActivation$2.run(SharedNothingLiveActivation.java:180) [artemis-server-2.17.0.jar:2.17.0]
        at java.base/java.lang.Thread.run(Thread.java:829) [java.base:]

2022-04-07 10:01:23,045 WARN  [org.apache.activemq.artemis.core.server] AMQ222013: Error when trying to start replication: ActiveMQReplicationTimeooutException[errorType=REPLICATION_TIMEOUT_ERROR message=AMQ229114: Replication synchronization process timed out after waiting 30,000 milliseconds]
        at org.apache.activemq.artemis.core.replication.ReplicationManager.sendSynchronizationDone(ReplicationManager.java:660) [artemis-server-2.17.0.jar:2.17.0]
        at org.apache.activemq.artemis.core.persistence.impl.journal.JournalStorageManager.startReplication(JournalStorageManager.java:717) [artemis-server-2.17.0.jar:2.17.0]
        at org.apache.activemq.artemis.core.server.impl.SharedNothingLiveActivation$2.run(SharedNothingLiveActivation.java:180) [artemis-server-2.17.0.jar:2.17.0]
        at java.base/java.lang.Thread.run(Thread.java:829) [java.base:]

2022-04-07 10:01:23,058 WARN  [org.apache.activemq.artemis.utils.actors.OrderedExecutor] Server locator is closed (maybe it was garbage collected): java.lang.IllegalStateException: Server locator is closed (maybe it was garbage collected)
        at org.apache.activemq.artemis.core.client.impl.ServerLocatorImpl.assertOpen(ServerLocatorImpl.java:1848) [artemis-core-client-2.17.0.jar:2.17.0]
        at org.apache.activemq.artemis.core.client.impl.ServerLocatorImpl.createSessionFactory(ServerLocatorImpl.java:648) [artemis-core-client-2.17.0.jar:2.17.0]
        at org.apache.activemq.artemis.core.client.impl.ServerLocatorImpl.connect(ServerLocatorImpl.java:549) [artemis-core-client-2.17.0.jar:2.17.0]
        at org.apache.activemq.artemis.core.client.impl.ServerLocatorImpl.connect(ServerLocatorImpl.java:528) [artemis-core-client-2.17.0.jar:2.17.0]
        at org.apache.activemq.artemis.core.server.cluster.ClusterController$ConnectRunnable.run(ClusterController.java:433) [artemis-server-2.17.0.jar:2.17.0]
        at org.apache.activemq.artemis.utils.actors.OrderedExecutor.doTask(OrderedExecutor.java:42) [artemis-commons-2.17.0.jar:2.17.0]
        at org.apache.activemq.artemis.utils.actors.OrderedExecutor.doTask(OrderedExecutor.java:31) [artemis-commons-2.17.0.jar:2.17.0]
        at org.apache.activemq.artemis.utils.actors.ProcessorBase.executePendingTasks(ProcessorBase.java:65) [artemis-commons-2.17.0.jar:2.17.0]
        at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128) [java.base:]
        at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628) [java.base:]
        at org.apache.activemq.artemis.utils.ActiveMQThreadFactory$1.run(ActiveMQThreadFactory.java:118) [artemis-commons-2.17.0.jar:2.17.0]

2022-04-07 10:01:23,297 INFO  [io.hawt.web.auth.AuthenticationFilter] Destroying hawtio authentication filter
2022-04-07 10:01:23,303 INFO  [io.hawt.HawtioContextListener] Destroying hawtio services
2022-04-07 10:01:23,325 INFO  [org.apache.activemq.hawtio.plugin.PluginContextListener] Destroyed artemis-plugin plugin
2022-04-07 10:01:23,336 INFO  [org.apache.activemq.hawtio.branding.PluginContextListener] Destroyed activemq-branding plugin
2022-04-07 10:01:23,384 INFO  [org.apache.activemq.artemis.core.server] AMQ221002: Apache ActiveMQ Artemis Message Broker version 2.17.0 [eb08fe05-b640-11ec-b6e2-06e66cf4b718] stopped, uptime 1 minute

Full log from slave

2022-04-07 10:00:50,645 INFO  [org.apache.activemq.artemis.integration.bootstrap] AMQ101000: Starting ActiveMQ Artemis Server
2022-04-07 10:00:50,703 INFO  [org.apache.activemq.artemis.core.server] AMQ221000: backup Message Broker is starting with configuration Broker Configuration (clustered=true,journalDirectory=data/journal,bindingsDirectory=data/bindings,largeMessagesDirectory=data/large-messages,pagingDirectory=data/paging)
2022-04-07 10:00:50,747 INFO  [org.apache.activemq.artemis.core.server] AMQ221055: There were too many old replicated folders upon startup, removing /opt/artemis/broker-slave/data/bindings/oldreplica.7
2022-04-07 10:00:50,750 INFO  [org.apache.activemq.artemis.core.server] AMQ222162: Moving data directory /opt/artemis/broker-slave/data/bindings to /opt/artemis/broker-slave/data/bindings/oldreplica.9
2022-04-07 10:00:50,753 INFO  [org.apache.activemq.artemis.core.server] AMQ221055: There were too many old replicated folders upon startup, removing /opt/artemis/broker-slave/data/journal/oldreplica.8
2022-04-07 10:00:50,755 INFO  [org.apache.activemq.artemis.core.server] AMQ222162: Moving data directory /opt/artemis/broker-slave/data/journal to /opt/artemis/broker-slave/data/journal/oldreplica.10
2022-04-07 10:00:50,888 INFO  [org.apache.activemq.artemis.core.server] AMQ221012: Using AIO Journal
2022-04-07 10:00:51,031 WARN  [org.apache.activemq.artemis.core.server] AMQ222007: Security risk! Apache ActiveMQ Artemis is running with the default cluster admin user and default password. Please see the cluster chapter in the ActiveMQ Artemis User Guide for instructions on how to change this.
2022-04-07 10:00:51,124 INFO  [org.apache.activemq.artemis.core.server] AMQ221057: Global Max Size is being adjusted to 1/2 of the JVM max size (-Xmx). being defined as 1,073,741,824
2022-04-07 10:00:51,323 INFO  [org.apache.activemq.artemis.core.server] AMQ221043: Protocol module found: [artemis-server]. Adding protocol support for: CORE
2022-04-07 10:00:51,326 INFO  [org.apache.activemq.artemis.core.server] AMQ221043: Protocol module found: [artemis-amqp-protocol]. Adding protocol support for: AMQP
2022-04-07 10:00:51,328 INFO  [org.apache.activemq.artemis.core.server] AMQ221043: Protocol module found: [artemis-hornetq-protocol]. Adding protocol support for: HORNETQ
2022-04-07 10:00:51,329 INFO  [org.apache.activemq.artemis.core.server] AMQ221043: Protocol module found: [artemis-mqtt-protocol]. Adding protocol support for: MQTT
2022-04-07 10:00:51,330 INFO  [org.apache.activemq.artemis.core.server] AMQ221043: Protocol module found: [artemis-openwire-protocol]. Adding protocol support for: OPENWIRE
2022-04-07 10:00:51,330 INFO  [org.apache.activemq.artemis.core.server] AMQ221043: Protocol module found: [artemis-stomp-protocol]. Adding protocol support for: STOMP
2022-04-07 10:00:51,538 INFO  [org.apache.activemq.hawtio.branding.PluginContextListener] Initialized activemq-branding plugin
2022-04-07 10:00:51,685 INFO  [org.apache.activemq.hawtio.plugin.PluginContextListener] Initialized artemis-plugin plugin
2022-04-07 10:00:52,052 INFO  [org.apache.activemq.artemis.core.server] AMQ221109: Apache ActiveMQ Artemis Backup Server version 2.18.0 [null] started, waiting live to fail before it gets active
2022-04-07 10:00:52,224 INFO  [io.hawt.HawtioContextListener] Initialising hawtio services
2022-04-07 10:00:52,337 INFO  [io.hawt.system.ConfigManager] Configuration will be discovered via system properties
2022-04-07 10:00:52,341 INFO  [io.hawt.jmx.JmxTreeWatcher] Welcome to Hawtio 2.13.5
2022-04-07 10:00:52,351 INFO  [io.hawt.web.auth.AuthenticationConfiguration] Starting hawtio authentication filter, JAAS realm: "activemq" authorized role(s): "amq" role principal classes: "org.apache.activemq.artemis.spi.core.security.jaas.RolePrincipal"
2022-04-07 10:00:52,383 INFO  [io.hawt.web.proxy.ProxyServlet] Proxy servlet is disabled
2022-04-07 10:00:52,392 INFO  [io.hawt.web.servlets.JolokiaConfiguredAgentServlet] Jolokia overridden property: [key=policyLocation, value=file:/opt/artemis/broker-slave/etc/jolokia-access.xml]
2022-04-07 10:00:52,570 INFO  [org.apache.activemq.artemis] AMQ241001: HTTP Server started at http://localhost:8161
2022-04-07 10:00:52,571 INFO  [org.apache.activemq.artemis] AMQ241002: Artemis Jolokia REST API available at http://localhost:8161/console/jolokia
2022-04-07 10:00:52,571 INFO  [org.apache.activemq.artemis] AMQ241004: Artemis Console available at http://localhost:8161/console
2022-04-07 10:00:53,029 ERROR [org.apache.activemq.artemis.core.client] AMQ214013: Failed to decode packet: java.lang.IndexOutOfBoundsException: readerIndex(57) + length(1) exceeds writerIndex(57): PooledUnsafeDirectByteBuf(ridx: 57, widx: 57, cap: 57)
        at io.netty.buffer.AbstractByteBuf.checkReadableBytes0(AbstractByteBuf.java:1442) [netty-all-4.1.66.Final.jar:4.1.66.Final]
        at io.netty.buffer.AbstractByteBuf.readByte(AbstractByteBuf.java:730) [netty-all-4.1.66.Final.jar:4.1.66.Final]
        at io.netty.buffer.WrappedByteBuf.readByte(WrappedByteBuf.java:529) [netty-all-4.1.66.Final.jar:4.1.66.Final]
        at org.apache.activemq.artemis.core.buffers.impl.ChannelBufferWrapper.readByte(ChannelBufferWrapper.java:300) [artemis-commons-2.18.0.jar:2.18.0]
        at org.apache.activemq.artemis.core.protocol.core.impl.wireformat.ReplicationStartSyncMessage.decodeRest(ReplicationStartSyncMessage.java:158) [artemis-server-2.18.0.jar:2.18.0]
        at org.apache.activemq.artemis.core.protocol.core.impl.PacketImpl.decode(PacketImpl.java:371) [artemis-core-client-2.18.0.jar:2.18.0]
        at org.apache.activemq.artemis.core.protocol.ServerPacketDecoder.slowPathDecode(ServerPacketDecoder.java:277) [artemis-server-2.18.0.jar:2.18.0]
        at org.apache.activemq.artemis.core.protocol.ServerPacketDecoder.decode(ServerPacketDecoder.java:149) [artemis-server-2.18.0.jar:2.18.0]
        at org.apache.activemq.artemis.core.protocol.core.impl.RemotingConnectionImpl.bufferReceived(RemotingConnectionImpl.java:388) [artemis-core-client-2.18.0.jar:2.18.0]
        at org.apache.activemq.artemis.core.client.impl.ClientSessionFactoryImpl$DelegatingBufferHandler.bufferReceived(ClientSessionFactoryImpl.java:1263) [artemis-core-client-2.18.0.jar:2.18.0]
        at org.apache.activemq.artemis.core.remoting.impl.netty.ActiveMQChannelHandler.channelRead(ActiveMQChannelHandler.java:73) [artemis-core-client-2.18.0.jar:2.18.0]
        at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:379) [netty-all-4.1.66.Final.jar:4.1.66.Final]
        at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:365) [netty-all-4.1.66.Final.jar:4.1.66.Final]
        at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:357) [netty-all-4.1.66.Final.jar:4.1.66.Final]
        at io.netty.handler.codec.ByteToMessageDecoder.fireChannelRead(ByteToMessageDecoder.java:324) [netty-all-4.1.66.Final.jar:4.1.66.Final]
        at io.netty.handler.codec.ByteToMessageDecoder.channelRead(ByteToMessageDecoder.java:296) [netty-all-4.1.66.Final.jar:4.1.66.Final]
        at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:379) [netty-all-4.1.66.Final.jar:4.1.66.Final]
        at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:365) [netty-all-4.1.66.Final.jar:4.1.66.Final]
        at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:357) [netty-all-4.1.66.Final.jar:4.1.66.Final]
        at io.netty.channel.DefaultChannelPipeline$HeadContext.channelRead(DefaultChannelPipeline.java:1410) [netty-all-4.1.66.Final.jar:4.1.66.Final]
        at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:379) [netty-all-4.1.66.Final.jar:4.1.66.Final]
        at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:365) [netty-all-4.1.66.Final.jar:4.1.66.Final]
        at io.netty.channel.DefaultChannelPipeline.fireChannelRead(DefaultChannelPipeline.java:919) [netty-all-4.1.66.Final.jar:4.1.66.Final]
        at io.netty.channel.epoll.AbstractEpollStreamChannel$EpollStreamUnsafe.epollInReady(AbstractEpollStreamChannel.java:795) [netty-all-4.1.66.Final.jar:4.1.66.Final]
        at io.netty.channel.epoll.EpollEventLoop.processReady(EpollEventLoop.java:480) [netty-all-4.1.66.Final.jar:4.1.66.Final]
        at io.netty.channel.epoll.EpollEventLoop.run(EpollEventLoop.java:378) [netty-all-4.1.66.Final.jar:4.1.66.Final]
        at io.netty.util.concurrent.SingleThreadEventExecutor$4.run(SingleThreadEventExecutor.java:986) [netty-all-4.1.66.Final.jar:4.1.66.Final]
        at io.netty.util.internal.ThreadExecutorMap$2.run(ThreadExecutorMap.java:74) [netty-all-4.1.66.Final.jar:4.1.66.Final]
        at org.apache.activemq.artemis.utils.ActiveMQThreadFactory$1.run(ActiveMQThreadFactory.java:118) [artemis-commons-2.18.0.jar:2.18.0]

2022-04-07 10:00:53,031 ERROR [org.apache.activemq.artemis.core.client] AMQ214031: Failed to decode buffer, disconnect immediately.: java.lang.IllegalStateException: java.lang.IndexOutOfBoundsException: readerIndex(57) + length(1) exceeds writerIndex(57): PooledUnsafeDirectByteBuf(ridx: 57, widx: 57, cap: 57)
        at org.apache.activemq.artemis.core.protocol.core.impl.RemotingConnectionImpl.bufferReceived(RemotingConnectionImpl.java:401) [artemis-core-client-2.18.0.jar:2.18.0]
        at org.apache.activemq.artemis.core.client.impl.ClientSessionFactoryImpl$DelegatingBufferHandler.bufferReceived(ClientSessionFactoryImpl.java:1263) [artemis-core-client-2.18.0.jar:2.18.0]
        at org.apache.activemq.artemis.core.remoting.impl.netty.ActiveMQChannelHandler.channelRead(ActiveMQChannelHandler.java:73) [artemis-core-client-2.18.0.jar:2.18.0]
        at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:379) [netty-all-4.1.66.Final.jar:4.1.66.Final]
        at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:365) [netty-all-4.1.66.Final.jar:4.1.66.Final]
        at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:357) [netty-all-4.1.66.Final.jar:4.1.66.Final]
        at io.netty.handler.codec.ByteToMessageDecoder.fireChannelRead(ByteToMessageDecoder.java:324) [netty-all-4.1.66.Final.jar:4.1.66.Final]
        at io.netty.handler.codec.ByteToMessageDecoder.channelRead(ByteToMessageDecoder.java:296) [netty-all-4.1.66.Final.jar:4.1.66.Final]
        at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:379) [netty-all-4.1.66.Final.jar:4.1.66.Final]
        at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:365) [netty-all-4.1.66.Final.jar:4.1.66.Final]
        at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:357) [netty-all-4.1.66.Final.jar:4.1.66.Final]
        at io.netty.channel.DefaultChannelPipeline$HeadContext.channelRead(DefaultChannelPipeline.java:1410) [netty-all-4.1.66.Final.jar:4.1.66.Final]
        at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:379) [netty-all-4.1.66.Final.jar:4.1.66.Final]
        at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:365) [netty-all-4.1.66.Final.jar:4.1.66.Final]
        at io.netty.channel.DefaultChannelPipeline.fireChannelRead(DefaultChannelPipeline.java:919) [netty-all-4.1.66.Final.jar:4.1.66.Final]
        at io.netty.channel.epoll.AbstractEpollStreamChannel$EpollStreamUnsafe.epollInReady(AbstractEpollStreamChannel.java:795) [netty-all-4.1.66.Final.jar:4.1.66.Final]
        at io.netty.channel.epoll.EpollEventLoop.processReady(EpollEventLoop.java:480) [netty-all-4.1.66.Final.jar:4.1.66.Final]
        at io.netty.channel.epoll.EpollEventLoop.run(EpollEventLoop.java:378) [netty-all-4.1.66.Final.jar:4.1.66.Final]
        at io.netty.util.concurrent.SingleThreadEventExecutor$4.run(SingleThreadEventExecutor.java:986) [netty-all-4.1.66.Final.jar:4.1.66.Final]
        at io.netty.util.internal.ThreadExecutorMap$2.run(ThreadExecutorMap.java:74) [netty-all-4.1.66.Final.jar:4.1.66.Final]
        at org.apache.activemq.artemis.utils.ActiveMQThreadFactory$1.run(ActiveMQThreadFactory.java:118) [artemis-commons-2.18.0.jar:2.18.0]
Caused by: java.lang.IndexOutOfBoundsException: readerIndex(57) + length(1) exceeds writerIndex(57): PooledUnsafeDirectByteBuf(ridx: 57, widx: 57, cap: 57)
        at io.netty.buffer.AbstractByteBuf.checkReadableBytes0(AbstractByteBuf.java:1442) [netty-all-4.1.66.Final.jar:4.1.66.Final]
        at io.netty.buffer.AbstractByteBuf.readByte(AbstractByteBuf.java:730) [netty-all-4.1.66.Final.jar:4.1.66.Final]
        at io.netty.buffer.WrappedByteBuf.readByte(WrappedByteBuf.java:529) [netty-all-4.1.66.Final.jar:4.1.66.Final]
        at org.apache.activemq.artemis.core.buffers.impl.ChannelBufferWrapper.readByte(ChannelBufferWrapper.java:300) [artemis-commons-2.18.0.jar:2.18.0]
        at org.apache.activemq.artemis.core.protocol.core.impl.wireformat.ReplicationStartSyncMessage.decodeRest(ReplicationStartSyncMessage.java:158) [artemis-server-2.18.0.jar:2.18.0]
        at org.apache.activemq.artemis.core.protocol.core.impl.PacketImpl.decode(PacketImpl.java:371) [artemis-core-client-2.18.0.jar:2.18.0]
        at org.apache.activemq.artemis.core.protocol.ServerPacketDecoder.slowPathDecode(ServerPacketDecoder.java:277) [artemis-server-2.18.0.jar:2.18.0]
        at org.apache.activemq.artemis.core.protocol.ServerPacketDecoder.decode(ServerPacketDecoder.java:149) [artemis-server-2.18.0.jar:2.18.0]
        at org.apache.activemq.artemis.core.protocol.core.impl.RemotingConnectionImpl.bufferReceived(RemotingConnectionImpl.java:388) [artemis-core-client-2.18.0.jar:2.18.0]
        ... 20 more

2022-04-07 10:00:53,052 WARN  [org.apache.activemq.artemis.core.client] AMQ212037: Connection failure to 10.35.4.16/10.35.4.16:61616 has been detected: java.lang.IndexOutOfBoundsException: readerIndex(57) + length(1) exceeds writerIndex(57): PooledUnsafeDirectByteBuf(ridx: 57, widx: 57, cap: 57) [code=GENERIC_EXCEPTION]
2022-04-07 10:00:53,092 INFO  [org.apache.activemq.artemis.core.server] AMQ221066: Initiating quorum vote: LiveFailoverQuorumVote
2022-04-07 10:00:53,093 INFO  [org.apache.activemq.artemis.core.server] AMQ221084: Requested 0 quorum votes
2022-04-07 10:00:53,093 INFO  [org.apache.activemq.artemis.core.server] AMQ221083: ignoring quorum vote as max cluster size is 1.
2022-04-07 10:00:53,094 INFO  [org.apache.activemq.artemis.core.server] AMQ221071: Failing over based on quorum vote results.
2022-04-07 10:00:53,113 ERROR [org.apache.activemq.artemis.core.server] AMQ224000: Failure in initialisation: ActiveMQIllegalStateException[errorType=ILLEGAL_STATE message=AMQ229026: Backup Server was not yet in sync with live]
        at org.apache.activemq.artemis.core.server.impl.SharedNothingBackupActivation.run(SharedNothingBackupActivation.java:302) [artemis-server-2.18.0.jar:2.18.0]
        at org.apache.activemq.artemis.core.server.impl.ActiveMQServerImpl$ActivationThread.run(ActiveMQServerImpl.java:4271) [artemis-server-2.18.0.jar:2.18.0]

2022-04-07 10:01:23,105 WARN  [org.apache.activemq.artemis.core.client] AMQ212037: Connection failure to 10.35.4.16/10.35.4.16:61616 has been detected: AMQ219015: The connection was disconnected because of server shutdown [code=DISCONNECTED]
2022-04-07 10:01:23,107 WARN  [org.apache.activemq.artemis.core.client] AMQ212037: Connection failure to 10.35.4.16/10.35.4.16:61616 has been detected: AMQ219015: The connection was disconnected because of server shutdown [code=DISCONNECTED]
2022-04-07 10:01:42,243 WARN  [org.apache.activemq.artemis.core.client] AMQ212004: Failed to connect to server.
2022-04-07 10:01:42,274 INFO  [io.hawt.web.auth.AuthenticationFilter] Destroying hawtio authentication filter
2022-04-07 10:01:42,278 INFO  [io.hawt.HawtioContextListener] Destroying hawtio services
2022-04-07 10:01:42,296 INFO  [org.apache.activemq.hawtio.plugin.PluginContextListener] Destroyed artemis-plugin plugin
2022-04-07 10:01:42,304 INFO  [org.apache.activemq.hawtio.branding.PluginContextListener] Destroyed activemq-branding plugin

Solution

  • This bug was introduced via ARTEMIS-3340. It should be fixed in the next release (i.e. 2.22.0). Until then there's no work-around unfortunately.

    You may consider creating a completely separate new live/backup pair and then using a DNS update to direct clients from the old pair to the new pair.