ActiveMQ Artemis primary pod enters a restart loop after switching to the shared-store HA option

Context

I'm trying to set up a working pair (a single primary plus a backup) of Artemis 2.37.0 brokers on an AKS (Kubernetes) cluster with persistent volumes (we use an Azure Storage account). We use KUBE_PING for address discovery.

We have been using the replication feature for several months, but the split-brain problem occurs too often, so I want to switch to shared-store.

The current (not working) scenario

After switching to the shared-store solution, I see the following cycle:

The primary doesn't work, and the backup switches to primary mode.
After ~30 seconds the primary pod restarts, and the backup goes back to backup mode.
The primary fails to start, and the backup switches to primary mode again.
The process repeats.

Expected behaviour

Primary works as primary without restarting. Backup works as backup.

Debugging

I searched the primary's logs (I can't paste all 5.5k lines here) and found these entries just before the pod restarts:

2024-09-24 08:03:05,344 DEBUG [org.apache.activemq.artemis.core.server.impl.FileLockNodeManager] Lock appears to be valid; double check by reading status
2024-09-24 08:03:05,344 DEBUG [org.apache.activemq.artemis.core.server.impl.FileLockNodeManager] getting state...
2024-09-24 08:03:05,344 DEBUG [org.apache.activemq.artemis.core.server.impl.FileLockNodeManager] trying to lock position: 0
2024-09-24 08:03:05,350 DEBUG [org.apache.activemq.artemis.core.server.impl.FileLockNodeManager] locked position: 0
2024-09-24 08:03:05,350 DEBUG [org.apache.activemq.artemis.core.server.impl.FileLockNodeManager] lock: sun.nio.ch.FileLockImpl[0:9223372036854775807 exclusive valid]
2024-09-24 08:03:05,355 DEBUG [org.apache.activemq.artemis.core.server.impl.FileLockNodeManager] state: L
2024-09-24 08:03:05,355 DEBUG [org.apache.activemq.artemis.core.server.impl.FileLockNodeManager] Lock appears to be valid; triple check by comparing timestamp
2024-09-24 08:03:05,357 DEBUG [org.apache.activemq.artemis.core.server.impl.FileLockNodeManager] Lock file /var/lib/artemis-instance/data/journal/server.lock originally locked at 2024-09-24T08:02:33.067+0000 was modified at 2024-09-24T08:02:35.181+0000
2024-09-24 08:03:05,358 WARN  [org.apache.activemq.artemis.core.server.impl.FileLockNodeManager] Lost the lock according to the monitor, notifying listeners
2024-09-24 08:03:05,358 ERROR [org.apache.activemq.artemis.core.server] AMQ222010: Critical IO Error, shutting down the server. file=Lost NodeManager lock, message=NULL
java.io.IOException: lost lock

In the meantime there are errors related to the Netty connection, which look more like warnings that the Artemis instance hasn't started yet. artemis.artemis.svc.cluster.local is the primary pod's address (if I understand correctly, Netty on the primary is asking itself whether it's working).

2024-09-24 08:03:02,454 ERROR [org.apache.activemq.artemis.core.client] AMQ214016: Failed to create netty connection
java.net.UnknownHostException: artemis.artemis.svc.cluster.local

Questions

What did I do wrong? Am I missing an important parameter? Maybe there is a timeout I should increase that I missed in the documentation?

With replication, the same configuration works (the primary starts without a restart loop).

Configuration files

  artemis-roles.properties: |
    amq = admin
    admin = admin,guest
  artemis-users.properties: |
    admin = admin
    guest = guest
  artemis.profile: |
    ARTEMIS_HOME='/opt/artemis'
    ARTEMIS_INSTANCE='/var/lib/artemis-instance'
    ARTEMIS_DATA_DIR='/var/lib/artemis-instance/data'
    ARTEMIS_ETC_DIR='/var/lib/artemis-instance/etc'
    ARTEMIS_OOME_DUMP='/var/lib/artemis-instance/log/oom_dump.hprof'
    ARTEMIS_INSTANCE_URI='file:/var/lib/artemis-instance/./'
    ARTEMIS_INSTANCE_ETC_URI='file:/var/lib/artemis-instance/./etc/'
    HAWTIO_ROLE='amq'
    if [ -z "$JAVA_ARGS" ]; then
        JAVA_ARGS="-XX:AutoBoxCacheMax=20000 -XX:+PrintClassHistogram -XX:+UseG1GC -XX:+UseStringDeduplication -Xms512M -Xmx2G -Dhawtio.disableProxy=true -Dhawtio.realm=activemq -Dhawtio.offline=true -Dhawtio.rolePrincipalClasses=org.apache.activemq.artemis.spi.core.security.jaas.RolePrincipal -Dhawtio.http.strictTransportSecurity=max-age=31536000;includeSubDomains;preload -Djolokia.policyLocation=${ARTEMIS_INSTANCE_ETC_URI}jolokia-access.xml -Dlog4j2.disableJmx=true "
    fi
    JAVA_ARGS="$JAVA_ARGS -Djava.net.preferIPv4Stack=true -Dipv4addr=$(hostname -i)"
    if [ "$1" = "run" ]; then :
    fi;
  broker.xml: |
    <configuration xmlns="urn:activemq"
               xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
               xmlns:xi="http://www.w3.org/2001/XInclude"
               xsi:schemaLocation="urn:activemq /schema/artemis-configuration.xsd">
        <core xmlns="urn:activemq:core" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
                xsi:schemaLocation="urn:activemq:core ">
            <name>{{ include "artemis.fullname" . }}.{{ .Release.Namespace }}.svc.cluster.local</name>
            <persistence-enabled>true</persistence-enabled>
            <max-redelivery-records>1</max-redelivery-records>
            <paging-directory>/var/lib/artemis-instance/data/paging</paging-directory>
            <bindings-directory>/var/lib/artemis-instance/data/bindings</bindings-directory>
            <large-messages-directory>/var/lib/artemis-instance/data/large-messages</large-messages-directory>
            <id-cache-size xmlns="urn:activemq:core">20000</id-cache-size>
            <disk-scan-period>5000</disk-scan-period>
            <max-disk-usage>90</max-disk-usage>
            <critical-analyzer>true</critical-analyzer>
            <critical-analyzer-timeout>180000</critical-analyzer-timeout>
            <critical-analyzer-check-period>60000</critical-analyzer-check-period>
            <critical-analyzer-policy>SHUTDOWN</critical-analyzer-policy>
            <page-sync-timeout>512000</page-sync-timeout>
            <global-max-messages>-1</global-max-messages>
            <journal-type>ASYNCIO</journal-type>
            <journal-directory>/var/lib/artemis-instance/data/journal</journal-directory>
            <journal-datasync>true</journal-datasync>
            <journal-min-files>2</journal-min-files>
            <journal-pool-files>10</journal-pool-files>
            <journal-device-block-size>4096</journal-device-block-size>
            <journal-file-size>10M</journal-file-size>
            <journal-buffer-timeout>144000</journal-buffer-timeout>
            <journal-max-io>4096</journal-max-io>
            <xi:include href="/var/lib/artemis-instance/etc/acceptor.xml"/>
            <xi:include href="/var/lib/artemis-instance/etc/security-setting.xml"/>
            <xi:include href="/var/lib/artemis-instance/etc/cluster-connection.xml"/>
            <xi:include href="/var/lib/artemis-instance/etc/broadcast.xml"/>
            <xi:include href="/var/lib/artemis-instance/etc/address.xml"/>
            <xi:include href="/var/lib/artemis-instance/etc/address-setting.xml"/>
            <xi:include href="/var/lib/artemis-instance/etc/discovery.xml"/>
            <xi:include href="/var/lib/artemis-instance/etc/ha.xml"/>
            <xi:include href="/var/lib/artemis-instance/etc/connector.xml"/>
        </core>
    </configuration>
  acceptor.xml: |
    <acceptors xmlns="urn:activemq:core">
        <acceptor name="artemis">tcp://0.0.0.0:{{ .Values.conf.protocols.netty.port }}?tcpSendBufferSize=1048576;tcpReceiveBufferSize=1048576;amqpMinLargeMessageSize=102400;protocols=CORE,AMQP,STOMP,HORNETQ,MQTT,OPENWIRE;useEpoll=true;amqpCredits=1000;amqpLowCredits=300;amqpDuplicateDetection=true;supportAdvisory=false;suppressInternalManagementObjects=false</acceptor>
        {{ if .Values.conf.protocols.amqp.enabled }}
        <acceptor name="amqp">tcp://0.0.0.0:5672?tcpSendBufferSize=1048576;tcpReceiveBufferSize=1048576;protocols=AMQP;useEpoll=true;amqpCredits=1000;amqpLowCredits=300;amqpMinLargeMessageSize=102400;amqpDuplicateDetection=true</acceptor>
        {{ end }}
        {{ if .Values.conf.protocols.stomp.enabled }}
        <acceptor name="stomp">tcp://0.0.0.0:{{ .Values.conf.protocols.stomp.port }}?tcpSendBufferSize=1048576;tcpReceiveBufferSize=1048576;protocols=STOMP;useEpoll=true</acceptor>
        {{ end }}
        {{ if .Values.conf.protocols.hornetq.enabled }}
        <acceptor name="hornetq">tcp://0.0.0.0:5445?anycastPrefix=jms.queue.;multicastPrefix=jms.topic.;protocols=HORNETQ,STOMP;useEpoll=true</acceptor>
        {{ end }}
        {{ if .Values.conf.protocols.mqtt.enabled }}
        <acceptor name="mqtt">tcp://0.0.0.0:1883?tcpSendBufferSize=1048576;tcpReceiveBufferSize=1048576;protocols=MQTT;useEpoll=true</acceptor>
        {{ end }}
        {{ if .Values.conf.protocols.ws.enabled }}
        <acceptor name="stomp-ws-acceptor">tcp://0.0.0.0:61614?protocols=STOMP_WS</acceptor>
        {{ end }}
    </acceptors>
  ha.xml: |
    <ha-policy xmlns="urn:activemq:core">
    <!-- use <replication> instead when replication is enabled -->
        <shared-store>
            {{ .Values.conf.broker.ha  | indent 20 }}
        </shared-store>
    </ha-policy>
  cluster-connection.xml: |
    <cluster-connections xmlns="urn:activemq:core">
        <cluster-connection name="artemis">
            <address>jms</address>
            <connector-ref>{{ include "artemis.fullname" . }}</connector-ref>
            <check-period>1000</check-period>
            <connection-ttl>5000</connection-ttl>
            <min-large-message-size>50000</min-large-message-size>
            <call-timeout>120000</call-timeout>
            <retry-interval>500</retry-interval>
            <retry-interval-multiplier>1.0</retry-interval-multiplier>
            <max-retry-interval>5000</max-retry-interval>
            <initial-connect-attempts>-1</initial-connect-attempts>
            <reconnect-attempts>-1</reconnect-attempts>
            <use-duplicate-detection>true</use-duplicate-detection>
            <forward-when-no-consumers>false</forward-when-no-consumers>
            <max-hops>1</max-hops>
            <confirmation-window-size>10000000</confirmation-window-size>
            <call-failover-timeout>30000</call-failover-timeout>
            <notification-interval>1000</notification-interval>
            <notification-attempts>2</notification-attempts>
            <discovery-group-ref discovery-group-name="jgroups-discovery" />
        </cluster-connection>
    </cluster-connections>
  address.xml: |
    <addresses xmlns="urn:activemq:core">
        <address name="DLQ">
            <anycast>
            <queue name="DLQ" />
            </anycast>
        </address>
        <address name="ExpiryQueue">
            <anycast>
            <queue name="ExpiryQueue" />
            </anycast>
        </address>
    </addresses>
  address-setting.xml: |
    <address-settings xmlns="urn:activemq:core">
        <address-setting match="activemq.management#">
            <dead-letter-address>DLQ</dead-letter-address>
            <expiry-address>ExpiryQueue</expiry-address>
            <redelivery-delay>0</redelivery-delay>
            <max-size-bytes>-1</max-size-bytes>
            <message-counter-history-day-limit>10</message-counter-history-day-limit>
            <address-full-policy>PAGE</address-full-policy>
            <auto-create-queues>true</auto-create-queues>
            <auto-create-addresses>true</auto-create-addresses>
        </address-setting>
        <address-setting match="#">
            <dead-letter-address>DLQ</dead-letter-address>
            <expiry-address>ExpiryQueue</expiry-address>
            <redelivery-delay>0</redelivery-delay>
            <message-counter-history-day-limit>10</message-counter-history-day-limit>
            <address-full-policy>PAGE</address-full-policy>
            <auto-create-queues>true</auto-create-queues>
            <auto-create-addresses>true</auto-create-addresses>
            <auto-delete-queues>false</auto-delete-queues>
            <auto-delete-addresses>false</auto-delete-addresses>
            <page-size-bytes>10M</page-size-bytes>
            <max-size-bytes>-1</max-size-bytes>
            <max-size-messages>-1</max-size-messages>
            <max-read-page-messages>-1</max-read-page-messages>
            <max-read-page-bytes>20M</max-read-page-bytes>
            <page-limit-bytes>-1</page-limit-bytes>
            <page-limit-messages>-1</page-limit-messages>
        </address-setting>
    </address-settings>
  broadcast.xml: |
    <broadcast-groups xmlns="urn:activemq:core">
      <broadcast-group name="jgroups-broadcast">
        <jgroups-file>jgroups-discovery.xml</jgroups-file>
        <jgroups-channel>activemq_broadcast_channel</jgroups-channel>
        <connector-ref>{{ include "artemis.fullname" . }}</connector-ref>
      </broadcast-group>
    </broadcast-groups>
  discovery.xml: |
    <discovery-groups xmlns="urn:activemq:core" >
      <discovery-group name="jgroups-discovery">
          <jgroups-file>jgroups-discovery.xml</jgroups-file>
          <jgroups-channel>activemq_broadcast_channel</jgroups-channel>
          <refresh-timeout>30000</refresh-timeout>
      </discovery-group>
    </discovery-groups>
  jgroups-discovery.xml: |
    <config xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
        xmlns="urn:org:jgroups"
        xsi:schemaLocation="urn:org:jgroups http://www.jgroups.org/schema/jgroups.xsd">
        <TCP
            external_addr="match-interface:eth0"
            bind_addr="match-interface:eth0"
            bind_port="7800"
            thread_pool.min_threads="1"
        />
        <org.jgroups.protocols.kubernetes.KUBE_PING
            primaryProtocol="https"
            namespace="{{ .Release.Namespace }}"
            labels="rl-type={{ .Values.conf.kubePing.name }}"
        />
        <MERGE3 max_interval="30000" min_interval="10000"/>
        <FD_SOCK start_port="9000"/>
        <FD_ALL timeout="30000" interval="5000"/>
        <VERIFY_SUSPECT timeout="1500"/>
        <BARRIER />
        <pbcast.NAKACK2
            xmit_interval="500"
            xmit_table_num_rows="100"
            xmit_table_msgs_per_row="2000"
            xmit_table_max_compaction_time="30000"
            use_mcast_xmit="false"
            discard_delivered_msgs="true" />
        <UNICAST3
            xmit_table_num_rows="100"
            xmit_table_msgs_per_row="1000"
            xmit_table_max_compaction_time="30000"/>
        <pbcast.GMS print_local_addr="true" join_timeout="3000"/>
        <MFC max_credits="2M" min_threshold="0.4"/>
        <FRAG2 frag_size="60K"/>
        <pbcast.STATE_TRANSFER/>
        <COUNTER/>
    </config>
  log4j2.properties: |
    monitorInterval = 5
    rootLogger = {{ .Values.conf.log_level }}, console, log_file
    logger.activemq.name=org.apache.activemq
    logger.activemq.level={{ .Values.conf.log_level }}
    logger.artemis_server.name=org.apache.activemq.artemis.core.server
    logger.artemis_server.level={{ .Values.conf.log_level }}
    logger.artemis_journal.name=org.apache.activemq.artemis.journal
    logger.artemis_journal.level={{ .Values.conf.log_level }}
    logger.artemis_utils.name=org.apache.activemq.artemis.utils
    logger.artemis_utils.level={{ .Values.conf.log_level }}
    logger.critical_analyzer.name=org.apache.activemq.artemis.utils.critical
    logger.critical_analyzer.level={{ .Values.conf.log_level }}
    logger.audit_base = OFF, audit_log_file
    logger.audit_base.name = org.apache.activemq.audit.base
    logger.audit_base.additivity = false
    logger.audit_resource = OFF, audit_log_file
    logger.audit_resource.name = org.apache.activemq.audit.resource
    logger.audit_resource.additivity = false
    logger.audit_message = OFF, audit_log_file
    logger.audit_message.name = org.apache.activemq.audit.message
    logger.audit_message.additivity = false
    logger.jetty.name=org.eclipse.jetty
    logger.jetty.level=WARN
    logger.authentication_filter.name=io.hawt.web.auth.AuthenticationFilter
    logger.authentication_filter.level=ERROR
    logger.curator.name=org.apache.curator
    logger.curator.level=WARN
    logger.zookeeper.name=org.apache.zookeeper
    logger.zookeeper.level=ERROR
    appender.console.type=Console
    appender.console.name=console
    appender.console.layout.type=PatternLayout
    appender.console.layout.pattern=%d %-5level [%logger] %msg%n
    appender.log_file.type = RollingFile
    appender.log_file.name = log_file
    appender.log_file.fileName = ${sys:artemis.instance}/log/artemis.log
    appender.log_file.filePattern = ${sys:artemis.instance}/log/artemis.log.%d{yyyy-MM-dd}
    appender.log_file.layout.type = PatternLayout
    appender.log_file.layout.pattern = %d %-5level [%logger] %msg%n
    appender.log_file.policies.type = Policies
    appender.log_file.policies.cron.type = CronTriggeringPolicy
    appender.log_file.policies.cron.schedule = 0 0 0 * * ?
    appender.log_file.policies.cron.evaluateOnStartup = true
    appender.audit_log_file.type = RollingFile
    appender.audit_log_file.name = audit_log_file
    appender.audit_log_file.fileName = ${sys:artemis.instance}/log/audit.log
    appender.audit_log_file.filePattern = ${sys:artemis.instance}/log/audit.log.%d{yyyy-MM-dd}
    appender.audit_log_file.layout.type = PatternLayout
    appender.audit_log_file.layout.pattern = %d [AUDIT](%t) %msg%n
    appender.audit_log_file.policies.type = Policies
    appender.audit_log_file.policies.cron.type = CronTriggeringPolicy
    appender.audit_log_file.policies.cron.schedule = 0 0 0 * * ?
    appender.audit_log_file.policies.cron.evaluateOnStartup = true
  management.xml: |
    <management-context xmlns="http://activemq.apache.org/schema">
    <authorisation>
        <allowlist>
            <entry domain="hawtio"/>
        </allowlist>
        <default-access>
            <access method="list*" roles="amq"/>
            <access method="get*" roles="amq"/>
            <access method="is*" roles="amq"/>
            <access method="set*" roles="amq"/>
            <access method="browse*" roles="amq"/>
            <access method="count*" roles="amq"/>
            <access method="*" roles="amq"/>
        </default-access>
        <role-access>
            <match domain="org.apache.activemq.artemis">
                <access method="list*" roles="amq"/>
                <access method="get*" roles="amq"/>
                <access method="is*" roles="amq"/>
                <access method="set*" roles="amq"/>
                <access method="browse*" roles="amq"/>
                <access method="count*" roles="amq"/>
                <access method="*" roles="amq"/>
            </match>
        </role-access>
    </authorisation>
    </management-context>
  bootstrap.xml: |
    {{ if .Values.conf.protocols.http.enabled }}
    <broker xmlns="http://activemq.apache.org/schema">
        <jaas-security domain="activemq"/>
        <server configuration="file:/var/lib/artemis-instance/etc/broker.xml"/>
        <web path="web" rootRedirectLocation="console">
            <binding name="artemis" uri="http://0.0.0.0:{{ .Values.conf.protocols.http.port }}">
                <app name="branding" url="activemq-branding" war="activemq-branding.war"/>
                <app name="plugin" url="artemis-plugin" war="artemis-plugin.war"/>
                <app name="console" url="console" war="console.war"/>
            </binding>
        </web>
    </broker>
    {{ end }}
  jolokia-access.xml: |
    <restrict>
        <cors>
            <allow-origin>*://*</allow-origin>
            <strict-checking/>
        </cors>
    </restrict>
  login.config: |
    activemq {
        org.apache.activemq.artemis.spi.core.security.jaas.PropertiesLoginModule sufficient
            debug=false
            reload=true
            org.apache.activemq.jaas.properties.user="artemis-users.properties"
            org.apache.activemq.jaas.properties.role="artemis-roles.properties";
        org.apache.activemq.artemis.spi.core.security.jaas.GuestLoginModule sufficient
            debug=false
            org.apache.activemq.jaas.guest.user="amq"
            org.apache.activemq.jaas.guest.role="amq";
    };
  security-setting.xml: |
    <security-settings xmlns="urn:activemq:core">
        <security-setting match="#">
            <permission type="createNonDurableQueue" roles="amq"/>
            <permission type="deleteNonDurableQueue" roles="amq"/>
            <permission type="createDurableQueue" roles="amq"/>
            <permission type="deleteDurableQueue" roles="amq"/>
            <permission type="createAddress" roles="amq"/>
            <permission type="deleteAddress" roles="amq"/>
            <permission type="consume" roles="amq"/>
            <permission type="browse" roles="amq"/>
            <permission type="send" roles="amq"/>
            <permission type="manage" roles="amq"/>
        </security-setting>
    </security-settings>
  connector.xml: |
    <connectors xmlns="urn:activemq:core">
      <connector name="{{ include "artemis.fullname" . }}">tcp://{{ include "artemis.fullname" . }}.{{ .Release.Namespace }}.svc.cluster.local:{{ .Values.conf.protocols.netty.port }}</connector>
    </connectors>

Shared-store HA block specific to the primary:

      <primary>
        <failover-on-shutdown>true</failover-on-shutdown>
        <wait-for-activation>false</wait-for-activation>
      </primary>

Shared-store HA block specific to the backup:

      <backup>
        <failover-on-shutdown>true</failover-on-shutdown>
        <allow-failback>true</allow-failback>
      </backup>

Replication HA block for the primary (used before the shared-store change):

      <primary>
        <check-for-active-server>true</check-for-active-server>
        <initial-replication-sync-timeout>600</initial-replication-sync-timeout>
      </primary>

Replication HA block for the backup (used before the shared-store change):

      <backup>
        <allow-failback>true</allow-failback>
      </backup>

I googled for similar issues:

The only one that corresponds directly to the error doesn't apply to our scenario (we get no "The system cannot find the path specified" error).
I considered NFS problems, but those reports are about suboptimal performance, not locks.
We don't have any maintenance ongoing, which is another reported cause of the lock problem.
There were some issues in Artemis itself in the past, but they were fixed in newer versions.

Solution

The fact that you're seeing this error:

ERROR [org.apache.activemq.artemis.core.server] AMQ222010: Critical IO Error, shutting down the server. file=Lost NodeManager lock, message=NULL
java.io.IOException: lost lock

indicates that the shared storage device or protocol you're using doesn't support the required file-locking semantics, or that file locking is not configured properly for the mount.
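One way to start checking the mount is to look at how the journal directory is actually mounted inside the pod. The following is only a sketch: it assumes Python 3 is available in a container that mounts the same volume, and the helper names `find_mount` and `locking_warnings` are hypothetical. The options it flags (`nobrl` for CIFS/SMB, `nolock` and `local_lock=` for NFS) are real mount options that turn off or localise locking, but I don't know which filesystem or options your volume uses, so treat a hit as a lead rather than a diagnosis.

```python
# Mount options known to disable or localise POSIX file locking.
# Whether any of them is set on your volume is something to verify.
SUSPICIOUS_OPTIONS = ("nobrl", "nolock", "local_lock=all", "local_lock=posix")


def find_mount(path, mounts_text):
    """Return (mount_point, fs_type, options) for the longest mount point containing path."""
    best = None
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) < 4:
            continue  # not a well-formed /proc/mounts line
        mount_point, fs_type, options = fields[1], fields[2], fields[3].split(",")
        prefix = mount_point.rstrip("/") + "/"
        if path == mount_point or path.startswith(prefix):
            # Prefer the most specific (longest) mount point.
            if best is None or len(mount_point) > len(best[0]):
                best = (mount_point, fs_type, options)
    return best


def locking_warnings(path, mounts_text):
    """Return the mount options on path's filesystem that may break cross-host locking."""
    mount = find_mount(path, mounts_text)
    if mount is None:
        return []
    return [opt for opt in mount[2] if opt in SUSPICIOUS_OPTIONS]
```

Inside the pod you could pass it the contents of `/proc/mounts`, for example `locking_warnings("/var/lib/artemis-instance/data/journal", open("/proc/mounts").read())`, and also inspect the filesystem type returned by `find_mount`.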

What's happening is that the primary broker starts and acquires a lock on the shared journal. When the backup broker starts, it appears that it is also able to acquire the lock on the shared journal. When the backup then modifies a file that should be locked by the primary, the primary detects this and shuts itself down to avoid split brain.
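The "triple check by comparing timestamp" entry in your log shows this directly: the lock file was modified after the primary recorded taking the lock. If you want to scan a long log for such entries, a small sketch like the one below extracts the two timestamps and computes the gap. The function name `lock_file_drift_seconds` is hypothetical, and the pattern is based only on the single log line you posted, so it may need adjusting for other message variants.

```python
import re
from datetime import datetime

# Matches the two timestamps in the FileLockNodeManager message quoted in the question.
LOCK_LINE = re.compile(
    r"originally locked at (?P<locked>\S+) was modified at (?P<modified>\S+)"
)


def lock_file_drift_seconds(log_line):
    """Return seconds between the recorded lock time and the later modification, or None."""
    match = LOCK_LINE.search(log_line)
    if match is None:
        return None
    fmt = "%Y-%m-%dT%H:%M:%S.%f%z"  # e.g. 2024-09-24T08:02:33.067+0000
    locked = datetime.strptime(match.group("locked"), fmt)
    modified = datetime.strptime(match.group("modified"), fmt)
    return (modified - locked).total_seconds()


line = (
    "Lock file /var/lib/artemis-instance/data/journal/server.lock "
    "originally locked at 2024-09-24T08:02:33.067+0000 "
    "was modified at 2024-09-24T08:02:35.181+0000"
)
print(lock_file_drift_seconds(line))  # 2.114
```

In your case the file changed about two seconds after the primary locked it, which is consistent with a second writer touching a file that the lock should have protected.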

I recommend that you investigate the storage device or protocol you're using and ensure that it supports exclusive file locking across the network and that such locking is properly configured.
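A quick way to test this empirically is to take an exclusive lock on a scratch file from one pod and then try to take it from the other. The sketch below assumes Python 3 on Linux is available in both containers (for example a debug container mounting the same volume); the function names are hypothetical. It uses `fcntl.lockf`, which takes POSIX record locks; to my understanding that is the same mechanism the JVM uses for `FileChannel` locks on Linux, but treat that as an assumption. Use a throwaway file, not `server.lock`.

```python
import errno
import fcntl
import time


def try_exclusive_lock(path):
    """Try to take a non-blocking exclusive POSIX lock on path.

    Returns the open file object (keep it open to keep the lock), or None if
    another process already holds a conflicting lock. POSIX locks are owned
    per process, so the competing attempt must come from a different process.
    """
    handle = open(path, "a+b")
    try:
        fcntl.lockf(handle, fcntl.LOCK_EX | fcntl.LOCK_NB)
    except OSError as exc:
        handle.close()
        if exc.errno in (errno.EACCES, errno.EAGAIN):
            return None  # somebody else holds the lock
        raise
    return handle


def release(handle):
    """Release the lock and close the file."""
    fcntl.lockf(handle, fcntl.LOCK_UN)
    handle.close()


def hold_lock(path, seconds):
    """Hold the lock for `seconds`; returns False if it could not be acquired."""
    handle = try_exclusive_lock(path)
    if handle is None:
        return False
    try:
        time.sleep(seconds)
    finally:
        release(handle)
    return True
```

For example, call `hold_lock(scratch_path, 60)` in the primary pod and, while it is sleeping, `try_exclusive_lock(scratch_path)` in the backup pod, where `scratch_path` is a file on the shared volume. If the second call returns a file object instead of `None`, the storage is not enforcing exclusive locks between the two pods, which would match the behaviour described above.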