apachejbossmod-cluster

Jboss Mod_cluster


I have a jboss cluster with 2 nodes (a and b) + 1 apache working as mod_cluster (apache in a separate server)

If one of the nodeA goes down, mod cluster can't connect to another one.

So, if nodeA crashes, I can't access jboss aplication by http://apache_server/myapp, but I can by http://nodeb/myapp and vice-versa

I dig on google almost all i have found say that is related to sessions but I can't fnd whats is wron with my config. (Mod_cluster as configured with this tool Load Balancer Configuration Tool

NodeA Log

15/05/2016 07:45:22,741 ERROR [org.jgroups.protocols.TCP] (http-/nodeA:8080-90) failed sending message to jbossnodeb:jbossnodeb/web (4148 bytes): java.net.SocketException: Socket closed, cause: null
15/05/2016 07:45:22,790 ERROR [org.jgroups.protocols.TCP] (OOB-6464,shared=tcp) failed sending message to jbossnodeb:jbossnodeb/web (4141 bytes): java.net.SocketException: Broken pipe, cause: null

NodeB Log

15/05/2016 07:45:23,126 ERROR [org.jgroups.protocols.TCP] (OOB-4949,shared=tcp) failed sending message to jbossnodea:jbossnodea/web (79 bytes): java.net.SocketException: Broken pipe, cause: null
15/05/2016 07:45:53,457 WARN  [org.jgroups.protocols.TCP] (Timer-1,shared=tcp) null: no physical address for jbossnodea:jbossnodea/web, dropping message

Apache mod_cluster server log

[Sun May 15 07:45:04 2016] [error] (70007)The timeout specified has expired: proxy: read response failed from (null) (nodeA_IP)
[Sun May 15 07:45:34 2016] [error] (70007)The timeout specified has expired: ajp_cping_cpong: apr_socket_recv failed
[Sun May 15 07:45:38 2016] [error] ajp_handle_cping_cpong: ajp_ilink_receive failed
[Sun May 15 07:45:38 2016] [error] (70007)The timeout specified has expired: proxy: AJP: cping/cpong failed to (null) (nodeA_IP)
[Sun May 15 07:45:44 2016] [error] (70007)The timeout specified has expired: ajp_cping_cpong: apr_socket_recv failed
[Sun May 15 07:45:44 2016] [error] (70007)The timeout specified has expired: proxy: dialog to nodeA_IP:8009 (nodeA_IP) failed
[Sun May 15 07:45:44 2016] [error] ajp_read_header: ajp_ilink_receive failed
[Sun May 15 07:45:44 2016] [error] (70007)The timeout specified has expired: proxy: dialog to nodeA_IP:8009 (nodeA_IP) failed
[Sun May 15 07:45:44 2016] [error] (70007)The timeout specified has expired: proxy: dialog to nodeA_IP:8009 (nodeA_IP) failed
[Sun May 15 07:45:45 2016] [error] ajp_read_header: ajp_ilink_receive failed
[Sun May 15 07:45:45 2016] [error] (70007)The timeout specified has expired: proxy: dialog to (null) (nodeA_IP) failed
[Sun May 15 07:45:45 2016] [error] ajp_read_header: ajp_ilink_receive failed
[Sun May 15 07:45:45 2016] [error] (70007)The timeout specified has expired: proxy: dialog to (null) (nodeA_IP) failed
[Sun May 15 07:45:45 2016] [error] ajp_read_header: ajp_ilink_receive failed
[Sun May 15 07:45:45 2016] [error] proxy: CLUSTER: (balancer://clusterjboss). All workers are in error state

Config apache mod_cluster

AdvertiseGroup 225.0.1.107:23364
KeepAliveTimeout 60
ManagerBalancerName clusterjboss
ServerAdvertise On
AdvertiseFrequency 5
EnableMCPMReceive
CreateBalancers 0
AllowDisplay On

ProxyPass / balancer://clusterjboss/ stickysession=JSESSIONID|jsessionid nofailover=On

Solution

  • Visibility

    JGroups, Infinispan, Domains, Clustering

    mod_cluster, i.e. modcluster subsystem has nothing to do with the aforementioned whatsoever. The subsystem is completely oblivious to the fact that there is some cluster formed or that you have your instances in a domain -- which is also irrelevant to having your instances in a cluster in the first place. Don't bother with JGroups messages while investigating mod_cluster configuration.

    Although, if your JGroups cluster is broken...

    Infinispan - i.e. distributed or replicated cache of your web session data in this case, relies on JGroups for forming a cluster and for exchanging messages in this cluster. If your instances cannot for a cluster or fail to exchange messages, you might experience a loss of session data on failover.

    For example: Apache HTTP Server mod_cluster balacner decides to send request with JSESSIONID yadayadaXXX.worker-1 to worker-2, because worker-1 is down. Due to a network configuration error, worker-1 and worker-2 has never correctly formed a cluster, so worker-2 does not have the session data of worker-1. The result is a web application with a new session created, i.e. your client lost his context, e.g. shopping cart (popular showcase).

    ProxyPass

    Don't use it unless you have something specific in mind. The whole point of mod_cluster is that it creates all proxy directives in memory, on the fly dynamically as your worker nodes and their web applications come and go. You start fiddling with additional ProxyPass directives if you want to:

    ManagerBalancerName

    It is not clear to me why you would need to change it. If you change the default value, you have to also alter balancer="new_value" in your Jboss modcluster subsystem configuration. What is actually does is that it tells mod_cluster in the Apache HTTP Server to create more separate named ProxyPass Balacners internally. One then could use ProxyPass directives to tweak them separately. Do you need to tweak them? According to the rest of your config I am convinced it is not the case. For example, the session stickiness is configured in JBoss nodes in mod_cluster subsystems - worker ndoes report this to the Apache HTTP Server balancer.

    HTH, -K-