Pivotal Knowledge Base

Tomcat cluster manager fails to transfer state with error: SEVERE: Manager [localhost]: No session state send (2009723)

Symptoms

  • Sessions are not synchronized between nodes
  • User session counts differ in the Manager webapp on different nodes
  • You see different user session counts in the JMX Manager MBean on different nodes
  • When failing over through the load balancer, a new session is created and the previous session state is lost
  • In the catalina log for at least one node, you see messages like:

    Note: This example is for the petcare sample webapp.

    org.apache.catalina.ha.session.DeltaManager getAllClusterSessions
    WARNING: Manager [localhost#/petcare], requesting session state from org.apache.catalina.tribes.membership.MemberImpl[...]. This operation will timeout if no session state has been received within 60 seconds.
    org.apache.catalina.ha.session.DeltaManager waitForSendAllSessions
    SEVERE: Manager [localhost#/petcare]: No session state send at [time] received, timing out after 60,182 ms.

    Some other messages that may also be present in the log for some nodes:
    org.apache.catalina.ha.session.ClusterSessionListener messageReceived
    WARNING org.apache.catalina.ha.session.ClusterSessionListener.messageReceived Context manager doesn't exist:localhost#/petcare
    org.apache.catalina.tribes.group.interceptors.TcpFailureDetector memberDisappeared
    INFO: Received memberDisappeared[...] message. Will verify.
    org.apache.catalina.tribes.group.interceptors.TcpFailureDetector memberDisappeared
    INFO: Verification complete. Member still alive[...]

    org.apache.catalina.ha.session.DeltaManager getAllClusterSessions
    WARNING: Manager [localhost#/petcare]: Drop message SESSION-GET-ALL inside GET_ALL_SESSIONS sync phase
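
To see quickly which of these messages a node is logging, the catalina log can be scanned for them. The following is a minimal sketch, not a supported tool: the match strings are copied from the excerpts above, while the category labels and function names are made up for illustration. Message wording may differ between Tomcat versions, so adjust the patterns to match your own logs.

```python
import re
from collections import Counter

# Patterns are taken from the log excerpts in this article.
# The labels on the left are illustrative names, not Tomcat terminology.
PATTERNS = {
    "state_transfer_timeout": re.compile(r"No session state send at .* received, timing out"),
    "missing_context_manager": re.compile(r"Context manager doesn't exist:\S+"),
    "member_disappeared": re.compile(r"Received memberDisappeared"),
    "dropped_sync_message": re.compile(r"Drop message \S+ inside GET_ALL_SESSIONS sync phase"),
}


def classify_lines(lines):
    """Count how many lines match each known symptom pattern."""
    counts = Counter()
    for line in lines:
        for label, pattern in PATTERNS.items():
            if pattern.search(line):
                counts[label] += 1
    return counts


def scan_file(path):
    """Classify every line of one catalina log file."""
    with open(path, encoding="utf-8", errors="replace") as handle:
        return classify_lines(handle)
```

Running scan_file against the catalina log of each node and comparing the counts shows which node reports the timeout and which node reports the missing context manager.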

Cause

When a webapp is deployed with a cluster manager, the manager requests the session state for the context from another cluster member. The first block of errors above indicates that the transfer failed and that sessions for this context are not being replicated.

Several problems may lead to state transfer failure. A common cause is that the same webapp is not deployed to all existing cluster members, or is not marked distributable on all of them. When a node deploys the context and requests the state transfer from another node that has not deployed it with a cluster manager, the state transfer fails and the context is not clustered. The node requesting the state transfer reports the failure, and the node missing the context (or with a non-distributable context) reports the missing context cluster manager:

org.apache.catalina.ha.session.ClusterSessionListener messageReceived
WARNING org.apache.catalina.ha.session.ClusterSessionListener.messageReceived Context manager doesn't exist:localhost#/petcare

In this scenario, the order in which the nodes are started can matter: clustering may succeed with some orderings and not others.
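
One way to check whether the same contexts are deployed on every node is to compare the application lists reported by the Manager webapp on each node. The sketch below assumes that you have already saved the output of the Manager text interface's list command from each node, and that each context line has the form path:state:sessions:docBase (for example /petcare:running:3:petcare). Confirm both assumptions against your Tomcat version. The function names and node labels are hypothetical. Note that this checks only deployment, not whether a context is marked distributable.

```python
def parse_manager_list(text):
    """Return {context_path: session_count} from saved Manager 'list' output.

    Assumes context lines look like "/petcare:running:3:petcare".
    Other lines, such as the leading "OK - ..." status line, are skipped.
    """
    contexts = {}
    for line in text.splitlines():
        parts = line.strip().split(":")
        if len(parts) >= 4 and parts[0].startswith("/") and parts[2].isdigit():
            contexts[parts[0]] = int(parts[2])
    return contexts


def missing_contexts(outputs_by_node):
    """Map each node to the contexts deployed on some other node but not on it."""
    parsed = {node: parse_manager_list(text) for node, text in outputs_by_node.items()}
    all_paths = set().union(*(ctx.keys() for ctx in parsed.values()))
    return {node: sorted(all_paths - set(ctx)) for node, ctx in parsed.items()}
```

A node that maps to a non-empty list is a candidate for the Context manager doesn't exist warning. The parsed session counts can also be compared across nodes to confirm the differing session counts described under Symptoms.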

Another problem that looks similar is when a node requests a state transfer from another node that is still starting up and has not yet deployed that context. The transfer fails with the same messages, but the node that was missing the context may be seen deploying it later in the logs. This is common when multiple nodes are started at about the same time.
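
A start script can avoid this race by waiting until one node has finished starting before launching the next. The following is a rough sketch under stated assumptions: it relies on the "Server startup in ..." line that Tomcat writes to the catalina log when startup completes (the exact wording varies by version, so check your own log), and it takes a byte offset so that startup lines from earlier runs in an appended log can be skipped. The function names and defaults are illustrative.

```python
import re
import time

# Matches e.g. "Server startup in 4512 ms" and "Server startup in [4512] milliseconds".
# Assumption: adjust the pattern if your Tomcat version words this line differently.
STARTUP_DONE = re.compile(r"Server startup in \[?\d+\]? (ms|milliseconds)")


def startup_complete(log_text):
    """Return True if the text contains a startup-complete line."""
    return STARTUP_DONE.search(log_text) is not None


def wait_for_startup(log_path, offset=0, timeout_s=300.0, poll_s=2.0):
    """Poll the log from the given byte offset until startup completes or time runs out."""
    deadline = time.monotonic() + timeout_s
    while time.monotonic() < deadline:
        try:
            with open(log_path, "rb") as handle:
                handle.seek(offset)
                text = handle.read().decode("utf-8", errors="replace")
            if startup_complete(text):
                return True
        except FileNotFoundError:
            pass  # the node has not created its log yet
        time.sleep(poll_s)
    return False
```

Record the size of the log before starting the node, pass it as offset, and start the next node only after wait_for_startup returns True.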

Other messages listed above may indicate problems with the cluster configuration itself, such as when a node is requesting a state transfer from a node that is not available. A memberDisappeared message indicates a member with an irregular cluster heartbeat, perhaps because it is blocking (this can occur in the multiple-node startup scenario) or due to problems with the network, timing, or similar. If state transfers fail for all distributable contexts on a node, cluster configuration or networking problems are the likely causes.
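
A basic first check for this kind of problem is whether each node can open a TCP connection to the cluster receiver port of every other member. The sketch below uses only the Python standard library. The host names and the port in the usage note are placeholders: use the address and port configured for the Receiver element in each node's server.xml. This tests TCP reachability only. It does not test multicast membership traffic.

```python
import socket


def can_connect(host, port, timeout_s=3.0):
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout_s):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name resolution failures.
        return False


def check_members(members, timeout_s=3.0):
    """Check a list of (host, port) pairs and report reachability per member."""
    return {f"{host}:{port}": can_connect(host, port, timeout_s) for host, port in members}
```

For example, check_members([("node1.example.com", 4000), ("node2.example.com", 4000)]) run from each node in turn shows which receiver ports are unreachable from where. Both the host names and port 4000 are assumptions for illustration.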

Resolution

To resolve these issues, observe these guidelines:

  • Contexts that are clustered with DeltaManager need to be distributable and deployed uniformly across all nodes. For the node that reports Context manager doesn't exist:localhost#/context, make sure that the named context is deployed and distributable on that node. For more information, see Marking a web application as distributable for clustering under Tomcat.

  • If you need to have a webapp deployed on some nodes and not others, consider using the BackupManager to cluster instead of the DeltaManager.

  • When starting multiple nodes, wait for each node to complete initialization before starting the next one.

  • If you are experiencing problems with cluster configuration as a whole:

    • Verify network connectivity. A packet trace may indicate network problems, or may be useful for VMware Support to analyze.

    • Make sure that the nodes are on hosts with the time synchronized to within a few seconds.
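
For the first guideline, the deployment descriptor of the webapp on each node can be checked for the distributable element. The sketch below assumes the webapp is marked distributable through a distributable element directly under web-app in WEB-INF/web.xml. It does not look at any other configuration source. The function names are illustrative.

```python
import xml.etree.ElementTree as ET
import zipfile


def is_distributable(web_xml):
    """Return True if web.xml content (str or bytes) has <distributable/> under <web-app>."""
    root = ET.fromstring(web_xml)
    for child in root:
        # Drop any XML namespace prefix such as "{http://java.sun.com/xml/ns/javaee}".
        if child.tag.rsplit("}", 1)[-1] == "distributable":
            return True
    return False


def war_is_distributable(war_path):
    """Read WEB-INF/web.xml from a WAR file and check it."""
    with zipfile.ZipFile(war_path) as war:
        return is_distributable(war.read("WEB-INF/web.xml"))
```

Running war_is_distributable against the copy of the WAR deployed on each node shows whether the webapp is marked consistently across the cluster.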

©VMware 2013
