
Recovery on multi-site WAN topology

We are using the multi-site WAN configuration. We have two clusters separated geographically, one in North America and one in Europe. Cluster 1 has two members, A and B, that are both gateway senders. Cluster 2 has two members, C and D, that are both gateway receivers. When member A in cluster 1 starts, it reads data from the database and loads it into the GemFire cache, which then gets sent to cluster 2. Everything so far is good.
Problem: If both members in cluster 2 are restarted at the same time, they lose all the GemFire regions/data. At that point, if we restart member A in cluster 1, it loads the data from the DB again and the data gets pushed to cluster 2. But we would prefer to avoid restarting member A. Is there a solution where, if cluster 2 is restarted, it can request a full copy of the data from cluster 1?

Jeewan Jyot Singh


1 comment


Hello Jeewan,

I don't think there's an automatic way to make the gateway senders on cluster 1 resend all events to the gateway receivers on cluster 2. There are, however, two options that come to mind to avoid restarting member A:

  1. Use persistent regions on cluster 2. This way members C and D will recover the data from disk on startup, and further updates from cluster 1 will be propagated as usual.
  2. Write a small function that iterates over the region and simply executes two operations for each entry: a get followed by a put of the same value. Once this function is deployed to cluster 1, you can fire it whenever you want from GFSH, causing the events to be replicated to cluster 2 without needing to restart member A.
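
As a rough illustration of option 2, here is a minimal sketch of the get-and-put loop. It is written against plain java.util.Map so that it runs on its own; a GemFire Region implements the ConcurrentMap interface, so the same loop should be usable on a region. The class name RegionReplay and the method replay are hypothetical names chosen for this example, and the wiring into a GemFire function is left out on purpose.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical helper, not part of GemFire: re-puts every entry of a map-like region.
public class RegionReplay {

    // Reads each entry and writes the same value back; returns the number of entries rewritten.
    public static <K, V> int replay(Map<K, V> region) {
        int replayed = 0;
        // Snapshot the keys first so the loop does not walk a live view while writing to the map.
        List<K> keys = new ArrayList<>(region.keySet());
        for (K key : keys) {
            V value = region.get(key);
            if (value != null) {         // skip entries removed since the snapshot was taken
                region.put(key, value);  // assumption: on a WAN-enabled region this put is what queues a gateway event
                replayed++;
            }
        }
        return replayed;
    }

    public static void main(String[] args) {
        // A ConcurrentHashMap stands in for the region in this standalone demo.
        Map<String, String> region = new ConcurrentHashMap<>();
        region.put("k1", "v1");
        region.put("k2", "v2");
        System.out.println("Replayed " + replay(region) + " entries");
    }
}
```

In a real deployment this loop would presumably live inside the execute method of a function deployed to cluster 1. On a partitioned region you would likely want each member to replay only the entries it hosts locally, so that every entry is re-put once rather than once per member.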

Hope this helps.

Best regards.

Juan Ramos