Pivotal Knowledge Base

Follow

Getting More Throughput on Replicated Regions over WAN

Environment

 Product  Version
 Pivotal GemFire  All supported versions

Purpose

Describe configurations and architectures to boost throughput for WAN replication for replicated regions.

Description

A partitioned region can use parallel gateway senders to get more throughput for WAN replication. But the replicated region can't use parallel gateway senders since only serial gateway senders are allowed. So, what ways are there to accelerate the speed of data synchronization for replicated regions in a GemFire WAN architecture?

 

  1. Setting dispatcher-threads=N creates N threads that would each send events to the other site. This is the easiest way to gain throughput.
  2. Another thing to consider is increasing the socket buffer size. That can also help achieve higher throughput. When determining buffer size settings, you are trying to strike a balance between communication needs and other processing. Larger socket buffers allow your members to distribute data and events more quickly, but they also take memory away from other things. If you store very large data objects in your cache, finding the right sizing for your buffers while leaving enough memory for the cached data can become critical to the system's performance. Ideally, you should have buffers large enough for the distribution of any single data object so you don’t get message fragmentation, which lowers performance. Your buffers should be at least as large as your largest stored objects and their keys, plus some overhead for the message headers. The overhead varies depending on who is sending and receiving, but 100 bytes should be sufficient. You can also look at the statistics for the communication between your processes to see how many bytes are being sent and received. In a multi-site installation using gateways, if the link between sites is not tuned for optimum throughput, it could cause messages to back up in the cache queues. If a receiving queue overflows because of inadequate buffer sizes, it will become out of sync with the sender and the receiver will be unaware of the condition. The gateway’s "socket-buffer-size" attribute should match the gateway hub’s "socket-buffer-size" attribute for the hubs the gateway sender connects to.
  3. Another solution is to revamp the architecture and create multiple unique serial senders, (one per member) and separate them from the replicated region by an empty proxy region, that has the serial sender configured on it. Events in the replicated region can be sent to the empty proxy region using a CacheWriter. The proxy sends each event to its configured sender. On the receiving side, you would have proxy regions receiving events and writing them to the replicated region. The downside to this idea is the loss of HA, but if you make your senders persistent, you won't lose any events. You would have to restart any crashed sender to get the events in its persistent queue delivered, though. The attached zip file contains sample Java code and XML configuration files to get started with this setup. Please note that the sample is provided "as is" without any kind of warranty.

 

 

Comments

Powered by Zendesk