Pivotal Knowledge Base

How to use PDX with P2P topology and replicated regions

Environment

Product          Version
Pivotal GemFire  All versions

Purpose

This article describes pitfalls and workarounds when using PDX serialization with a peer-to-peer (P2P) topology and replicated regions.

Cause

In a client/server topology, the raw POJOs are present only on the client, while the server stores only the PDX serialized objects. This is the same for both partitioned and replicated regions.

However, when using a strictly peer-to-peer topology, the puts are done on the originating peer. If the region is partitioned, only the PDX serialized objects, which are stored as byte[], are present in the cache after garbage collection. When using a replicated region, though, both the raw POJOs and the PDX serialized objects remain in the cache after garbage collection.

The reason for this is that, when a put is done locally for a replicated region, the object is stored as is, in the domain object (non-PDX) form, in order to avoid excess serialization/deserialization. When using a partitioned region, where the data is partitioned across multiple nodes, the data is stored in the serialized form. GemFire was designed this way in keeping with performance and resource utilization considerations.

Procedure

There are two workarounds for this issue:

  1. The overhead of storing the raw POJOs in the cache can be avoided by using a PdxInstanceFactory.
  2. Another, easier, workaround to this problem is to configure a com.gemstone.gemfire.compression.Compressor on the region. The "compress" and "decompress" methods should simply return the byte array they were given.

    For example:

    public class NullCompressor implements com.gemstone.gemfire.compression.Compressor {
      // Pass-through: both methods return the array they were given, unchanged.
      public byte[] compress(byte[] input) { return input; }
      public byte[] decompress(byte[] input) { return input; }
    }
    

    The presence of a compressor forces all data added to the region to be stored in serialized form.
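The first workaround (PdxInstanceFactory) could look roughly like the sketch below. It is a minimal illustration, not a definitive implementation: the class name SerializedPutExample, the type name "com.example.Customer", and the fields "id" and "name" are hypothetical, and the cache and region are assumed to have been created elsewhere. It needs the GemFire jar on the classpath. The idea is to build the value as a PdxInstance and put that, so the originating peer never holds the domain object.

```java
import com.gemstone.gemfire.cache.Cache;
import com.gemstone.gemfire.cache.Region;
import com.gemstone.gemfire.pdx.PdxInstance;
import com.gemstone.gemfire.pdx.PdxInstanceFactory;

// Hypothetical helper: builds a value as a PdxInstance and puts it into a region.
final class SerializedPutExample {

  // "com.example.Customer" and the field names are illustrative assumptions.
  static PdxInstance putCustomer(Cache cache, Region<Integer, Object> region,
                                 int id, String name) {
    // The factory assembles the PDX form field by field; no POJO is created.
    PdxInstanceFactory factory = cache.createPdxInstanceFactory("com.example.Customer");
    factory.writeInt("id", id);
    factory.writeString("name", name);
    factory.markIdentityField("id"); // use "id" for equals/hashCode of the instance

    PdxInstance customer = factory.create();

    // The value handed to the region is already in PDX form.
    region.put(id, customer);
    return customer;
  }
}
```

Whether a later get returns a PdxInstance or a domain object depends on settings such as read-serialized, which this sketch does not configure.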

Also be aware that if the serialized data is accessed by something that needs the object form of the data, the data will be deserialized and then stored in the cache as an object. This can happen when a function does a get on the server that stores the data (i.e. a local get), when a CacheListener calls getNewValue, or if you have an index on the region (index maintenance internally requests the object form).

Note, however, that this does not happen when a remote request for the object form is made (for example, by a client get). This behavior can be changed by setting the system property gemfire.PREFER_SERIALIZED=true. The property does not address the initial issue (the form in which the data is first stored); it addresses only the related secondary issue (the form in which the data is stored after a request for the object form). If you use the compressor workaround, or if the data is stored as PDX and read-serialized=true, PREFER_SERIALIZED is not needed.
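One way to set the property is on the JVM command line (-Dgemfire.PREFER_SERIALIZED=true). The sketch below sets it programmatically instead. The class name PreferSerializedConfig is hypothetical, and the sketch assumes the property must be in place before any GemFire class reads it, which this article does not confirm; the command-line form avoids that ordering concern.

```java
// Hypothetical helper: sets the flag before the cache is created.
final class PreferSerializedConfig {

  static final String PROPERTY = "gemfire.PREFER_SERIALIZED";

  // Assumption: call this before any GemFire class is loaded.
  static void enable() {
    System.setProperty(PROPERTY, "true");
  }

  // Reads the flag back as a boolean system property.
  static boolean isEnabled() {
    return Boolean.getBoolean(PROPERTY);
  }

  public static void main(String[] args) {
    enable();
    System.out.println(PROPERTY + "=" + isEnabled());
    // Create the GemFire cache only after this point.
  }
}
```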
