Pivotal Knowledge Base

GemFire serialization FAQ

Environment

Product Version
Pivotal GemFire 6, 7, and 8

Overview

GemFire supports several types of serialization: DataSerializable, Java, PDX, and custom serialization. This article covers some common questions related to performance and usage of these different solutions.

Q: Is it correct that, as a rough rule of thumb for the general case, PDX-serialized objects take about 20% more RAM than DataSerializable objects?

A: DataSerializable is very compact and fast, especially compared to standard Java serialization. PDX is slightly larger, but its other benefits (such as versioning and minimal deserialization) make it the recommended choice in almost all cases.
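To make the size difference in the answer above concrete, the following minimal sketch compares standard Java serialization with a hand-written, field-by-field encoding in the style of a DataSerializable `toData` method. It uses only `java.io` and does not touch GemFire; the `Product` class, its fields, and the helper names are assumptions made for illustration, and the numbers it prints say nothing about PDX overhead.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutput;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;

// Hypothetical domain class; the field names are illustrative only.
class Product implements Serializable {
    private static final long serialVersionUID = 1L;
    final long simId;
    final String name;

    Product(long simId, String name) {
        this.simId = simId;
        this.name = name;
    }

    // Field-by-field encoding in the style of DataSerializable.toData:
    // only the field values are written, with no class descriptor.
    void toData(DataOutput out) throws IOException {
        out.writeLong(simId);
        out.writeUTF(name);
    }
}

class SerializationSizeDemo {
    // Size of the object when written with standard Java serialization.
    static int javaSerializedSize(Product p) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(p);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.size();
    }

    // Size of the object when written with the hand-written encoding.
    static int handWrittenSize(Product p) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(bytes)) {
            p.toData(out);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.size();
    }

    public static void main(String[] args) {
        Product p = new Product(45097868L, "prepaid");
        System.out.println("Java serialization:    " + javaSerializedSize(p) + " bytes");
        System.out.println("Hand-written encoding: " + handWrittenSize(p) + " bytes");
    }
}
```

The hand-written form stores an 8-byte long plus a length-prefixed string, while Java serialization also writes the class name, serialVersionUID, and field descriptors, which is why it comes out larger for small objects.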

Q: Is it correct that, when DataSerializable is used and no indexes exist, the query "select * from /product p where p.sim_id=45097868" deserializes every object in the /product region in order to return the result?

A: That is correct; the scan will require deserialization.

Q: Is it correct that, when PDX serialization is used, the query "select * from /product p where p.sim_id=45097868" does not deserialize every object in the /product region in order to return the result, and instead accesses the sim_id field through the PDX API?

A: That is correct; PDX minimizes deserialization on the server.
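The idea behind the answer above can be sketched without GemFire: if a serialized record has a known field layout, a scan can read one field from the raw bytes and fully decode only the records that match. The byte layout, class name, and method names below are assumptions made for this sketch; this is not GemFire's actual PDX wire format or API.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

class FieldAccessSketch {
    // Layout assumed for this sketch only:
    // [8-byte sim_id][4-byte name length][name bytes in UTF-8]
    static byte[] encode(long simId, String name) {
        byte[] nameBytes = name.getBytes(StandardCharsets.UTF_8);
        ByteBuffer buf = ByteBuffer.allocate(8 + 4 + nameBytes.length);
        buf.putLong(simId).putInt(nameBytes.length).put(nameBytes);
        return buf.array();
    }

    // Reads only the sim_id field; the name bytes are never decoded.
    static long readSimId(byte[] record) {
        return ByteBuffer.wrap(record).getLong(0);
    }

    // Decodes the name field; called only for records that match.
    static String decodeName(byte[] record) {
        int length = ByteBuffer.wrap(record).getInt(8);
        return new String(record, 12, length, StandardCharsets.UTF_8);
    }

    // Scan without an index: inspect one field per record and
    // decode the rest of the record only when sim_id matches.
    static List<String> namesWithSimId(List<byte[]> region, long simId) {
        List<String> result = new ArrayList<>();
        for (byte[] record : region) {
            if (readSimId(record) == simId) {
                result.add(decodeName(record));
            }
        }
        return result;
    }
}
```

The scan still visits every record, but the per-record cost is a single field read rather than rebuilding a full domain object, which is the saving the answer refers to.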

Q: Is it correct that, when an index exists on the sim_id field, the query "select * from /product p where p.sim_id=45097868" does NOT deserialize every object in the /product region in order to return the result?

A: If PDX is not used, building the index causes all of the data to be deserialized.
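The trade-off in the answer above can be illustrated with a plain map-based index: building it visits every entry once, and later lookups go straight to the matching entries. This is a conceptual sketch only; the `ProductEntry` class and all method names are hypothetical, and in GemFire itself the index would be defined through the query service or cache.xml rather than written by hand.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class SimIdIndexSketch {
    // Hypothetical stand-in for a value stored in the /product region.
    static final class ProductEntry {
        final long simId;
        final String name;

        ProductEntry(long simId, String name) {
            this.simId = simId;
            this.name = name;
        }
    }

    private final Map<Long, List<ProductEntry>> bySimId = new HashMap<>();
    private int entriesVisitedAtBuild = 0;

    // Building the index touches every entry once; this is the step
    // that corresponds to reading the indexed field of all the data.
    SimIdIndexSketch(List<ProductEntry> region) {
        for (ProductEntry entry : region) {
            entriesVisitedAtBuild++;
            bySimId.computeIfAbsent(entry.simId, k -> new ArrayList<>()).add(entry);
        }
    }

    // A lookup goes directly to the matching entries without a scan.
    List<ProductEntry> lookup(long simId) {
        return bySimId.getOrDefault(simId, Collections.emptyList());
    }

    int entriesVisitedAtBuild() {
        return entriesVisitedAtBuild;
    }
}
```

The cost of reading the indexed field is paid once, when the index is built and maintained, instead of on every query.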

Recommendation

The common recommendation is to use PDX and, where possible, add one or two indexes per region to speed up the queries that users run most often. Tune garbage collection (GC) for the size of the query result sets, especially the new generation (Eden space), which holds temporary data.
