GemFire 6 and later
This article continues from the JVM and GC basics article with more details on tuning and also explains some advanced configurations for heaps that are bigger than 32GB.
Starting heap size considerations
Start with as large a heap as possible: 32G, 64G, 128G etc. and tune for the desired latency. Then if latency requirements cannot be met, reduce heap size.
When setting the initial heap size or reducing the heap size, be aware of the 32G line: do not size the heap between 32G and 48G. Use 32GB or more than 48GB instead, because heaps bigger than 32GB cannot use compressed oops, which improve performance and space efficiency. With default object alignment, HotSpot turns compressed oops off at a maximum heap size just under 32GB, so confirm that UseCompressedOops is still enabled at the size you choose. You will probably find that a 32GB heap holds the same amount of data as a 48GB heap and performs better.
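The sizing rule above can be written down as a small check. This is a minimal sketch: the 32G line and the 32G to 48G band come from the text, while the function name and the returned labels are hypothetical.

```python
# Sketch of the "32G line" rule. Thresholds are taken from the article;
# the function name and return strings are illustrative only.
COMPRESSED_OOPS_LINE_GB = 32   # at or below this, compressed oops are expected
AVOID_UPPER_GB = 48            # upper end of the band the article says to avoid


def classify_heap_size(heap_gb):
    """Return a short verdict for a proposed heap size in GB."""
    if heap_gb <= COMPRESSED_OOPS_LINE_GB:
        return "ok: compressed oops expected"
    if heap_gb <= AVOID_UPPER_GB:
        return "avoid: no compressed oops and too little extra space to compensate"
    return "ok: large heap without compressed oops"


if __name__ == "__main__":
    for size in (32, 40, 48, 64, 128):
        print(size, "GB ->", classify_heap_size(size))
```

The check only encodes the guidance. Whether compressed oops are actually active on a given JVM still has to be verified on that JVM.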
We generally recommend a couple of canned heap sizes:
- 32G: low latency requirements
- 64G: larger data volumes + can tolerate higher latency
Bear in mind that GC can be tuned for only two of the following three goals:
- Latency
- Throughput
- Memory footprint
Make sure only long-lived objects are promoted to the old gen
We generally recommend starting with 50% old generation headroom and a young generation sized to 10% of the total heap. Young generation minor collections typically affect average latency, while old generation major collections and full GCs affect latency outliers. Both can affect throughput.
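As a worked example of these starting values, the sketch below turns a total heap size into a young generation size and a data budget for the old generation. It assumes that "50% headroom" means half of the old generation stays free once the data is loaded. The function name is hypothetical, and setting -Xms equal to -Xmx is a common practice assumed here, not something the article states.

```python
# Sketch: derive starting sizes from the article's rules of thumb.
# Assumption: "50% old gen headroom" = half of the old generation stays free.

def starting_layout(heap_gb, young_percent=10, old_headroom_percent=50):
    """Return starting sizes in MB and matching JVM heap flags."""
    heap_mb = heap_gb * 1024
    young_mb = heap_mb * young_percent // 100
    old_mb = heap_mb - young_mb
    data_budget_mb = old_mb * (100 - old_headroom_percent) // 100
    flags = [
        f"-Xms{heap_gb}g",    # assumption: fixed-size heap (Xms == Xmx)
        f"-Xmx{heap_gb}g",
        f"-Xmn{young_mb}m",   # young generation size
    ]
    return {
        "young_mb": young_mb,
        "old_mb": old_mb,
        "data_budget_mb": data_budget_mb,
        "flags": flags,
    }


if __name__ == "__main__":
    layout = starting_layout(32)
    print(layout)
```

For a 32G heap this gives a young generation of 3276 MB, an old generation of 29492 MB, and a data budget of 14746 MB. These are starting points to measure against, not final values.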
The cost (duration) of a minor collection depends on “live objects”. In large heaps this cost is typically dominated by two components:
- Time to copy live objects
- Time to scan the card table (references to live objects from old gen)
The time to scan the card table is proportional to the old gen size, so the larger the old generation, the longer the minor collections take. In other words, the duration of a young generation collection is dominated by the following two factors:
- Time to copy live objects
- Size of the old generation
How to reduce the minor GC pauses in big heaps?
Adjust the young gen size to reduce the amount of live objects. Reduce the old generation impact by making the card table scan more efficient:
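The settings this paragraph refers to are not reproduced in the text. The value 32768 quoted with the test results matches a known HotSpot diagnostic option, ParGCCardsPerStrideChunk, so the sketch below assumes that this is the option meant. Treat it as an assumption rather than the article's confirmed configuration. The option is diagnostic on the JDK versions where it exists, so it has to be unlocked first.

```python
# Sketch: build the JVM options for a more efficient card table scan.
# ASSUMPTION: the article's missing settings are ParGCCardsPerStrideChunk
# with the value 32768; the original text does not show them.

def card_scan_flags(cards_per_stride_chunk=32768):
    """Return the JVM flags as a list, ready to append to a java command line."""
    return [
        "-XX:+UnlockDiagnosticVMOptions",  # required for diagnostic options
        f"-XX:ParGCCardsPerStrideChunk={cards_per_stride_chunk}",
    ]


if __name__ == "__main__":
    print(" ".join(card_scan_flags()))
```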
In tests with 32G and 96G heaps, these settings reduced minor GC pauses by over 60%:
- 96G heap (5G young gen): minor pauses reduced from ~130ms to ~40ms while the frequency remained the same (once every ~45s)
- 32G heap (2G young gen): minor pauses reduced from ~35ms to ~13ms (same frequency, once every 20s)
Values lower or higher than 32768 did not yield better results in my tests; 32768 seems to be the sweet spot.
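The reported numbers can be checked with a little arithmetic. The sketch below computes the pause reduction and the share of wall-clock time spent in minor GC for both test heaps, using only the figures quoted above. The function names are illustrative.

```python
# Sketch: verify the "over 60%" claim from the quoted measurements.

def pause_reduction_percent(before_ms, after_ms):
    """Percentage by which the pause time dropped."""
    return (before_ms - after_ms) / before_ms * 100


def pause_share_percent(pause_ms, interval_s):
    """Share of wall-clock time spent in minor GC pauses."""
    return pause_ms / (interval_s * 1000) * 100


if __name__ == "__main__":
    # (heap, pause before ms, pause after ms, interval between minor GCs in s)
    for heap, before, after, interval in (("96G", 130, 40, 45), ("32G", 35, 13, 20)):
        print(
            heap,
            f"reduction {pause_reduction_percent(before, after):.1f}%,",
            f"time in minor GC after tuning {pause_share_percent(after, interval):.3f}%",
        )
```

Both reductions come out above 60% (about 69% for the 96G heap and about 63% for the 32G heap), which is consistent with the claim.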
Reduce the tenured generation size either by increasing the young generation size, or reducing the total heap size
How to change the minor GC frequency?
The larger the young generation, the lower the frequency of minor GCs. Start at 10% of the total heap, then measure and tune. Increasing the young generation size makes the tenured generation smaller, which is good. But if a larger young generation means many more live objects to copy or promote, the GC time will still go up; in that case, reduce the young generation size instead.
On the other hand, if the application creates more short-lived objects, increasing the young generation may not result in increased GC time.
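The measure-and-tune step can be expressed as a simple decision rule. This is a sketch under stated assumptions: the `Measurement` type, the function name, and the "keep"/"reduce" labels are hypothetical, and real tuning would compare several runs rather than two data points.

```python
# Sketch: judge whether a young generation increase paid off.
from collections import namedtuple

# Hypothetical record of one test run.
Measurement = namedtuple("Measurement", "young_mb avg_pause_ms interval_s")


def judge_increase(before, after):
    """Compare a run with a larger young gen against the previous run."""
    if after.young_mb <= before.young_mb:
        raise ValueError("expects a run with a larger young generation")
    if after.avg_pause_ms > before.avg_pause_ms:
        # More live objects to copy or promote: the increase made pauses longer.
        return "reduce"
    # Mostly short-lived objects: pauses did not grow, so keep the larger size.
    return "keep"


if __name__ == "__main__":
    baseline = Measurement(young_mb=2048, avg_pause_ms=13, interval_s=20)
    larger = Measurement(young_mb=4096, avg_pause_ms=25, interval_s=40)
    print(judge_increase(baseline, larger))
```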
Things to avoid
- Premature promotion
  - Survivor space that is not large enough
  - Short-lived objects promoted to the old generation
- Promotion failure
  - Not enough space left in the old generation for promotion
- Full GC
  - More headroom is better, as it helps reduce the GC frequency
  - Prevent full GCs by making sure there is enough room for promotions
  - Decide when CMS GC should kick in with CMSInitiatingOccupancyFraction
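One way to pick a value for CMSInitiatingOccupancyFraction is to start from the expected steady-state occupancy of the old generation and add a margin, so that CMS starts before promotions run out of room. The sketch below assumes a JVM that still ships the CMS collector. The function name and the 10-point margin are hypothetical starting values, not recommendations from the article. -XX:+UseCMSInitiatingOccupancyOnly is included so the JVM uses the configured fraction instead of its own heuristics.

```python
# Sketch: derive CMS trigger flags from expected old gen occupancy.
# The margin is an illustrative assumption; tune it from GC logs.

def cms_flags(expected_old_gen_occupancy_percent, margin_percent=10):
    """Return CMS flags that start a cycle slightly above normal occupancy."""
    fraction = expected_old_gen_occupancy_percent + margin_percent
    if not 0 < fraction < 100:
        raise ValueError("initiating occupancy must be between 1 and 99")
    return [
        "-XX:+UseConcMarkSweepGC",
        f"-XX:CMSInitiatingOccupancyFraction={fraction}",
        "-XX:+UseCMSInitiatingOccupancyOnly",
    ]


if __name__ == "__main__":
    # With 50% old generation headroom, data fills about 50% of the old gen.
    print(" ".join(cms_flags(50)))
```

Setting the fraction too low makes CMS run almost continuously, while setting it too high risks a promotion failure and a full GC before the concurrent cycle finishes.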
Filling up the permanent generation will cause a full GC so make sure there is enough space, e.g.: