Pivotal Knowledge Base

Restarting cache server throws a ConcurrentModificationException at remote member

Environment

Product: Pivotal GemFire
Version: 7.x - 8.2.1.1

Overview

This article describes why a java.util.ConcurrentModificationException "at Remote Member" can occur when restarting a GemFire cache server, and which workarounds can be applied to avoid it.

Symptom

When several cache servers are started at the same time, cache server A throws a ConcurrentModificationException while initializing a region, as shown in the log file excerpt below. Cache server A fails to start while the other cache servers start successfully. When cache server A is restarted, it starts successfully.

[info 2016/08/16 09:17:43.152 JST   tid=0x1] Initializing region exampleRegionX

[info 2016/08/16 09:17:43.219 JST   tid=0x1] Region exampleRegionX requesting initial image from 192.168.1.10(27972):10344

[info 2016/08/16 09:17:43.260 JST   tid=0x1] exampleRegionX failed to get image from 192.168.1.10(27972):10344

[warning 2016/08/16 09:17:43.264 JST   tid=0x1] Initialization failed for Region /exampleRegionX
com.gemstone.gemfire.ToDataException: toData failed on DataSerializer with id=0 for class class java.util.HashMap
    at Remote Member '192.168.1.10(27972):10344' in com.gemstone.gemfire.internal.InternalDataSerializer.writeUserObject(InternalDataSerializer.java:1482)
    at Remote Member '192.168.1.10(27972):10344' in com.gemstone.gemfire.internal.InternalDataSerializer.writeWellKnownObject(InternalDataSerializer.java:1411)
    at Remote Member '192.168.1.10(27972):10344' in com.gemstone.gemfire.internal.InternalDataSerializer.basicWriteObject(InternalDataSerializer.java:2203)
    at Remote Member '192.168.1.10(27972):10344' in com.gemstone.gemfire.DataSerializer.writeObject(DataSerializer.java:3179)
    at Remote Member '192.168.1.10(27972):10344' in com.gemstone.gemfire.internal.util.BlobHelper.serializeTo(BlobHelper.java:65)
    at Remote Member '192.168.1.10(27972):10344' in com.gemstone.gemfire.internal.cache.AbstractRegionEntry.fillInValue(AbstractRegionEntry.java:342)
    at Remote Member '192.168.1.10(27972):10344' in com.gemstone.gemfire.internal.cache.InitialImageOperation$RequestImageMessage.chunkEntries(InitialImageOperation.java:1959)
    at Remote Member '192.168.1.10(27972):10344' in com.gemstone.gemfire.internal.cache.InitialImageOperation$RequestImageMessage.process(InitialImageOperation.java:1741)
    at Remote Member '192.168.1.10(27972):10344' in com.gemstone.gemfire.distributed.internal.DistributionMessage.scheduleAction(DistributionMessage.java:386)
    at Remote Member '192.168.1.10(27972):10344' in com.gemstone.gemfire.distributed.internal.DistributionMessage$1.run(DistributionMessage.java:457)
    at Remote Member '192.168.1.10(27972):10344' in java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
    at Remote Member '192.168.1.10(27972):10344' in java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
    at Remote Member '192.168.1.10(27972):10344' in com.gemstone.gemfire.distributed.internal.DistributionManager.runUntilShutdown(DistributionManager.java:692)
    at Remote Member '192.168.1.10(27972):10344' in com.gemstone.gemfire.distributed.internal.DistributionManager$5$1.run(DistributionManager.java:1000)
    at Remote Member '192.168.1.10(27972):10344' in java.lang.Thread.run(Thread.java:745)
    at com.gemstone.gemfire.distributed.internal.ReplyException.handleAsUnexpected(ReplyException.java:75)
    at com.gemstone.gemfire.internal.cache.InitialImageOperation.getFromOne(InitialImageOperation.java:525)
    at com.gemstone.gemfire.internal.cache.DistributedRegion.getInitialImageAndRecovery(DistributedRegion.java:1421)
    at com.gemstone.gemfire.internal.cache.DistributedRegion.initialize(DistributedRegion.java:1209)
    at com.gemstone.gemfire.internal.cache.GemFireCacheImpl.createVMRegion(GemFireCacheImpl.java:2983)
    at com.gemstone.gemfire.internal.cache.GemFireCacheImpl.basicCreateRegion(GemFireCacheImpl.java:2880)
    at com.gemstone.gemfire.internal.cache.GemFireCacheImpl.createRegion(GemFireCacheImpl.java:2869)
    at com.gemstone.gemfire.cache.RegionFactory.create(RegionFactory.java:841)
    at com.customer.framework.cache.impl.gemfire.CacheServerCacheManager.afterConnect(CacheServerCacheManager.java:147)
    at com.customer.framework.cache.impl.gemfire.GemFireCacheManager.<init>(GemFireCacheManager.java:118)
    at com.customer.framework.cache.impl.gemfire.CacheServerCacheManager.<init>(CacheServerCacheManager.java:95)
    at com.customer.framework.cache.impl.gemfire.GemFireCacheManager.doInit(GemFireCacheManager.java:73)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:497)
    at com.customer.framework.cache.CacheManager.init(CacheManager.java:83)
    at com.customer.framework.process.Server.start(Server.java:139)
    at com.customer.framework.process.Server.execute(Server.java:104)
    at com.customer.framework.process.CacheServer.main(CacheServer.java:31)
Caused by: java.util.ConcurrentModificationException
    at Remote Member '192.168.1.10(27972):10344' in java.util.HashMap$HashIterator.nextNode(HashMap.java:1429)
    at Remote Member '192.168.1.10(27972):10344' in java.util.HashMap$EntryIterator.next(HashMap.java:1463)
    at Remote Member '192.168.1.10(27972):10344' in java.util.HashMap$EntryIterator.next(HashMap.java:1461)
    at Remote Member '192.168.1.10(27972):10344' in com.gemstone.gemfire.DataSerializer.writeHashMap(DataSerializer.java:2603)
    at Remote Member '192.168.1.10(27972):10344' in com.gemstone.gemfire.internal.InternalDataSerializer$32.toData(InternalDataSerializer.java:508)
    at Remote Member '192.168.1.10(27972):10344' in com.gemstone.gemfire.internal.InternalDataSerializer.writeUserObject(InternalDataSerializer.java:1451)
    at Remote Member '192.168.1.10(27972):10344' in com.gemstone.gemfire.internal.InternalDataSerializer.writeWellKnownObject(InternalDataSerializer.java:1411)
    at Remote Member '192.168.1.10(27972):10344' in com.gemstone.gemfire.internal.InternalDataSerializer.basicWriteObject(InternalDataSerializer.java:2203)
    at Remote Member '192.168.1.10(27972):10344' in com.gemstone.gemfire.DataSerializer.writeObject(DataSerializer.java:3179)
    at Remote Member '192.168.1.10(27972):10344' in com.gemstone.gemfire.internal.util.BlobHelper.serializeTo(BlobHelper.java:65)
    at Remote Member '192.168.1.10(27972):10344' in com.gemstone.gemfire.internal.cache.AbstractRegionEntry.fillInValue(AbstractRegionEntry.java:342)
    at Remote Member '192.168.1.10(27972):10344' in com.gemstone.gemfire.internal.cache.InitialImageOperation$RequestImageMessage.chunkEntries(InitialImageOperation.java:1959)
    at Remote Member '192.168.1.10(27972):10344' in com.gemstone.gemfire.internal.cache.InitialImageOperation$RequestImageMessage.process(InitialImageOperation.java:1741)
    at Remote Member '192.168.1.10(27972):10344' in com.gemstone.gemfire.distributed.internal.DistributionMessage.scheduleAction(DistributionMessage.java:386)
    at Remote Member '192.168.1.10(27972):10344' in com.gemstone.gemfire.distributed.internal.DistributionMessage$1.run(DistributionMessage.java:457)
    at Remote Member '192.168.1.10(27972):10344' in java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
    at Remote Member '192.168.1.10(27972):10344' in java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
    at Remote Member '192.168.1.10(27972):10344' in com.gemstone.gemfire.distributed.internal.DistributionManager.runUntilShutdown(DistributionManager.java:692)
    at Remote Member '192.168.1.10(27972):10344' in com.gemstone.gemfire.distributed.internal.DistributionManager$5$1.run(DistributionManager.java:1000)
    at Remote Member '192.168.1.10(27972):10344' in java.lang.Thread.run(Thread.java:745)

[info 2016/08/16 09:17:43.358 JST   tid=0xe] VM is exiting - shutting down distributed system

Cause

java.util.ConcurrentModificationException is a common exception when working with Java collection classes such as HashMap. It can be thrown in multi-threaded as well as single-threaded code. A typical case is one thread structurally modifying a collection while another thread is traversing it with an iterator: the next call to iterator.next() then fails.
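As an illustration of the fail-fast behavior described above, here is a minimal single-threaded sketch in plain Java. It does not use GemFire, and the class and method names (CmeDemo, modifyWhileIterating, modifySnapshotWhileIterating) are hypothetical names chosen for this example only.

```java
import java.util.ArrayList;
import java.util.ConcurrentModificationException;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class CmeDemo {

    /** Returns true if adding a key during iteration triggers the exception. */
    static boolean modifyWhileIterating() {
        Map<String, Integer> map = new HashMap<>();
        map.put("a", 1);
        map.put("b", 2);
        map.put("c", 3);
        try {
            for (Map.Entry<String, Integer> e : map.entrySet()) {
                // Adding a new key is a structural modification made outside
                // the iterator, so the iterator's next() call detects it.
                map.put(e.getKey() + "-copy", e.getValue());
            }
        } catch (ConcurrentModificationException ex) {
            return true;
        }
        return false;
    }

    /** Iterates over a snapshot of the keys, so the map can be changed safely. */
    static int modifySnapshotWhileIterating() {
        Map<String, Integer> map = new HashMap<>();
        map.put("a", 1);
        map.put("b", 2);
        map.put("c", 3);
        List<String> snapshot = new ArrayList<>(map.keySet());
        for (String key : snapshot) {
            map.put(key + "-copy", map.get(key));
        }
        return map.size();
    }

    public static void main(String[] args) {
        System.out.println("CME thrown: " + modifyWhileIterating());
        System.out.println("Size after snapshot iteration: " + modifySnapshotWhileIterating());
    }
}
```

The first method hits the exception on the second call to next(), because the map was changed after the iterator was created. The second method avoids it by iterating over a copy of the key set. In the multi-threaded case the same check fails, but only when the timing of the two threads happens to overlap, which is consistent with the server starting normally on a second attempt.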

In the stack trace above, the failing member was requesting an initial image (GII) from remote member 192.168.1.10(27972):10344. The remote member threw java.util.ConcurrentModificationException from java.util.HashMap$HashIterator.nextNode while iterating over the HashMap held in a region entry in order to serialize it, while another thread was changing that HashMap as part of an add/put/destroy/invalidate operation.

Resolution

The ConcurrentModificationException is an expected exception in the described situation.

To avoid this exception and the related issues when starting cache servers, one of the following workarounds can be applied:

Workaround 1

Enable the copy-on-read parameter: 

Using cache.xml: 
<cache copy-on-read="true">

Using the GemFire Java API:
Cache c = CacheFactory.getInstance(system);
c.setCopyOnRead(true);

You can find more details about copy-on-read in the GemFire User's Guide.
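The sketch below may help show why copy-on-read is relevant. With copy-on-read disabled, a get returns a reference to the cached value, so application code that changes the returned HashMap in place is changing the same object that is serialized for the initial image. The example is a plain-Java analogy, not GemFire code: TinyStore is a hypothetical stand-in for a region, and it makes a shallow copy for brevity.

```java
import java.util.HashMap;
import java.util.Map;

class CopyOnReadSketch {

    /** Hypothetical stand-in for a region; this is NOT a GemFire class. */
    static final class TinyStore {
        private final Map<String, HashMap<String, Integer>> entries = new HashMap<>();
        private final boolean copyOnRead;

        TinyStore(boolean copyOnRead) {
            this.copyOnRead = copyOnRead;
        }

        void put(String key, HashMap<String, Integer> value) {
            entries.put(key, value);
        }

        HashMap<String, Integer> get(String key) {
            HashMap<String, Integer> stored = entries.get(key);
            if (stored == null) {
                return null;
            }
            // With copy-on-read the caller gets its own copy (shallow here),
            // otherwise it gets a reference to the stored object itself.
            return copyOnRead ? new HashMap<>(stored) : stored;
        }

        int storedSize(String key) {
            return entries.get(key).size();
        }
    }

    /** Mutates the value returned by get() and reports the stored value's size. */
    static int storedSizeAfterCallerMutation(boolean copyOnRead) {
        TinyStore store = new TinyStore(copyOnRead);
        HashMap<String, Integer> value = new HashMap<>();
        value.put("x", 1);
        store.put("k", value);

        // The application changes the map it got back from the store.
        store.get("k").put("y", 2);
        return store.storedSize("k");
    }

    public static void main(String[] args) {
        System.out.println("copyOnRead=false: " + storedSizeAfterCallerMutation(false));
        System.out.println("copyOnRead=true: " + storedSizeAfterCallerMutation(true));
    }
}
```

Without copying, the caller's change reaches the stored map (size 2). With copying, the stored map is untouched (size 1), so a thread iterating over it would not see a concurrent structural change from that caller. Copying on every read costs extra CPU and memory, so the setting is a trade-off rather than a free fix.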

Workaround 2

Changing the cache servers' start order can also resolve this issue.

 
