Pivotal Knowledge Base

Troubleshooting native memory issues

Applies to

GemFire 6 and later

Purpose

Java application VMs allocate thread stacks in native memory, which lies outside the Java heap. The -Xms and -Xmx VM arguments therefore have no effect on it. The -Xss VM argument sets the thread stack size; its default depends on the operating system and VM version. An application can exhaust native memory with thread allocations while still having plenty of free heap.

Issue

One sign of a native memory issue is an OutOfMemoryError with the message 'unable to create new native thread', thrown either by GemFire or by the application. The error must contain the 'unable to create new native thread' message and not the 'Java heap space' message. See the article Trouble shooting Heap Memory Issues for details on that issue. An example is shown below.

[severe 2008/09/29 10:56:12.919 EDT <Message Dispatcher for 127.0.0.1:2879> tid=0x56f]
Uncaught exception in thread <Message Dispatcher for 127.0.0.1:2879>
Caused by: java.lang.OutOfMemoryError: unable to create new native thread
at java.lang.Thread.start0(Native Method)
at java.lang.Thread.start(Thread.java:597)
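
The distinction between the two messages can also be checked in code, for example inside an uncaught-exception handler. The following is a minimal sketch, not part of GemFire: the class name OomClassifier and its Kind enum are hypothetical, and the matched strings are the two messages quoted in this article. The exact wording can differ between JVM versions, so treat the strings as assumptions to verify against your own logs.

```java
/** Hypothetical helper: classifies an OutOfMemoryError by its message text. */
final class OomClassifier {
    enum Kind { NATIVE_THREAD, JAVA_HEAP, OTHER }

    static Kind classify(OutOfMemoryError error) {
        String message = error.getMessage();
        if (message == null) {
            return Kind.OTHER;
        }
        // Message strings as quoted in this article; wording may vary by JVM version.
        if (message.contains("unable to create new native thread")) {
            return Kind.NATIVE_THREAD;
        }
        if (message.contains("Java heap space")) {
            return Kind.JAVA_HEAP;
        }
        return Kind.OTHER;
    }

    public static void main(String[] args) {
        OutOfMemoryError sample = new OutOfMemoryError("unable to create new native thread");
        System.out.println(classify(sample)); // prints NATIVE_THREAD
    }
}
```
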
Another way to check whether there is a native memory issue is to use either the gemfire stats command or vsd to display the number of threads contained in a given GemFire statistics archive. The vmStats category shows the number of threads in the VM at any time.

The gemfire stats Command

Use the gemfire stats command to display the VMStats threads value contained in a given GemFire statistics archive, as shown below.

$ gemfire stats :VMStats.threads -archive=stats.gfs
[info] Found 1 match for ":VMStats.threads"
vmStats, 5112, VMStats: "2008/09/29 09:37:01.430 EDT" samples=4323
threads threads: samples=4323 min=37 max=127 average=83.2 stddev=6.93
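
When no statistics archive is at hand, a comparable number can be read from inside the VM with the standard java.lang.management API. This is a sketch under assumptions: the class name ThreadCountProbe is hypothetical, and this article does not state how GemFire computes VMStats.threads, so the values may not match the archive exactly.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadMXBean;

/** Hypothetical probe: reads live and peak thread counts of the current VM. */
final class ThreadCountProbe {
    /** Returns { liveThreads, peakThreads } for the VM running this code. */
    static int[] snapshot() {
        ThreadMXBean threads = ManagementFactory.getThreadMXBean();
        int live = threads.getThreadCount();     // daemon and non-daemon threads alive now
        int peak = threads.getPeakThreadCount(); // highest live count since VM start or reset
        return new int[] { live, peak };
    }

    public static void main(String[] args) {
        int[] counts = snapshot();
        System.out.println("live threads: " + counts[0] + ", peak threads: " + counts[1]);
    }
}
```

A peak that keeps climbing over the life of the process is the same warning sign as a rising max in the gemfire stats output.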

Thread Dump

You can dump all the live threads of a running VM using kill -3 <pid>. This does not kill the VM. Instead, it sends SIGQUIT, which signals the VM to dump the current state of all of its live threads. An example is shown below:

[severe 2009/02/20 21:13:10.024 UTC libgemfire.so nid=0x40a18940] SIGQUIT received, dumping threads 
Full thread dump Java HotSpot(TM) 64-Bit Server VM (1.5.0_16-b02 mixed mode):
"Pooled Message Processor548" daemon prio=1 tid=0x00 nid=0x197d in Object.wait()
at java.lang.Object.wait(Native Method)
at java.lang.Object.wait(Object.java:432)
...

"ServerConnection on port 42400 Thread 262" prio=1 tid=0x00 nid=0x1829 in Object.wait()
at java.lang.Object.wait(Native Method)
at java.lang.Object.wait(Object.java:432)
...

"P2P message reader for server(32508):35047/56260 SHARED=false ORDERED=true" daemon prio=1 tid=0x00 nid=0x1800 runnable
at sun.nio.ch.FileDispatcher.read0(Native Method)
at sun.nio.ch.SocketDispatcher.read(SocketDispatcher.java:21)
...
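
A dump with hundreds of threads is easier to read once the threads are grouped by name. The sketch below does this from inside the VM using Thread.getAllStackTraces(). The class name ThreadFamilies and the grouping rule (strip trailing digits and spaces from the thread name) are assumptions chosen to fit names like those in the dump above, not a GemFire convention.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

/** Hypothetical helper: counts live threads per name family. */
final class ThreadFamilies {
    /** "Pooled Message Processor548" becomes "Pooled Message Processor". */
    static String family(String threadName) {
        return threadName.replaceFirst("[\\s\\d]+$", "");
    }

    /** Counts how many of the given thread names fall into each family. */
    static Map<String, Integer> countByFamily(Iterable<String> threadNames) {
        Map<String, Integer> counts = new TreeMap<>();
        for (String name : threadNames) {
            counts.merge(family(name), 1, Integer::sum);
        }
        return counts;
    }

    public static void main(String[] args) {
        List<String> names = new ArrayList<>();
        for (Thread thread : Thread.getAllStackTraces().keySet()) {
            names.add(thread.getName());
        }
        // One line per family: count, tab, family name.
        countByFamily(names).forEach((prefix, count) -> System.out.println(count + "\t" + prefix));
    }
}
```

A family whose count grows between two runs points at the pool or connection type that is accumulating threads.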

Solution

In simple terms, the native memory available is roughly the difference between the physical RAM on the machine and the heap size of the VM(s). It is actually even less than that, since the operating system also uses some of this memory. Given this, there are several ways to eliminate a native memory issue:

  1. Reduce the thread stack size of the VM using -Xss. Something like -Xss256k or -Xss384k should be sufficient.
  2. Reduce the max heap size of the VM using -Xmx. This will provide a greater difference between the physical RAM and the heap and thus more native memory.
  3. If connections are leaking, the steps above only postpone the issue. When connections keep building up, it is important to find the root cause. Connections can build up, for instance, when using a version of Hyperic that isn't supported with the version of GemFire in use.
  4. Add RAM to the machine.
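
To get a feel for what option 1 buys, the stack memory reserved by threads can be estimated as thread count times stack size. The sketch below uses the peak of 127 threads from the gemfire stats sample and the two -Xss values suggested above. The class name StackBudget is hypothetical, and the result is only an upper-bound estimate: it assumes every thread reserves the full -Xss amount and ignores all other native allocations.

```java
/** Hypothetical estimator: worst-case stack reservation for a given thread count. */
final class StackBudget {
    /** Returns threadCount * stackSizeKb, in KB. */
    static long reservedStackKb(int threadCount, int stackSizeKb) {
        return (long) threadCount * stackSizeKb;
    }

    public static void main(String[] args) {
        int threads = 127; // max from the VMStats.threads sample in this article
        for (int stackKb : new int[] { 256, 384 }) {
            System.out.printf("%d threads x %dk = %d KB%n",
                    threads, stackKb, reservedStackKb(threads, stackKb));
        }
        // prints:
        // 127 threads x 256k = 32512 KB
        // 127 threads x 384k = 48768 KB
    }
}
```

The estimate scales linearly with the thread count, which is why a connection leak (item 3) eventually defeats any stack size reduction.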
