Pivotal Knowledge Base

Follow

What are Out of memory / VM Protect errors ?

Goal

The purpose of this article is to describe different types of out of memory messages seen during query processing in Greenplum / HAWQ, possible causes and recommendations to address the cause for the same.

Solution

Primarily, below is the list of different type of out of memory messages which can be found in Greenplum database segments / master logs.

  • VM Protect failed to allocate %d bytes, %d MB available
  • Per-query VM protect limit reached: current limit is %d kB, requested %d bytes, available %d MB
  • VM protect failed to allocate %d bytes from system, VM Protect %d MB available.
  • Failed to allocate memory under virtual memory protection.

Before, we go further, let us first understand some of the below key terms which are essential before targeting a specific issues

1. Virtual Memory - Virtual Memory is the amount of RAM + SWAP configured on the system.

You can use "free" linux command to view the amount of memory available on the server.

Example output

mdw$ free -g
              total           used       free     shared    buffers     cached
Mem:           126        120          5          0          0        115
-/+ buffers/cache:          4        121
Swap:           63          0         63

In the above output, RAM available on the system is 126 GB, SWAP configured on the system is 63 GB. So, the total Virtual Memory available on the system is 126 + 63 = 189 GB.

Not to confuse, however, there are other parameters which controls the amount of actual virtual memory available for the applications (ex: GPDB) use, viz.

-- vm.overcommit_memory (default value: 2)
-- vm.overcommit_ratio (default value: 50%)

So, essentially the available virtual memory on the system for applications (ex: GPDB) based on the default values of above parameters is calculated as below:

Formula: RAM / vm.overcommit_ratio + SWAP
Example based on free output shared above: 126 / (50 / 100) + 63 = 63 + 63 = 126 GB

Thus, while configuring the application you must consider the upper limit of memory as 126 GB and not 189 GB in the above examples.

For details on these parameters, please refer to the below articles: Linux Overcommit strategies and GPDB/HAWQ

2. gp_vmem_protect_limit

Sets the amount of memory (in number of MBs) that all postgres processes of an active primary segment instance can consume. If a query causes this limit to be exceeded, memory will not be allocated and the query will fail.

Refer to greenplum documentation to understand how to calculate this parameter for a GPDB cluster based on the available virtual memory on the servers.

3. gp_vmem_limit_per_query

This GUC is not exposed (for customer or in documentation), so it has to be used cautiously. The purpose of this GUC is to control the amount of Virtual memory which can be consumed by any individual query. If, any query hits the limit defined by gp_vmem_limit_per_query, it will be cancelled.

Now let's move to review the possible causes and recommendations if a specific error message is seen.

Comments

  • Avatar
    Alex Jiang

    Formula: RAM / vm.overcommit_ratio + SWAP
    Example based on free output shared above: 126 / (50 / 100) + 63 = 63 + 63 = 126 GB

    This part I would belive is wrong. It should be "RAM*vm.overcommit_ratio + swap"

    Edited by Alex Jiang
Powered by Zendesk