GPDB queries fail with "ERROR: insufficient memory reserved for statement"

Environment

Product: Pivotal Greenplum (GPDB)
Version: 4.3.x

Symptom

Queries executed in Pivotal Greenplum fail immediately with the error "ERROR:  insufficient memory reserved for statement".

  • The query itself may appear "simple", such as SELECT count(*) FROM <table> WHERE FALSE;
  • Running EXPLAIN on the query also fails with the error "ERROR:  insufficient memory reserved for statement".
  • The table(s) in the query may be heavily partitioned.
  • gp_resqueue_memory_policy is set to auto or eager_free.
  • The query works if gp_resqueue_memory_policy is set to none (the relevant settings can be checked as shown below).
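
To confirm which memory policy and statement memory settings are in effect for the session, both GUCs can be checked from psql (the values shown here are the 4.3.x defaults and are only illustrative):

gpadmin=# SHOW gp_resqueue_memory_policy;
 gp_resqueue_memory_policy
---------------------------
 eager_free
(1 row)

gpadmin=# SHOW statement_mem;
 statement_mem
---------------
 125MB
(1 row)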

Error Message:

Note: These errors are reported directly to the client-side application, such as psql:

  • When gp_resqueue_memory_policy=eager_free, the error reported is:
ERROR: insufficient memory reserved for statement
  • When gp_resqueue_memory_policy=auto, the error reported is:
ERROR:  insufficient memory reserved for statement (memquota.c:466)
  • When gp_resqueue_memory_policy=none, no error is reported and the query works. However, when set to none, memory management is the same as in Greenplum Database releases prior to 4.1, which can cause a query to require additional memory. For more information, refer to http://gpdb.docs.pivotal.io/4380/guc_config-gp_resqueue_memory_policy.html

Note: These log entries are reported in the Greenplum master server log, $MASTER_DATA_DIRECTORY/pg_log/gpdb<date>.csv:

When gp_resqueue_memory_policy=eager_free:

2016-05-30 15:22:12.698359 PDT,"gpadmin","gpadmin",p447793,th968460064,"[local]",,2016-05-30 15:22:07 PDT,699859,con5280,cmd2,seg-1,,dx208993,x699859,sx1,"ERROR","53000","insufficient memory reserved for statement",,,,,,"select count(*) from statement_mem_repro where false;",0,,"memquota.c",810

When gp_resqueue_memory_policy=auto:

2016-05-30 15:23:12.564400 PDT,"gpadmin","gpadmin",p447793,th968460064,"[local]",,2016-05-30 15:22:07 PDT,699985,con5280,cmd5,seg-1,,dx209116,x699985,sx1,"ERROR","XX000","insufficient memory reserved for statement (memquota.c:466)",,,,,,"select count(*) from statement_mem_repro where false;",0,,"memquota.c",466,"Stack trace:
1 0xb03bda postgres <symbol not found> (elog.c:502)
2 0xb05be8 postgres elog_finish (elog.c:1446)
3 0xb52876 postgres PolicyAutoAssignOperatorMemoryKB (memquota.c:466)
4 0x754c0d postgres ExecutorStart (execMain.c:299)
5 0x99f59d postgres PortalStart (pquery.c:847)
6 0x995c03 postgres <symbol not found> (postgres.c:1767)
7 0x999ba6 postgres PostgresMain (postgres.c:4735)
8 0x8f7abe postgres <symbol not found> (postmaster.c:6658)
9 0x8fa840 postgres PostmasterMain (postmaster.c:7589)
10 0x7fc8bf postgres main (main.c:206)
11 0x3266a1ed5d libc.so.6 __libc_start_main (??:0)
12 0x4c4869 postgres <symbol not found> (??:0)

Cause 

The memory quota algorithm assumes that all of the nodes in a plan might need memory, and it tries to allocate a fixed quota (gp_resqueue_memory_policy_auto_fixed_mem) for each operator that is not deemed memory-intensive (e.g., scans).

By default, gp_resqueue_memory_policy_auto_fixed_mem is set to 100kB.

In query plans that have a large number of these scan nodes, such as plans over heavily partitioned tables, the total reserved quota can exceed the memory available to the statement. The query then fails on the master, before being dispatched to any segments, with the error "ERROR:  insufficient memory reserved for statement".
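
A quick way to gauge how many of these per-operator quotas a plan will need is to count the leaf partitions that the legacy planner has to scan. Below is a rough sketch against the example table used in the RCA that follows, using the pg_partitions catalog view; the partitionlevel = 1 filter assumes the two-level (partition plus subpartition) scheme described there, and the expected count reflects the 92 x 13 layout:

gpadmin=# SELECT count(*) AS leaf_partitions
          FROM pg_partitions
          WHERE tablename = 'statement_mem_repro'
            AND partitionlevel = 1;
 leaf_partitions
-----------------
            1196
(1 row)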

RCA 

To illustrate this issue, we will use a table that contains no data but is heavily partitioned (sample DDL that produces an equivalent layout is sketched after the table description):

gpadmin=# \d statement_mem_repro
Table "public.statement_mem_repro"
 Column  |     Type     | Modifiers
---------+--------------+-----------
 id      | bigint       |
 r_year  | numeric(4,0) |
 r_month | integer      |
Number of child tables: 92 (Use \d+ to list them.)
Distributed by: (id)
Partition by: (r_year)
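
For reference, here is a minimal DDL sketch that creates an equivalent empty table. The year range, subpartition names, and bounds are illustrative assumptions chosen only to reproduce the 92 x 13 partition layout used in this RCA:

CREATE TABLE statement_mem_repro (
    id      bigint,
    r_year  numeric(4,0),
    r_month integer
)
DISTRIBUTED BY (id)
PARTITION BY RANGE (r_year)
    SUBPARTITION BY LIST (r_month)
    SUBPARTITION TEMPLATE (
        SUBPARTITION m01 VALUES (1),  SUBPARTITION m02 VALUES (2),
        SUBPARTITION m03 VALUES (3),  SUBPARTITION m04 VALUES (4),
        SUBPARTITION m05 VALUES (5),  SUBPARTITION m06 VALUES (6),
        SUBPARTITION m07 VALUES (7),  SUBPARTITION m08 VALUES (8),
        SUBPARTITION m09 VALUES (9),  SUBPARTITION m10 VALUES (10),
        SUBPARTITION m11 VALUES (11), SUBPARTITION m12 VALUES (12),
        DEFAULT SUBPARTITION other_months
    )
( START (1925) END (2017) EVERY (1) );  -- 92 yearly partitions x 13 subpartitions = 1196 leaf partitions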

The user querying the data is gpadmin, in the resource queue pg_default, with Memory Limit set to -1 (i.e., no limit):

gpadmin=# SELECT * FROM gp_toolkit.gp_resq_role grs JOIN gp_toolkit.gp_resqueue_status grr ON (grr.rsqname=grs.rrrsqname) WHERE rrrolname='gpadmin';
rrrolname | rrrsqname | queueid | rsqname | rsqcountlimit | rsqcountvalue | rsqcostlimit | rsqcostvalue | rsqmemorylimit | rsqmemoryvalue | rsqwaiters | rsqholders
-----------+------------+---------+------------+---------------+---------------+--------------+--------------+----------------+----------------+------------+------------
gpadmin | pg_default | 6055 | pg_default | 20 | 0 | -1 | 0 | -1 | 0 | 0 | 0
(1 row)

statement_mem is set to the default of 125MB:

gpadmin=# show statement_mem;
statement_mem
---------------
125MB
(1 row)

Executing a simple SELECT produces the error:

gpadmin=# SELECT count(*) from statement_mem_repro WHERE FALSE;
ERROR: insufficient memory reserved for statement

Increasing statement_mem to 256MB and re-running succeeds:

gpadmin=# set statement_mem='256MB';
SET
gpadmin=# SELECT count(*) from statement_mem_repro WHERE FALSE;
count
-------
0
(1 row)

However, looking at the EXPLAIN ANALYZE output, we see that the query should have had enough memory:

Slice statistics:
(slice0) Executor memory: 9538K bytes.
(slice1) Executor memory: 17442K bytes avg x 2 workers, 17442K bytes max (seg0).

Setting the GUC explain_memory_verbosity=DETAIL, we can see that the query planner is allocating a quota of 100kB to many of the operators, even when an operator uses as little as 2kB. This detail is not shown in the default output.

gpadmin=# SET explain_memory_verbosity=DETAIL;
SET
gpadmin=# EXPLAIN ANALYZE SELECT count(*) FROM statement_mem_repro WHERE FALSE;
                                              QUERY PLAN
---------------------------------------------------------------------------------------------------------
Aggregate (cost=1199.05..1199.06 rows=1 width=8)
Rows out: 1 rows with 64 ms to end, start offset by 28 ms.
Memory: 10K bytes.
-> Gather Motion 2:1 (slice1; segments: 2) (cost=1198.99..1199.03 rows=1 width=8)
Rows out: 2 rows at destination with 60 ms to first row, 64 ms to end, start offset by 28 ms.
slice 1, seg 0
Root: Peak 0K bytes. Quota: 0K bytes.
Top: Peak 0K bytes. Quota: 0K bytes.
Main: Peak 3K bytes. Quota: 0K bytes.
Executor: Peak 1570K bytes. Quota: 0K bytes.
Deserializer: Peak 2168K bytes. Quota: 0K bytes.
Deserializer: Peak 1K bytes. Quota: 0K bytes.
X_Alien: Peak 9K bytes. Quota: 0K bytes.
X_Motion: Peak 4K bytes. Quota: 100K bytes.
X_Agg: Peak 52K bytes. Quota: 100K bytes.
X_Append: Peak 328K bytes. Quota: 100K bytes.
X_Result: Peak 2K bytes. Quota: 100K bytes.
X_TableScan: Peak 34K bytes. Quota: 100K bytes.
X_Result: Peak 2K bytes. Quota: 100K bytes.
X_TableScan: Peak 34K bytes. Quota: 100K bytes.
X_Result: Peak 2K bytes. Quota: 100K bytes.
X_TableScan: Peak 38K bytes. Quota: 100K bytes.
X_Result: Peak 2K bytes. Quota: 100K bytes.
X_TableScan: Peak 34K bytes. Quota: 100K bytes.

<snip>

For our sample table, we have 92 partitions, each with 13 subpartitions, or 1196 leaf partitions in total. In this example each table scan of a partition is allocated 100kB and each result node another 100kB: 1196 x (100kB + 100kB) = 239,200kB, or almost 240MB, just for those operators.
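
The same arithmetic can be run as a quick sanity check directly in psql; 128000kB is simply statement_mem (125MB) expressed in kB, matching the "Memory used: 128000K bytes" line that appears in the EXPLAIN ANALYZE output later in this article:

gpadmin=# SELECT 1196 * (100 + 100) AS required_quota_kb,
                 125 * 1024         AS statement_mem_kb;
 required_quota_kb | statement_mem_kb
-------------------+------------------
            239200 |           128000
(1 row)

Because the roughly 239,200kB of operator quota exceeds the 128,000kB available to the statement, the quota check fails before execution.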

Setting gp_resqueue_memory_policy_auto_fixed_mem to its lowest allowed value of 50kB and resetting statement_mem back to the default (125MB) allows the query to work:

gpadmin=# RESET statement_mem;
RESET
gpadmin=# SHOW statement_mem;
statement_mem
---------------
125MB
(1 row)
gpadmin=# SET gp_resqueue_memory_policy_auto_fixed_mem='50kB';
SET
gpadmin=# SELECT count(*) from statement_mem_repro WHERE FALSE;
count
-------
0
(1 row)

Running EXPLAIN ANALYZE again confirms the quota is now 50K bytes per operator:

gpadmin=# EXPLAIN ANALYZE SELECT count(*) FROM statement_mem_repro WHERE FALSE;
                                              QUERY PLAN
---------------------------------------------------------------------------------------------------------
Aggregate (cost=1199.05..1199.06 rows=1 width=8)
Rows out: 1 rows with 50 ms to end, start offset by 31 ms.
Memory: 10K bytes.
-> Gather Motion 2:1 (slice1; segments: 2) (cost=1198.99..1199.03 rows=1 width=8)
Rows out: 2 rows at destination with 48 ms to first row, 50 ms to end, start offset by 31 ms.
slice 1, seg 0
Root: Peak 0K bytes. Quota: 0K bytes.
Top: Peak 0K bytes. Quota: 0K bytes.
Main: Peak 3K bytes. Quota: 0K bytes.
Executor: Peak 1570K bytes. Quota: 0K bytes.
Deserializer: Peak 2168K bytes. Quota: 0K bytes.
Deserializer: Peak 1K bytes. Quota: 0K bytes.
X_Alien: Peak 9K bytes. Quota: 0K bytes.
X_Motion: Peak 4K bytes. Quota: 50K bytes.
X_Agg: Peak 52K bytes. Quota: 50K bytes.
X_Append: Peak 328K bytes. Quota: 50K bytes.
X_Result: Peak 2K bytes. Quota: 50K bytes.
X_TableScan: Peak 34K bytes. Quota: 50K bytes.
X_Result: Peak 2K bytes. Quota: 50K bytes.
X_TableScan: Peak 34K bytes. Quota: 50K bytes.
X_Result: Peak 2K bytes. Quota: 50K bytes.
X_TableScan: Peak 38K bytes. Quota: 50K bytes.
X_Result: Peak 2K bytes. Quota: 50K bytes.
X_TableScan: Peak 34K bytes. Quota: 50K bytes.
X_Result: Peak 2K bytes. Quota: 50K bytes.
X_TableScan: Peak 34K bytes. Quota: 50K bytes.
X_Result: Peak 2K bytes. Quota: 50K bytes.
X_TableScan: Peak 34K bytes. Quota: 50K bytes.
X_Result: Peak 2K bytes. Quota: 50K bytes.
X_TableScan: Peak 34K bytes. Quota: 50K bytes.
X_Result: Peak 2K bytes. Quota: 50K bytes.

<snip>

This particular table is heavily partitioned, which the Pivotal Optimizer handles more efficiently. After resetting the previous settings and enabling the Pivotal Optimizer with optimizer=on, the query works with no errors:

gpadmin=# RESET gp_resqueue_memory_policy_auto_fixed_mem;
RESET
gpadmin=# SHOW gp_resqueue_memory_policy_auto_fixed_mem;
gp_resqueue_memory_policy_auto_fixed_mem
------------------------------------------
100kB
(1 row)
gpadmin=# SET optimizer=on;
SET
gpadmin=# SELECT count(*) FROM statement_mem_repro WHERE FALSE;
count
-------
0
(1 row)

Reviewing the EXPLAIN ANALYZE output, we see that the Pivotal Optimizer does not create a plan that requires any scans for a query such as this:

gpadmin=# EXPLAIN ANALYZE SELECT count(*) FROM statement_mem_repro WHERE FALSE;
QUERY PLAN
-----------------------------------------------------------------------------------------------------
Aggregate (cost=0.00..0.00 rows=0 width=8)
Rows out: 1 rows with 0.003 ms to end, start offset by 0.062 ms.
Memory: 18K bytes.
-> Result (cost=0.00..0.00 rows=0 width=1)
One-Time Filter: false
Rows out: 0 rows with 0.001 ms to end, start offset by 0.062 ms.
Memory: 18K bytes.
Slice statistics:
(slice0) Executor memory: 29K bytes. Peak memory: 19216K bytes. Vmem reserved: 22528K bytes.
Statement statistics:
Memory used: 128000K bytes
Settings: optimizer=on
Optimizer status: PQO version 1.625
Total runtime: 0.066 ms
(14 rows)

 

Resolution

As shown in the RCA section, there are several resolution paths that will stop this error from occurring, including:

  • Setting the GUC gp_resqueue_memory_policy_auto_fixed_mem to a lower value (but not so low that it causes other problems).
  • Setting the GUC optimizer=on to enable the Pivotal Optimizer, which handles these memory quotas better than the legacy planner.
  • Setting the GUC statement_mem to a higher value.
  • Placing the user(s) executing the queries in a resource queue with a MEMORY_LIMIT set to an appropriate level (see the sketch after this list).
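
A sketch of the last two options follows; the statement_mem value, queue name, limits, and role name are illustrative assumptions, not recommendations for any particular system:

-- Session-level: raise statement_mem for the current session only.
SET statement_mem = '256MB';

-- Queue-level: create a resource queue with an explicit memory limit and
-- assign the affected role to it (queue and role names are hypothetical).
CREATE RESOURCE QUEUE adhoc_queue WITH (ACTIVE_STATEMENTS=10, MEMORY_LIMIT='2000MB');
ALTER ROLE report_user RESOURCE QUEUE adhoc_queue;

When a queue has a MEMORY_LIMIT, the memory allotted to each statement generally comes from the queue (roughly MEMORY_LIMIT divided by ACTIVE_STATEMENTS) rather than from the statement_mem default.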
