Pivotal Knowledge Base

Pivotal Greenplum segments go down frequently with OOM errors

Environment

Product: Pivotal Greenplum (GPDB) 4.3.x
OS: RHEL 6.x

Symptom

Segments in a GPDB cluster go down frequently. Many "out of memory" messages appear in the logs, and core files are dumped from the backend processes.
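
A quick way to confirm which segments are down is to check the cluster state from the master host, for example (mdw below is only a placeholder for the master host name):

# Run as gpadmin on the master host; reports segments with status problems
[gpadmin@mdw ~]$ gpstate -e

# List segments currently marked down in the catalog
[gpadmin@mdw ~]$ psql -d postgres -c "SELECT * FROM gp_segment_configuration WHERE status = 'd';"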

Error Message:

1. Many of the following messages in segment logs:

2016-10-09 10:56:06.870296 CST,"search58","dp_bidb",p27190,th624351008,"172.28.8.250","31102",2016-10-09 10:53:34 CST,658511701,con885474,cmd949,seg0,slice620,dx1219377,x658511701,sx1,"LOG","00000","One or more query execution processes ran out of memory on this segment. Logging memory usage.",,,,,,,0,,,,

2016-10-09 10:56:10.552042 CST,,,p103581,th624351008,,,,0,,,seg-1,,,,,"LOG","00000","could not fork new process for connection: Cannot allocate memory",,,,,,,0,,"postmaster.c",6678,

2016-10-09 10:56:39.373539 CST,,,p103581,th624351008,,,,0,,,seg-1,,,,,"LOG","00000","server process (PID 27190) was terminated by signal 11: Segmentation fault",,,,,,,0,,"postmaster.c",5882,
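
These messages can be located by searching the pg_log directory of each affected segment instance; the data directory path below is only an example and should be replaced with the actual segment data directory:

# Search one segment's server logs for the OOM and fork-failure messages
[gpadmin@sdw1 ~]$ grep "ran out of memory" /data1/primary/gpseg0/pg_log/gpdb-2016-10-09_*.csv
[gpadmin@sdw1 ~]$ grep "could not fork new process" /data1/primary/gpseg0/pg_log/gpdb-2016-10-09_*.csv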

2. Core files were dumped from the backend processes, and the backtrace contained an unusually large number of frames (250 frames in this example):

[gpadmin@sdw1 user-processed]$ /usr/bin/gdb /usr/local/greenplum-db/bin/postgres core.postgres.16158.1476060400.11.500.500
Loaded symbols for /lib64/libnss_files.so.2
Core was generated by `postgres: port 40001, search58 dp_bidb 172.28.8.250(2383) con47210 seg1 cmd395'.
Program terminated with signal 11, Segmentation fault.
#0 write_message_to_server_log (elevel=15, sqlerrcode=0, message=0x7fff19ec4eb0 "One or more query execution processes ran out of memory on this segment. Logging memory usage.", detail=0x0, hint=0x0, query_text=0x0, cursorpos=0, internalpos=0, internalquery=0x0, context=0x0, funcname=0x0, show_funcname=0 '\000', filename=0x0, lineno=0, stacktracesize=0, omit_location=1 '\001', send_alert=0 '\000', stacktracearray=0x0, printstack=0 '\000') at elog.c:3089
3089 elog.c: No such file or directory.
in elog.c
Missing separate debuginfos, use: debuginfo-install audit-libs-2.1.3-3.el6.x86_64 glibc-2.12-1.80.el6_3.6.x86_64 libgcc-4.4.6-4.el6.x86_64 libidn-1.18-2.el6.x86_64 nss-softokn-freebl-3.12.9-11.el6.x86_64 pam-1.1.1-10.el6_2.1.x86_64
(gdb) bt
#0 write_message_to_server_log (elevel=15, sqlerrcode=0, message=0x7fff19ec4eb0 "One or more query execution processes ran out of memory on this segment. Logging memory usage.", detail=0x0, hint=0x0, query_text=0x0, cursorpos=0, internalpos=0, internalquery=0x0, context=0x0, funcname=0x0, show_funcname=0 '\000', filename=0x0, lineno=0, stacktracesize=0, omit_location=1 '\001', send_alert=0 '\000', stacktracearray=0x0, printstack=0 '\000') at elog.c:3089
#1 0x0000000000abc9eb in write_stderr (fmt=<value optimized out>) at elog.c:3918
#2 0x0000000000abc543 in gp_write_pipe_chunk (elevel=<value optimized out>, sqlerrcode=<value optimized out>, message=<value optimized out>, detail=<value optimized out>, hint=<value optimized out>, query_text=<value optimized out>, cursorpos=0, internalpos=0, internalquery=0x0, context=0x0, funcname=0x0, show_funcname=0 '\000', filename=0x0, lineno=0, stacktracesize=0, omit_location=1 '\001', send_alert=0 '\000', stacktracearray=0x0, printstack=0 '\000') at elog.c:2570
#3 write_message_to_server_log (elevel=<value optimized out>, sqlerrcode=<value optimized out>, message=<value optimized out>, detail=<value optimized out>, hint=<value optimized out>, query_text=<value optimized out>, cursorpos=0, internalpos=0, internalquery=0x0, context=0x0, funcname=0x0, show_funcname=0 '\000', filename=0x0, lineno=0, stacktracesize=0, omit_location=1 '\001', send_alert=0 '\000', stacktracearray=0x0, printstack=0 '\000') at elog.c:3214
#4 0x0000000000abc9eb in write_stderr (fmt=<value optimized out>) at elog.c:3918
#5 0x0000000000afc0ed in MemoryContext_LogContextStats (topContext=<value optimized out>, rootContext=0x303cf10, topContextName=<value optimized out>, nameBuffer=0x7fff19ec77e0 "", currentAvailableTop=<value optimized out>, allAllocatedTop=<value optimized out>, allFreedTop=74752, maxHeldTop=<value optimized out>, nChunksTop=<value optimized out>, nBlocksTop=<value optimized out>, nameBufferSize=<value optimized out>) at mcxt.c:609
#6 MemoryContextStats_recur (topContext=<value optimized out>, rootContext=0x303cf10, topContextName=<value optimized out>, nameBuffer=0x7fff19ec77e0 "", currentAvailableTop=<value optimized out>, allAllocatedTop=<value optimized out>, allFreedTop=74752, maxHeldTop=<value optimized out>, nChunksTop=<value optimized out>, nBlocksTop=<value optimized out>, nameBufferSize=<value optimized out>) at mcxt.c:709
#7 0x0000000000afc452 in MemoryContextStats_recur (context=0x303cf10) at mcxt.c:740
#8 MemoryContextStats (context=0x303cf10) at mcxt.c:804
#9 0x0000000000abc554 in gp_write_pipe_chunk (elevel=<value optimized out>, sqlerrcode=<value optimized out>, message=<value optimized out>, detail=<value optimized out>, hint=<value optimized out>, query_text=<value optimized out>, cursorpos=0, internalpos=0, internalquery=0x0, context=0x0, funcname=0x0, show_funcname=0 '\000', filename=0x0, lineno=0, stacktracesize=0, omit_location=1 '\001', send_alert=0 '\000', stacktracearray=0x0, printstack=0 '\000') at elog.c:2570
#10 write_message_to_server_log (elevel=<value optimized out>, sqlerrcode=<value optimized out>, message=<value optimized out>, detail=<value optimized out>, hint=<value optimized out>, query_text=<value optimized out>, cursorpos=0, internalpos=0, internalquery=0x0, context=0x0, funcname=0x0, show_funcname=0 '\000', filename=0x0, lineno=0, stacktracesize=0, omit_location=1 '\001', send_alert=0 '\000', stacktracearray=0x0, printstack=0 '\000') at elog.c:3214
#11 0x0000000000abc9eb in write_stderr (fmt=<value optimized out>) at elog.c:3918
#12 0x0000000000b00ec8 in MemoryAccountToLog (memoryAccount=<value optimized out>, context=0x7fff19ec9640, depth=<value optimized out>, parentWalkSerial=0, curWalkSerial=434917040) at memaccounting.c:932
#13 0x0000000000aff0e1 in MemoryAccountWalkNode () at memaccounting.c:795
#14 MemoryAccountWalkKids () at memaccounting.c:837
#15 MemoryAccountWalkNode () at memaccounting.c:800
......
#248 ServerLoop () at postmaster.c:2480
#249 0x00000000008b6d10 in PostmasterMain (argc=15, argv=0x303bd10) at postmaster.c:1557
#250 0x00000000007b6d4f in main (argc=15, argv=0x303bca0) at main.c:206
(gdb)
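
As a side note, the backtrace can also be captured non-interactively with gdb's batch mode, which makes it easy to save it to a file and count the frames; the binary and core file names are taken from the example above:

# Dump the full backtrace to a file, then count the number of frames
[gpadmin@sdw1 user-processed]$ gdb -batch -ex bt /usr/local/greenplum-db/bin/postgres core.postgres.16158.1476060400.11.500.500 > bt.txt
[gpadmin@sdw1 user-processed]$ grep -c '^#' bt.txt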

Cause 

System memory was exhausted due to an extremely complicated query plan.

RCA

Although the system parameter gp_vmem_protect_limit was configured properly, so the cluster would not over-subscribe OS memory, one query joined a large number of partitioned tables. The plan produced for it by the legacy planner (optimizer=off) generated a very large number of slices.
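
The plan file q3.plan examined below was presumably captured by running EXPLAIN on the problem query with the legacy planner, along these lines (the query itself is not reproduced here and is shown as a placeholder):

# Capture the legacy-planner plan for the problem query into q3.plan
[gpadmin@scripts[#39622] 39622]$ psql -d dp_bidb > q3.plan <<'EOF'
SET optimizer = off;
EXPLAIN <problem query>;
EOF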

[gpadmin@scripts[#39622] 39622]$ grep _prt_ q3.plan |more
-> Seq Scan on t_srv_waybill_1_prt_extra wb (cost=0.00..5728589697836.30 rows=848074 width=71)
-> Seq Scan on t_srv_actual_freight_1_prt_extra fr (cost=0.00..58654.71 rows=1 width=0)
-> Seq Scan on t_srv_actual_freight_1_prt_2 fr (cost=0.00..2895.16 rows=1 width=0)
......

[gpadmin@scripts[#39622] 39622]$ grep _prt_ q3.plan | grep "Seq Scan" | wc -l
1600

[gpadmin@scripts[#39622] 39622]$ grep slice157[0-9] q3.plan
-> Redistribute Motion 16:16 (slice1577; segments: 16) (cost=1308.65..42060275543679.10 rows=1 width=1770)
......

Because each slice is executed by a separate backend process, each segment instance had to fork more than 1,500 backend processes for this query. On a segment host running multiple segment instances, this produced an enormous number of processes and eventually exhausted the OS memory. That is why the following error was seen in the logs and the backend process crashed with signal 11:

"could not fork new process for connection: Cannot allocate memory"

However, when ORCA was enabled (optimizer=on) and the query was run again, the plan was much simpler and required only a few slices to execute, because ORCA does not generate a dedicated slice for each partition table involved in the query.

[gpadmin@scripts[#39622] 39622]$ grep slice q3_new.plan 
-> Redistribute Motion 16:16 (slice12; segments: 16) (cost=0.00..8535892.90 rows=170 width=61)
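
The difference in plan complexity can be quantified by counting the distinct slices in each plan file, e.g.:

# Count distinct slices in the legacy-planner plan vs. the ORCA plan
[gpadmin@scripts[#39622] 39622]$ grep -o "slice[0-9]*" q3.plan | sort -u | wc -l
[gpadmin@scripts[#39622] 39622]$ grep -o "slice[0-9]*" q3_new.plan | sort -u | wc -l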

Resolution

There are several options for resolving this issue:

  1. Optimize the query to reduce the number of JOIN operations. Refer to the Greenplum Database Best Practices documentation for guidance on improving JOIN performance.
  2. Try enabling ORCA (set optimizer=on) at the session level before submitting the query, as shown in the example below.
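
For option 2, a minimal sketch of enabling ORCA for a single session and verifying the simpler plan before running the query (the query itself is a placeholder):

[gpadmin@mdw ~]$ psql -d dp_bidb
dp_bidb=# SET optimizer = on;       -- enable ORCA for this session only
dp_bidb=# SHOW optimizer;           -- confirm the setting took effect
dp_bidb=# EXPLAIN <problem query>;  -- check that the new plan needs only a few slices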
