Pivotal Knowledge Base

Greenplum Database start issues and Segment recovery

Environment

Product Version
Pivotal Greenplum (GPDB) 4.3.x
OS RHEL 6.x

Goal

This article helps Greenplum administrators check and fix common startup issues, and explains how to run segment recovery.

Symptom

This article has two sections. Section A describes common startup issues and Section B provides guidance on segment recovery.

A. Database does not start and shows an error like the following:

20160815:13:33:14:002998 gpstart:mdw:gpadmin-[CRITICAL]:-Error occurred: non-zero rc: 1 
Command was: 'env GPSESSID=0000000000 GPERA=None $GPHOME/bin/pg_ctl -D /data/master/gpseg-1 -l /data/master/gpseg-1/pg_log/startup.log -w -t 600 -o " -p 5432 -b 1 -z 0 --silent-mode=true -i -M master -C -1 -x 690 -c gp_role=utility " start' 
rc=1, stdout='waiting for server to start...... stopped waiting
', stderr='pg_ctl: PID file "/data/master/gpseg-1/postmaster.pid" does not exist
pg_ctl: could not start server
Examine the log output.

Cause

This can result from configuration changes made to the pg_hba.conf or postgresql.conf files. If you are aware of changes made to either file, review them and make sure they are valid. To test, comment out the changes and check whether gpstart works again.
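One way to narrow this down is to read the startup log named in the error and to see whether either configuration file was modified recently. The following is a minimal sketch, assuming a POSIX shell with GNU find on the master host. The function names are illustrative and not part of Greenplum, and the data directory is passed in explicitly.

```shell
# List pg_hba.conf / postgresql.conf in a data directory if they were
# modified within the last N days (default 1). Hypothetical helper.
recent_config_changes() {
    data_dir="$1"
    days="${2:-1}"
    find "$data_dir" -maxdepth 1 -type f \
        \( -name pg_hba.conf -o -name postgresql.conf \) \
        -mtime "-$days" | sort
}

# Print the last lines (default 50) of the startup log that the
# gpstart error message points to. Hypothetical helper.
show_startup_log() {
    data_dir="$1"
    tail -n "${2:-50}" "$data_dir/pg_log/startup.log"
}
```

For the error shown above, the data directory would be /data/master/gpseg-1, for example `show_startup_log /data/master/gpseg-1` followed by `recent_config_changes /data/master/gpseg-1 2`.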

Resolution

https://discuss.pivotal.io/hc/en-us/articles/204415663-gpstart-fails-to-start-the-database-after-changing-parameters-using-gpconfig-

https://discuss.pivotal.io/hc/en-us/articles/202263253-gpstart-fails-due-to-invalid-IP-mask-found-under-Master-log

Search the Pivotal knowledge base for other common startup issues not mentioned above. If the articles do not help and the database is still down, contact Pivotal Support and provide the logs related to the error.

B. gpstate shows segments down 

Resolution Steps

1. Use "gpstate -e" to check which segments are down. In the case below, all the segments on sdw2 went down because of a server replacement.

20160923:15:53:38:007795 gpstate:mdw:gpadmin-[INFO]:-Starting gpstate with args: -e
20160923:15:53:38:007795 gpstate:mdw:gpadmin-[INFO]:-local Greenplum Version: 'postgres (Greenplum Database) 4.3.5.1 build 1'
20160923:15:53:38:007795 gpstate:mdw:gpadmin-[INFO]:-master Greenplum Version: 'PostgreSQL 8.2.15 (Greenplum Database 4.3.5.1 build 1) on x86_64-unknown-linux-gnu, compiled by GCC gcc (GCC) 4.4.2 compiled on May 14 2015 14:07:14'
20160923:15:53:38:007795 gpstate:mdw:gpadmin-[INFO]:-Obtaining Segment details from master...
20160923:15:53:39:007795 gpstate:mdw:gpadmin-[INFO]:-Gathering data from segments...
.. 
20160923:15:53:41:007795 gpstate:mdw:gpadmin-[INFO]:-----------------------------------------------------
20160923:15:53:41:007795 gpstate:mdw:gpadmin-[INFO]:-Segment Mirroring Status Report
20160923:15:53:41:007795 gpstate:mdw:gpadmin-[INFO]:-----------------------------------------------------
20160923:15:53:41:007795 gpstate:mdw:gpadmin-[INFO]:-Segments with Primary and Mirror Roles Switched
20160923:15:53:41:007795 gpstate:mdw:gpadmin-[INFO]:-   Current Primary   Port    Mirror   Port
20160923:15:53:41:007795 gpstate:mdw:gpadmin-[INFO]:-   sdw3-2            50000   sdw2-1   40000
20160923:15:53:41:007795 gpstate:mdw:gpadmin-[INFO]:-   sdw3-2            50001   sdw2-1   40001
20160923:15:53:41:007795 gpstate:mdw:gpadmin-[INFO]:-   sdw3-2            50002   sdw2-1   40002
20160923:15:53:41:007795 gpstate:mdw:gpadmin-[INFO]:-   sdw3-1            50003   sdw2-2   40003
20160923:15:53:41:007795 gpstate:mdw:gpadmin-[INFO]:-   sdw3-1            50004   sdw2-2   40004
20160923:15:53:41:007795 gpstate:mdw:gpadmin-[INFO]:-   sdw3-1            50005   sdw2-2   40005
20160923:15:53:41:007795 gpstate:mdw:gpadmin-[INFO]:-----------------------------------------------------
20160923:15:53:41:007795 gpstate:mdw:gpadmin-[INFO]:-Primaries in Change Tracking
20160923:15:53:41:007795 gpstate:mdw:gpadmin-[INFO]:-   Current Primary   Port    Change tracking size   Mirror   Port
20160923:15:53:41:007795 gpstate:mdw:gpadmin-[INFO]:-   sdw1-2            40003   188 MB                 sdw2-1   50003
20160923:15:53:41:007795 gpstate:mdw:gpadmin-[INFO]:-   sdw1-2            40004   219 MB                 sdw2-1   50004
20160923:15:53:41:007795 gpstate:mdw:gpadmin-[INFO]:-   sdw1-2            40005   202 MB                 sdw2-1   50005
20160923:15:53:41:007795 gpstate:mdw:gpadmin-[INFO]:-   sdw3-1            50005   200 MB                 sdw2-2   40005
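On a large cluster it can help to reduce this report to the list of mirrors that are down. The sketch below assumes the "Primaries in Change Tracking" rows have the same column layout as the sample output above (timestamp, log prefix, primary host, port, size, unit, mirror host, port). The function name is illustrative, and the layout may differ in other Greenplum versions.

```shell
# Read "gpstate -e" output on stdin and print mirror-host:port for each
# primary that is in change tracking. Hypothetical helper; relies on the
# 8-field row layout seen in the sample output.
change_tracking_mirrors() {
    awk '
        /Primaries in Change Tracking/ { in_section = 1; next }
        /----------/                   { in_section = 0 }
        in_section && NF == 8 && $4 ~ /^[0-9]+$/ && $8 ~ /^[0-9]+$/ {
            print $7 ":" $8
        }
    '
}
```

It would be used as `gpstate -e | change_tracking_mirrors`. The header row is skipped because it does not have numeric port fields.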

2. Run recovery.

gprecoverseg -a

If the filesystem for the down segments has been formatted, or a segment server has been replaced along with its drives, incremental recovery of the segments cannot be done and full recovery is needed. Before running full recovery, make sure that the appropriate filesystem partitions and parent folders (/dataX/primary /dataX/mirror) for the data directories have been created, and that the Greenplum binaries have been installed on the server after the filesystem format / OS reimage.
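These prerequisites can be checked on the rebuilt segment host before attempting full recovery. The following is a rough sketch, meant to be run as the Greenplum admin user on that host. The function names are illustrative, and the directories and Greenplum install path are passed in as arguments because they vary per cluster.

```shell
# Report required parent directories that are missing or not writable.
# Returns non-zero if any directory fails the check. Hypothetical helper.
check_parent_dirs() {
    status=0
    for dir in "$@"; do
        if [ ! -d "$dir" ]; then
            echo "MISSING: $dir"
            status=1
        elif [ ! -w "$dir" ]; then
            echo "NOT WRITABLE: $dir"
            status=1
        fi
    done
    return "$status"
}

# Check that pg_ctl is present and executable under the given Greenplum
# install directory (normally $GPHOME). Hypothetical helper.
check_binaries() {
    gphome="$1"
    if [ -x "$gphome/bin/pg_ctl" ]; then
        return 0
    fi
    echo "pg_ctl not found or not executable under $gphome/bin"
    return 1
}
```

A call might look like `check_parent_dirs /data1/primary /data1/mirror && check_binaries "$GPHOME"`, where /data1 stands in for whatever /dataX paths the cluster actually uses.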

Note: Do not run full recovery if the filesystem has not been formatted and incremental recovery is failing. Check the Pivotal knowledge base for an article related to the failure; if none is available, contact Pivotal Support with a Severity 2 ticket.

gprecoverseg -F

3. Check status of recovery using gpstate -e

gpstate -e 

In this case, once recovery completes, the status will look like the output below. If only mirrors were down, the next step can be skipped and the status will be as described in step 5.

20160923:16:23:30:013261 gpstate:mdw:gpadmin-[INFO]:-Starting gpstate with args: -e
20160923:16:23:30:013261 gpstate:mdw:gpadmin-[INFO]:-local Greenplum Version: 'postgres (Greenplum Database) 4.3.5.1 build 1'
20160923:16:23:30:013261 gpstate:mdw:gpadmin-[INFO]:-master Greenplum Version: 'PostgreSQL 8.2.15 (Greenplum Database 4.3.5.1 build 1) on x86_64-unknown-linux-gnu, compiled by GCC gcc (GCC) 4.4.2 compiled on May 14 2015 14:07:14'
20160923:16:23:30:013261 gpstate:mdw:gpadmin-[INFO]:-Obtaining Segment details from master...
20160923:16:23:31:013261 gpstate:mdw:gpadmin-[INFO]:-Gathering data from segments...
.. 
20160923:16:23:34:013261 gpstate:mdw:gpadmin-[INFO]:-----------------------------------------------------
20160923:16:23:34:013261 gpstate:mdw:gpadmin-[INFO]:-Segment Mirroring Status Report
20160923:16:23:34:013261 gpstate:mdw:gpadmin-[INFO]:-----------------------------------------------------
20160923:16:23:34:013261 gpstate:mdw:gpadmin-[INFO]:-Segments with Primary and Mirror Roles Switched
20160923:16:23:34:013261 gpstate:mdw:gpadmin-[INFO]:-   Current Primary   Port    Mirror   Port
20160923:16:23:34:013261 gpstate:mdw:gpadmin-[INFO]:-   sdw3-2            50000   sdw2-1   40000
20160923:16:23:34:013261 gpstate:mdw:gpadmin-[INFO]:-   sdw3-2            50001   sdw2-1   40001
20160923:16:23:34:013261 gpstate:mdw:gpadmin-[INFO]:-   sdw3-2            50002   sdw2-1   40002
20160923:16:23:34:013261 gpstate:mdw:gpadmin-[INFO]:-   sdw3-1            50003   sdw2-2   40003
20160923:16:23:34:013261 gpstate:mdw:gpadmin-[INFO]:-   sdw3-1            50004   sdw2-2   40004
20160923:16:23:34:013261 gpstate:mdw:gpadmin-[INFO]:-   sdw3-1            50005   sdw2-2   40005

4. Rebalance the segments using:

gprecoverseg -ra

Note: A rebalance cancels all running transactions, so schedule a maintenance window if running jobs cannot be cancelled.

5. Check status of recovery during rebalance using gpstate -e

gpstate -e 

When no segments are left to recover or rebalance, the final status from gpstate -e looks like this:

20160923:17:00:08:065488 gpstate:mdw:gpadmin-[INFO]:-Starting gpstate with args: -e
20160923:17:00:08:065488 gpstate:mdw:gpadmin-[INFO]:-local Greenplum Version: 'postgres (Greenplum Database) 4.3.5.1 build 1'
20160923:17:00:08:065488 gpstate:mdw:gpadmin-[INFO]:-master Greenplum Version: 'PostgreSQL 8.2.15 (Greenplum Database 4.3.5.1 build 1) on x86_64-unknown-linux-gnu, compiled by GCC gcc (GCC) 4.4.2 compiled on May 14 2015 14:07:14'
20160923:17:00:08:065488 gpstate:mdw:gpadmin-[INFO]:-Obtaining Segment details from master...
20160923:17:00:11:065488 gpstate:mdw:gpadmin-[INFO]:-Gathering data from segments...
.. 
20160923:17:00:14:065488 gpstate:mdw:gpadmin-[INFO]:-----------------------------------------------------
20160923:17:00:14:065488 gpstate:mdw:gpadmin-[INFO]:-Segment Mirroring Status Report
20160923:17:00:14:065488 gpstate:mdw:gpadmin-[INFO]:-----------------------------------------------------
20160923:17:00:14:065488 gpstate:mdw:gpadmin-[INFO]:-All segments are running normally
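Rather than rerunning gpstate -e by hand, the check can be repeated until the "All segments are running normally" line appears. This is a minimal sketch, assuming gpstate is on the PATH of the current user. The function name, the default interval and the default number of checks are illustrative choices, not Greenplum settings.

```shell
# Poll "gpstate -e" until it reports that all segments are running
# normally, or give up after max_checks attempts. Hypothetical helper.
#   $1 = seconds between checks (default 60)
#   $2 = maximum number of checks (default 60)
wait_for_segments() {
    interval="${1:-60}"
    max_checks="${2:-60}"
    i=0
    while [ "$i" -lt "$max_checks" ]; do
        out=$(gpstate -e 2>&1)
        case "$out" in
            *"All segments are running normally"*)
                echo "All segments are running normally"
                return 0
                ;;
        esac
        i=$((i + 1))
        sleep "$interval"
    done
    echo "Segments not fully recovered after $max_checks checks" >&2
    return 1
}
```

For example, `wait_for_segments 120 30` would check every two minutes for up to an hour.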

RCA

If root cause analysis is needed for the segment down issue, provide the tar archive generated by the command below to Pivotal Support. Refer to gpmt for more information:

gpmt gp_log_collector -failed-segs -start 2016-09-23

Note: Change the date used above to the date when the segments went down.

Search the Pivotal knowledge base to learn more about Greenplum Database or for other issues not mentioned in this article.

Comments

  • Jeffrey Sdoeung

    You can also get the same error message if you run out of space in /tmp.
