Pivotal Knowledge Base

Troubleshooting: gpstart unable to start segments with reason: 'PG_CTL failed.'

Environment

 Product            Version
 Pivotal Greenplum  All
 OS                 RHEL/CentOS

Overview

Typos in configuration parameters, ports that are already in use, and various other issues can prevent segments from coming online. In some of these scenarios, the gpstart utility reports only the error shown in the title, which is not very informative.

This article describes the initial troubleshooting steps for gathering more information about this error so that it can be fixed.

As an example, it walks through a case in which Greenplum fails to start one or more segments because of a bad entry in one or more postgresql.conf files across the segments.

Symptoms

The gpstart utility is unable to start some segments and throws an error similar to the following when it runs:

20161013:08:11:21:454037 gpstart:mdw:gpadmin-[INFO]:-DBID:45  FAILED  host:'sdw2.gphd.local' datadir:'/data1/mirror/gpseg20' with reason:'PG_CTL failed.'
20161013:08:11:21:454037 gpstart:mdw:gpadmin-[INFO]:-DBID:13  FAILED  host:'sdw2.gphd.local' datadir:'/data1/primary/gpseg11' with reason:'PG_CTL failed.'
20161013:08:11:21:454037 gpstart:mdw:gpadmin-[INFO]:-DBID:49  FAILED  host:'sdw2.gphd.local' datadir:'/data2/mirror/gpseg25' with reason:'PG_CTL failed.'
20161013:08:11:21:454037 gpstart:mdw:gpadmin-[INFO]:-DBID:44  FAILED  host:'sdw2.gphd.local' datadir:'/data1/mirror/gpseg21' with reason:'PG_CTL failed.'
20161013:08:11:21:454037 gpstart:mdw:gpadmin-[INFO]:-DBID:12  FAILED  host:'sdw2.gphd.local' datadir:'/data1/primary/gpseg10' with reason:'PG_CTL failed.'
20161013:08:11:21:454037 gpstart:mdw:gpadmin-[INFO]:-DBID:16  FAILED  host:'sdw2.gphd.local' datadir:'/data2/primary/gpseg14' with reason:'PG_CTL failed.'
20161013:08:11:21:454037 gpstart:mdw:gpadmin-[INFO]:-DBID:46  FAILED  host:'sdw2.gphd.local' datadir:'/data2/mirror/gpseg29' with reason:'PG_CTL failed.'
20161013:08:11:21:454037 gpstart:mdw:gpadmin-[INFO]:-DBID:42  FAILED  host:'sdw2.gphd.local' datadir:'/data1/mirror/gpseg7' with reason:'PG_CTL failed.'
20161013:08:11:21:454037 gpstart:mdw:gpadmin-[INFO]:-DBID:43  FAILED  host:'sdw2.gphd.local' datadir:'/data1/mirror/gpseg6' with reason:'PG_CTL failed.'

Depending on the number of segments that fail to start, the database might not start at all. This happens when both the primary and the mirror segment for a given content ID are down.
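The check for a content ID that has lost both its primary and its mirror can be sketched as follows. This is a minimal illustration, not part of gpstart: the function name `contents_fully_down` is hypothetical, and it assumes that data directories follow the layout seen in the sample output (`.../primary/gpseg<content>` and `.../mirror/gpseg<content>`). On a real cluster, the gp_segment_configuration catalog table is the authoritative source for the role and content ID of each segment.

```python
import re
from collections import defaultdict

# Assumption: data directories look like the ones in the sample output,
# i.e. .../primary/gpseg<content> or .../mirror/gpseg<content>.
DATADIR_RE = re.compile(r"/(primary|mirror)/gpseg(\d+)/?$")


def contents_fully_down(failed_datadirs):
    """Return the content IDs for which both primary and mirror failed."""
    roles_by_content = defaultdict(set)
    for datadir in failed_datadirs:
        match = DATADIR_RE.search(datadir)
        if match:
            role, content = match.group(1), int(match.group(2))
            roles_by_content[content].add(role)
    return sorted(
        content
        for content, roles in roles_by_content.items()
        if roles == {"primary", "mirror"}
    )


# Data directories taken from the FAILED lines shown above.
failed = [
    "/data1/mirror/gpseg20", "/data1/primary/gpseg11", "/data2/mirror/gpseg25",
    "/data1/mirror/gpseg21", "/data1/primary/gpseg10", "/data2/primary/gpseg14",
    "/data2/mirror/gpseg29", "/data1/mirror/gpseg7", "/data1/mirror/gpseg6",
]
print(contents_fully_down(failed))
```

For the directories listed in the sample output, no content ID appears as both a primary and a mirror, so the result is an empty list. The full gpstart output may contain failures on other hosts that change this picture.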

Resolution

Gather more information about the error:

  1. Check the gpstart log file (normally under /home/gpadmin/gpAdminLogs) and read the error message it reports. This usually helps identify which specific segments are affected.
  2. Check startup.log in the pg_log directory under the data directory of each affected segment, and look for clues about what happened when gpstart was run.
  3. Check the segment logs in the same pg_log directory for more information about what the segment was doing before it failed to start.
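As a rough aid for steps 1 and 2, the following sketch pulls the failed segments out of gpstart log text and builds the expected path of each startup.log. It is an illustration only: the function name `failed_segments` is hypothetical, the regular expression is based solely on the FAILED lines shown in the Symptoms section, and the log directory name (`pg_log` by default) is an assumption to adjust for your installation.

```python
import re
from pathlib import PurePosixPath

# Matches the FAILED lines printed by gpstart, as shown in the Symptoms section.
FAILED_RE = re.compile(
    r"DBID:(?P<dbid>\d+)\s+FAILED\s+host:'(?P<host>[^']+)'\s+"
    r"datadir:'(?P<datadir>[^']+)'\s+with reason:'(?P<reason>[^']*)'"
)


def failed_segments(gpstart_log_text, log_dir="pg_log"):
    """List failed segments and where their startup.log is expected to be.

    log_dir is an assumption about the installation; change it if the
    segment logs live in a differently named directory.
    """
    segments = []
    for match in FAILED_RE.finditer(gpstart_log_text):
        datadir = match.group("datadir")
        segments.append({
            "dbid": int(match.group("dbid")),
            "host": match.group("host"),
            "datadir": datadir,
            "reason": match.group("reason"),
            "startup_log": str(PurePosixPath(datadir) / log_dir / "startup.log"),
        })
    return segments


sample = """\
20161013:08:11:21:454037 gpstart:mdw:gpadmin-[INFO]:-DBID:45  FAILED  host:'sdw2.gphd.local' datadir:'/data1/mirror/gpseg20' with reason:'PG_CTL failed.'
20161013:08:11:21:454037 gpstart:mdw:gpadmin-[INFO]:-DBID:13  FAILED  host:'sdw2.gphd.local' datadir:'/data1/primary/gpseg11' with reason:'PG_CTL failed.'
"""

for seg in failed_segments(sample):
    print(seg["host"], seg["dbid"], seg["startup_log"])
```

The paths it prints are on the segment host named in each line, not on the master, so the files have to be read there.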

After reviewing those locations, if you are unsure how to proceed, search the Pivotal Knowledge Base for articles that explain the issue and provide a solution. A quick way to find relevant articles is to paste the error message string from the log file into the search box.

If you need help from Pivotal Support, attach the results of your initial analysis and the error messages found in the logs when you create the ticket. This speeds up the troubleshooting process.

Example: If a bad line was added to one or more postgresql.conf files across the cluster, startup.log might contain entries similar to the following:

2016-10-13 11:31:59.399552 GMT,,,p278001,th2045376288,,,,0,,,seg-1,,,,,"FATAL","42601","syntax error in file ""/data1/mirror/gpseg20/postgresql.conf"" line 553, near token ""-""",,,,,,,,"ParseConfigFile","guc-file.l",369,
2016-10-13 11:32:25.338118 GMT,,,p278456,th-958068960,,,,0,,,seg-1,,,,,"FATAL","42601","syntax error in file ""/data1/mirror/gpseg20/postgresql.conf"" line 553, near token ""-""",,,,,,,,"ParseConfigFile","guc-file.l",369,
2016-10-13 11:34:50.139285 GMT,,,p279433,th1119868704,,,,0,,,seg-1,,,,,"FATAL","42601","syntax error in file ""/data1/mirror/gpseg20/postgresql.conf"" line 553, near token ""-""",,,,,,,,"ParseConfigFile","guc-file.l",369,
2016-10-13 11:49:37.776982 GMT,,,p283593,th-170842336,,,,0,,,seg-1,,,,,"FATAL","42601","syntax error in file ""/data1/mirror/gpseg20/postgresql.conf"" line 553, near token ""-""",,,,,,,,"ParseConfigFile","guc-file.l",369,
2016-10-13 11:49:38.095468 GMT,,,p283715,th-1851304160,,,,0,,,seg-1,,,,,"FATAL","42601","syntax error in file ""/data1/mirror/gpseg20/postgresql.conf"" line 553, near token ""-""",,,,,,,,"ParseConfigFile","guc-file.l",369,
2016-10-13 11:55:05.388579 GMT,,,p285528,th-2014513376,,,,0,,,seg-1,,,,,"FATAL","42601","syntax error in file ""/data1/mirror/gpseg20/postgresql.conf"" line 553, near token ""-""",,,,,,,,"ParseConfigFile","guc-file.l",369,
2016-10-13 12:01:40.237405 GMT,,,p288805,th-884390112,,,,0,,,seg-1,,,,,"FATAL","42601","syntax error in file ""/data1/mirror/gpseg20/postgresql.conf"" line 553, near token ""-""",,,,,,,,"ParseConfigFile","guc-file.l",369,
2016-10-13 12:02:00.935112 GMT,,,p289261,th-577505504,,,,0,,,seg-1,,,,,"FATAL","42601","syntax error in file ""/data1/mirror/gpseg20/postgresql.conf"" line 553, near token ""-""",,,,,,,,"ParseConfigFile","guc-file.l",369,

This means that there is a problem on line 553 of the postgresql.conf file of the affected segment (in this case gpseg20). The same entry is repeated once for every start attempt. Fixing that line allows pg_ctl to start this segment successfully the next time gpstart runs.
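Because the same entry repeats for every start attempt, it can help to reduce startup.log to the distinct configuration errors it contains. The sketch below does that for the comma-separated format shown above, using Python's standard csv module to handle the doubled quotes. The function name `config_syntax_errors` is hypothetical, and the pattern covers only the "syntax error in file" message from this example.

```python
import csv
import io
import re

# Matches the message text after CSV unquoting, e.g.
# syntax error in file "/path/postgresql.conf" line 553, near token "-"
SYNTAX_ERROR_RE = re.compile(
    r'syntax error in file "(?P<path>[^"]+)" line (?P<line>\d+), '
    r'near token "(?P<token>.*)"'
)


def config_syntax_errors(startup_log_text):
    """Return distinct (file, line number, token) tuples, in first-seen order."""
    seen = set()
    errors = []
    for row in csv.reader(io.StringIO(startup_log_text)):
        for field in row:
            match = SYNTAX_ERROR_RE.search(field)
            if match:
                key = (match["path"], int(match["line"]), match["token"])
                if key not in seen:
                    seen.add(key)
                    errors.append(key)
    return errors


# Two of the startup.log entries shown above.
sample = r'''2016-10-13 11:31:59.399552 GMT,,,p278001,th2045376288,,,,0,,,seg-1,,,,,"FATAL","42601","syntax error in file ""/data1/mirror/gpseg20/postgresql.conf"" line 553, near token ""-""",,,,,,,,"ParseConfigFile","guc-file.l",369,
2016-10-13 11:32:25.338118 GMT,,,p278456,th-958068960,,,,0,,,seg-1,,,,,"FATAL","42601","syntax error in file ""/data1/mirror/gpseg20/postgresql.conf"" line 553, near token ""-""",,,,,,,,"ParseConfigFile","guc-file.l",369,
'''

for path, line_number, token in config_syntax_errors(sample):
    print(f"{path}: line {line_number}, near token {token!r}")
```

For the two sample entries this reports a single error, in /data1/mirror/gpseg20/postgresql.conf at line 553. That is the line to inspect and correct on the segment host.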
