Pivotal Knowledge Base

gpinitsystem, gpstart, or gpstop hangs and eventually fails with error "could not start server"

Environment

  • Pivotal HDB 1.3.x

Symptom

gpinitsystem, gpstart, or gpstop hangs and eventually fails with the following generic error:

waiting for server to start...................................................................................................................................................................................................................................could not start server

Cause

In this case, the customer's gpadmin account is mounted from a remote user home directory service. This means the HDB master and all segments share the same gpadmin .bash_profile and .bashrc files.

The customer set the environment variable PGHOST to the hostname of the hawqmaster, and this setting is picked up by default whenever gpadmin's user environment is loaded, on the master and on every segment host.

During the startup process, gpsegmentstart.py is invoked on each segment server and calls pg_ctl as the gpadmin user. pg_ctl starts the segment instance and then calls the internal function test_postmaster_connection(), which uses libpq to connect to the segment instance and verify that it is up. test_postmaster_connection() does not explicitly set the host in the libpq call, so libpq takes it from PGHOST in the environment or, if PGHOST is not set, defaults to localhost. Because PGHOST was set in the gpadmin environment and pointed to the hawqmaster rather than the local segment host, the connection check targets the wrong host, test_postmaster_connection() returns false, and the segment start fails.
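
The lookup order described above can be illustrated with a small Python sketch. This is a simplified model for illustration only, not libpq or pg_ctl code: the function name resolve_host and the constant LOCAL_DEFAULT are hypothetical, and "localhost" stands in for the local default as this article describes it.

```python
import os

# Assumption: stands in for the local default this article describes as "localhost".
LOCAL_DEFAULT = "localhost"


def resolve_host(explicit_host=None, env=None):
    """Hypothetical model of the host lookup order described in this article.

    1. A host passed explicitly by the caller wins.
    2. Otherwise PGHOST from the environment is used.
    3. Otherwise the local default applies.
    """
    env = os.environ if env is None else env
    if explicit_host:
        return explicit_host
    return env.get("PGHOST") or LOCAL_DEFAULT


if __name__ == "__main__":
    # The connection check passes no explicit host, so a PGHOST inherited
    # from the shared gpadmin profile decides where the check connects.
    print(resolve_host(env={"PGHOST": "hawqmaster"}))
    # With PGHOST removed, the check falls back to the local host.
    print(resolve_host(env={}))
```

On a segment host, the first call models the failing case: the check is directed at hawqmaster instead of the segment instance that was just started.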

Fix

Remove the incorrect PGHOST setting from the gpadmin environment so that pg_ctl automatically uses localhost as the host for the libpq connection.
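
To locate the setting, a short script along the following lines may help. It is a minimal sketch, assuming PGHOST is assigned with a plain `PGHOST=...` or `export PGHOST=...` line in the gpadmin .bash_profile or .bashrc. The helper name find_pghost_lines is hypothetical and is not part of HDB. The script does not detect values set indirectly, for example through a sourced file.

```python
import re
from pathlib import Path

# Matches lines such as "PGHOST=hawqmaster" or "export PGHOST=hawqmaster".
# Commented-out lines are ignored because the pattern is anchored at line start.
PGHOST_ASSIGNMENT = re.compile(r"^\s*(?:export\s+)?PGHOST=")


def find_pghost_lines(paths):
    """Return (path, line_number, line) for each PGHOST assignment found."""
    hits = []
    for path in map(Path, paths):
        if not path.is_file():
            continue  # skip profile files that do not exist
        lines = path.read_text(errors="replace").splitlines()
        for number, line in enumerate(lines, start=1):
            if PGHOST_ASSIGNMENT.match(line):
                hits.append((str(path), number, line.strip()))
    return hits


if __name__ == "__main__":
    # Run as the gpadmin user so Path.home() is the gpadmin home directory.
    home = Path.home()
    for path, number, line in find_pghost_lines(
        [home / ".bash_profile", home / ".bashrc"]
    ):
        print(f"{path}:{number}: {line}")
```

Because the home directory is shared, removing the reported lines once applies to the master and all segments. Any gpadmin session that is already open keeps the old value until PGHOST is unset or the session is restarted.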
