Pivotal Knowledge Base

How to remove the Pivotal HDB (HDB) standby master while keeping the HDB cluster controllable with Ambari

Environment

  • PHD 3.x
  • Pivotal HDB (HDB) 1.3.x
  • Ambari 1.7.x

Problem

Currently, the Ambari web UI provides no way to remove the HDB standby master once it has been initialized. When the standby master must be removed or disabled, the only option is to run "gpinitsystem -r" manually on the master.

However, removing the standby master with "gpinitsystem -r" causes another problem: the HDB cluster can no longer be operated (stopped/started) from the Ambari web UI. The error messages look as follows:

/data/hawq/master/gpseg-1/postmaster.opts is not found, kind validate if master data directory exists along with postmaster.opts file on both hawq master and standby servers.
Please execute database start operation from active hawq master until its fixed, as without postmaster.opts available on both master servers active master cannot be identified.\nNote: postmaster.opts will be automatically created during hawq startup.

Cause

Ambari identifies the active master by comparing the postmaster.opts files on the master and the standby. If either file is missing, Ambari does not assume which master is active, in order to avoid data loss, and leaves it to the user to start/stop the cluster from the command line.

When the HDB standby master is removed by running "gpinitsystem -r" manually on the HDB master host, the master data directory on the standby master host, where the postmaster.opts file resides, is removed as well. Ambari is not aware that the standby master has been removed, so it still checks postmaster.opts to determine which master is active and, by design, fails.
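
The precondition described above can be pictured with a small shell sketch. It is only an illustration of the behaviour, not Ambari's actual code: the function name `can_identify_active_master` and its two arguments are hypothetical, and the sketch takes two local paths rather than looking at two hosts. It shows why a single missing postmaster.opts is enough to block start/stop from the web UI.

```shell
# Illustration only: mimics the "both files must exist" precondition.
# $1 = path to the master's postmaster.opts
# $2 = path to the standby's postmaster.opts
can_identify_active_master() {
    for opts in "$1" "$2"; do
        if [ ! -f "$opts" ]; then
            # Refuse to guess which master is active when a file is missing.
            echo "$opts is not found" >&2
            return 1
        fi
    done
    return 0
}
```

The function returns success only when both files exist, which is why restoring the file on the standby host is enough to make the check pass again.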

Solution

The standby master will be optional in the HDB 2.0 plugin for Ambari, so this issue will not occur from the HDB 2.0 plugin onwards.

Until that plugin is available, use the following workaround to keep the HDB cluster operable with Ambari.

1. Back up postmaster.opts on the standby master before running "gpinitsystem -r"

[gpadmin@hdm1 ~]$ cp /data/hawq/master/gpseg-1/postmaster.opts ~/

2. Recreate the master data directory on the standby master host after "gpinitsystem -r", if necessary

3. Copy postmaster.opts back to the master data directory and make it read-only

[gpadmin@hdm1 ~]$ cp ~/postmaster.opts /data/hawq/master/gpseg-1/
[gpadmin@hdm1 ~]$ chmod -w /data/hawq/master/gpseg-1/postmaster.opts

4. Try to stop/start the HDB cluster from the Ambari web UI with

"Ambari" -> "HDB" -> "HDB Master" -> "Start/Stop" pull down -> "OK"
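
The file-handling part of the workaround (steps 1 to 3) can be sketched as two small shell functions. This is a minimal sketch, not a Pivotal-supplied script: the function names `backup_opts` and `restore_opts` are hypothetical, and the default paths are taken from the example commands in the steps above. Both are meant to be run as gpadmin on the standby master host.

```shell
# Hypothetical helpers for steps 1-3; names and defaults are assumptions.
# $1 = master data directory, $2 = directory that holds the backup copy.

backup_opts() {
    data_dir="${1:-/data/hawq/master/gpseg-1}"
    backup_dir="${2:-$HOME}"
    # Step 1: save postmaster.opts before the standby is removed.
    cp "$data_dir/postmaster.opts" "$backup_dir/"
}

restore_opts() {
    data_dir="${1:-/data/hawq/master/gpseg-1}"
    backup_dir="${2:-$HOME}"
    # Step 2: recreate the master data directory if the removal deleted it.
    # -p makes this a no-op when the directory still exists.
    mkdir -p "$data_dir" || return 1
    # Step 3: put the file back and make it read-only.
    # -f lets cp replace a read-only copy left by an earlier run.
    cp -f "$backup_dir/postmaster.opts" "$data_dir/" || return 1
    chmod a-w "$data_dir/postmaster.opts"
}
```

Call `backup_opts` before removing the standby master and `restore_opts` afterwards. With no arguments, both use the paths shown in the steps above.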

Reference

Internal JIRA AMBR-513