Pivotal Knowledge Base


Workload Manager (WLM) Installation Failure Checklist


  • Pivotal Greenplum Database (GPDB) 4.3.x
  • Operating System: Red Hat Enterprise Linux 5.x, 6.x, and SUSE 11
  • Workload Manager (WLM) 1.7.0, 1.7.1, and 1.7.2


This article describes what to look for when WLM installation fails.


Installation Issues

  • Leftover scratch directory from a previous installation attempt
    • Error message:
getopt: unrecognized option `--func-file=<install-dir>/gp-wlm-data/.scratch/installer/installer.functions'
    • Solution:
      • Make a copy of the scratch directory before removing it (see directory below).
      • Delete the scratch directory before running the installation again.
    • Can also be found at:
  • Segments unable to SSH to the master's $HOSTNAME
    • Error message:
Services took too long to come online; cluster is not healthy

    • Solution:
      • Every host in gp_segment_configuration must be able to SSH to the master's $HOSTNAME.
      • Running ssh `hostname -s` and ssh `hostname -f` must work on every host in gp_segment_configuration, connecting to itself (including MDW and SMDW).
  • A segment has a different hostname than the one recorded in gp_segment_configuration
    • Error message:
Some hosts could not connect; cluster is not healthy
    • Solution (2 options):
      • Option 1: Change the node's hostname to match gp_segment_configuration.
      • Option 2: Run the steps below on the failed node.
cd /home/gpadmin/gp-wlm/etc/rabbitmq/current
ln -sf smdw/rabbitmq.config rabbitmq.config
  • The cluster has more than 40 nodes and the master's hostname differs from its entry in gp_segment_configuration
    • Error message:
Unable to locate host information for host <hostname>
    • Solution:
      • Make the master's hostname and its entry in gp_segment_configuration match.
  • Previous uninstall not clean
    • Error message (this error does not always indicate an unclean uninstall):
Unable to find hostname in existing bootstrap database
    • Solution:
      • Clean up the previous install and confirm that no existing services are running by following the steps below.
gpssh -f hostfile (make sure hostfile lists both mdw and smdw)
ps -ef | grep wlm (run on each host)
killall <process-name> (for each process returned above)
Back up <install-dir>/gp-wlm-data/<timestamp>/ on mdw, the failed node, and one successful node
cd <wlm-install-dir>
rm -rf gp-wlm-data
rm -rf gp-wlm
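
Several of the issues above come down to a host's name not matching what is recorded in gp_segment_configuration. The following is a minimal sketch of a check for that, not part of WLM itself. It assumes a file named expected_hosts.txt (a hypothetical name) containing one hostname per line, exported from gp_segment_configuration on the master. The function names and the psql invocation in the comment are illustrative assumptions as well.

```shell
#!/usr/bin/env bash
# Sketch: flag a host whose reported names do not appear in a list of
# hostnames exported from gp_segment_configuration.
# Assumption: expected_hosts.txt (hypothetical name) holds one hostname per
# line, for example produced on the master with something like:
#   psql -At -c "SELECT DISTINCT hostname FROM gp_segment_configuration" postgres > expected_hosts.txt

# host_in_list NAME FILE: succeed only if NAME equals a whole line of FILE.
host_in_list() {
    grep -Fxq -- "$1" "$2"
}

# check_local_hostname FILE: compare this host's short and full names with FILE.
check_local_hostname() {
    local list="$1" short full
    short="$(hostname -s)"
    full="$(hostname -f)"
    if host_in_list "$short" "$list" || host_in_list "$full" "$list"; then
        echo "OK: $short / $full found in $list"
    else
        echo "MISMATCH: neither $short nor $full found in $list"
        return 1
    fi
}
```

After sourcing the file on a host, running check_local_hostname expected_hosts.txt reports whether that host's name is known to the cluster catalog. Matching whole lines (grep -Fx) avoids a false match such as sdw1 against sdw10.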

RCA Log Collection (collect the items below from the master, the failed nodes, and any one node that did not fail)

hostname -s
hostname -f
Gather_cluster_logs (may not always work)
gp_segment_configuration (Only on master)
<install_dir>/gp-wlm/sbin/rabbitmqctl cluster_status
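
The per-host items above can be gathered in one pass. The sketch below is one possible way to do it, under stated assumptions: the function name, the output file names, and the directory layout are all hypothetical, and the first argument stands in for <install_dir>. It does not cover Gather_cluster_logs or the gp_segment_configuration dump, which is taken on the master only.

```shell
#!/usr/bin/env bash
# Sketch: save the per-host RCA items into one directory.
# Assumptions: collect_rca and the output file names are hypothetical;
# the first argument stands in for <install_dir>.

collect_rca() {
    local install_dir="$1" out_dir="$2"
    mkdir -p "$out_dir" || return 1

    # Each command's output (and any error text) goes to its own file.
    # "|| true" keeps one failing command from stopping the collection.
    hostname -s > "$out_dir/hostname_short.txt" 2>&1 || true
    hostname -f > "$out_dir/hostname_full.txt" 2>&1 || true
    "$install_dir/gp-wlm/sbin/rabbitmqctl" cluster_status \
        > "$out_dir/rabbitmq_cluster_status.txt" 2>&1 || true

    return 0
}
```

For example, collect_rca <install_dir> /tmp/wlm-rca-$(hostname -s) could be run on the master, each failed node, and one healthy node. Because error text is captured alongside normal output, a missing rabbitmqctl binary is itself recorded as evidence rather than silently skipped.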


  • Scott Gai

    For the error "Unable to locate host information for host <hostname>", the "hostname -f" output on each host (master and segments) should match the short name in the hostname field of gp_segment_configuration.
    Besides updating gp_segment_configuration, an alternative solution is to update /etc/hosts so that the two match. The first name in each /etc/hosts entry is what "hostname -f" outputs. For example:

    [gpadmin@f12agpdb01 ~]$ hostname -s
    [gpadmin@f12agpdb01 ~]$ hostname -f
    [gpadmin@f12agpdb01 ~]$ hostname --long

    [gpadmin@f12agpdb01 ~]$ grep mdw-1 /etc/hosts
    mdw-1 mdw f12agpdb01.umc.com F12AGPDB01


    Refer to ticket #55284

    Edited by Scott Gai
  • Scott Gai

    Encountered another issue with WLM installation and summarized it in a new KB article.

    Just FYI
