Pivotal Knowledge Base

Troubleshoot Data not Returned by Greenplum Command Center

Environment

Pivotal Greenplum Database (GPDB) all versions

Introduction

Greenplum Command Center is not populating data from the cluster or from a particular segment host. Several components can be checked, and several logs can be reviewed, to identify the cause.

This document walks through the basic steps for determining why Command Center is not populating data.

Description

Scenario 1: Command Center is not showing any data.

This indicates a problem on the master host, where Command Center runs.

-- Check whether the Command Center web service (lighttpd) is running

[gpadmin@mdw faisal]$ ps -ef | grep light | grep -v grep
gpadmin   8692     1  0 07:01 ?        00:00:00 /usr/local/greenplum-cc-web-1.2.2.1-build-71/bin/lighttpd -f /usr/local/greenplum-cc-web-1.2.2.1-build-71/instances/cc_4331/conf/lighttpd.conf -m /usr/local/greenplum-cc-web-1.2.2.1-build-71/lib

If it is not running, restart the UI with:

gpcmdr --start <cc_instance_name>

If you don't know the instance name, run:

gpcmdr --status

This lists all the Command Center instances in the cluster.
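The process check above can be wrapped in a small helper. This is only a sketch: the function name cc_web_status is hypothetical, and it assumes the lighttpd binary lives under a path containing "greenplum-cc-web", as in the sample output above.

```shell
#!/bin/sh
# Hypothetical helper (not part of Command Center): reads `ps -ef` output on
# stdin and reports whether a Command Center lighttpd process is present.
# Assumption: the install path contains "greenplum-cc-web", as in the sample.
cc_web_status() {
    if grep -v grep | grep -q 'greenplum-cc-web.*lighttpd'; then
        echo "RUNNING"
    else
        echo "NOT RUNNING"
    fi
}

# Usage on the master host:
#   ps -ef | cc_web_status
```

If the helper reports NOT RUNNING, start the instance with gpcmdr --start as described above.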

-- Check the status of the command center UI

[gpadmin@mdw faisal]$ gpcmdr --status cc_4331
Greenplum Command Center UI for instance 'cc_4331' - [RUNNING on PORT:  43311]

If it is down, start the Command Center UI with gpcmdr --start, as described above.

-- Check whether the gpmmon and gpsmon processes are running on the master

[gpadmin@mdw faisal]$ ps -ef | grep perfmon | grep -v grep
gpadmin   2242  2230  0 06:59 ?        00:00:00 /usr/local/GP-4.3.3.1/bin/gpmmon -D /data/master/fai_4331-1/gpperfmon/conf/gpperfmon.conf -p 4331
gpadmin   2603     1  0 06:59 ?        00:00:00 /usr/local/GP-4.3.3.1/bin/gpsmon -m 0 -l /data/master/fai_4331-1/gpperfmon/logs -v 0 8888

If either of the above processes is not running, check the logs in $MASTER_DATA_DIRECTORY/gpperfmon/logs. Each process writes to a file named <process name>_<date>.log. Find the reason for the failure in that file and troubleshoot the problem based on the error.

-- Finally, if the current gpsmon/gpmmon logs show no problems, check the master log for entries around the time the gpperfmon log entries were written

-- If gpmmon/gpsmon are down, a database restart is required to bring them back up.
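As a rough aid for the log review described above, the following sketch picks the newest log for a process and prints lines that look like problems. The function names and the error/fatal/fail search pattern are assumptions for illustration; the actual message wording in the gpperfmon logs may differ.

```shell
#!/bin/sh
# Print the most recently modified file in directory $1 whose name starts
# with $2 (for example gpmmon or gpsmon).
latest_log() {
    ls -t "$1"/"$2"* 2>/dev/null | head -n 1
}

# Show up to 20 recent lines from that log that match a guessed set of
# problem keywords. The keyword list is an assumption, not a documented format.
show_recent_errors() {
    log=$(latest_log "$1" "$2")
    [ -n "$log" ] || { echo "no $2 log found in $1"; return 1; }
    echo "== $log"
    grep -i -E 'error|fatal|fail' "$log" | tail -n 20
}

# Usage on the master host:
#   show_recent_errors "$MASTER_DATA_DIRECTORY/gpperfmon/logs" gpmmon
#   show_recent_errors "$MASTER_DATA_DIRECTORY/gpperfmon/logs" gpsmon
```

If nothing is printed below the file name, read the whole log around the time of the failure rather than relying on the keyword filter.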

Scenario 2: Data for one of the segment servers is not being populated in the Command Center UI.

-- Check that the gpsmon agent is running on all hosts, or at least on the problem host. There should be one gpsmon process for each segment server in the cluster.

[gpadmin@mdw faisal]$ gpssh -f hosts
=> ps -ef | grep gpsmon | grep -v grep
[sdw4] gpadmin  15323     1  0 07:03 ?        00:00:00 /usr/local/GP-4.3.3.1/bin/gpsmon -m 0 -l /data2/mirror/fai_43311/gpperfmon -v 0 8888
[sdw3] gpadmin    824     1  0 09:59 ?        00:00:00 /usr/local/GP-4.3.3.1/bin/gpsmon -m 0 -l /data2/mirror/fai_43313/gpperfmon -v 0 8888
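With many hosts it is easy to miss the one that is absent from the output. The sketch below compares the host file against the gpssh output and prints hosts with no gpsmon process. The function name is hypothetical, and it assumes each gpssh output line starts with the host name in square brackets, as in the sample above.

```shell
#!/bin/sh
# Hypothetical helper: $1 is the host file (one host name per line); stdin is
# gpssh output such as "[sdw3] gpadmin ... /usr/local/GP-4.3.3.1/bin/gpsmon ...".
# Prints every host in the file that has no gpsmon line in the output.
hosts_missing_gpsmon() {
    # Collect the host names that reported a gpsmon binary.
    seen=$(grep '/gpsmon ' | sed -n 's/^\[ *\([^]]*\)\].*/\1/p' | sort -u)
    while read -r host; do
        [ -n "$host" ] || continue
        printf '%s\n' "$seen" | grep -qxF "$host" || echo "$host"
    done < "$1"
}

# Usage on the master host (passing the command to gpssh as an argument):
#   gpssh -f hosts 'ps -ef | grep gpsmon | grep -v grep' | hosts_missing_gpsmon hosts
```

Any host name printed is a candidate for the log checks that follow.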

-- If the gpsmon process is not running on a segment server, check the gpperfmon logs to find out what happened to the process and why it failed.

The gpperfmon directory is located in the data directory of one of the segments running on that segment server.

Use the short shell command below to identify the segment that currently holds the gpperfmon logs. A value of 1 marks the segment whose data directory contains them.

[gpadmin@sdw3 ~]$ ps -ef | grep silent | grep -v grep | awk '{print "echo ---------------- Checking : " $10 " ------------------- ;" "find " $10 " -name gpperfmon | wc -l"}' | sh
----------------/data1/primary/fai_43311-------------------
0
----------------/data1/primary/fai_43310-------------------
0
----------------/data2/mirror/fai_43312-------------------
0
----------------/data2/mirror/fai_43313-------------------
1
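The same lookup can be written as a small function that is easier to read and reuse. This is a sketch under one assumption taken from the example paths in this article: the gpperfmon directory sits directly under the segment data directory. The function name is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: given segment data directories as arguments, print the
# gpperfmon directory of each one that has it. Unlike the find-based one-liner,
# this only looks directly under each data directory (an assumption).
find_gpperfmon_dirs() {
    for datadir in "$@"; do
        if [ -d "$datadir/gpperfmon" ]; then
            echo "$datadir/gpperfmon"
        fi
    done
}

# Usage on a segment host, taking the data directories from the same ps
# column ($10) that the one-liner above relies on:
#   find_gpperfmon_dirs $(ps -ef | grep silent | grep -v grep | awk '{print $10}')
```

The function prints the full path, so its output can be passed straight to cd or ls.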

-- Navigate to that location, /data2/mirror/fai_43313/gpperfmon in the example above, and review the gpsmon logs for the failure. If they do not contain enough information, check the segment logs in /data2/mirror/fai_43313/pg_log for entries around the time the gpsmon log entry was written

-- As in Scenario 1, a database restart is required to bring the gpsmon process back up.

Scenario 3: All gpsmon processes are running, but data is not populating or issues are indicated

-- Check the gpperfmon tables, such as health_now, to find out why the data is not populating

-- On a DCA, gpsmon depends on healthmond. Check /var/log/messages to see whether healthmon is failing.

-- SNMP might be failing. One such example is described in the documents indicated here.
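A quick way to pull healthmon entries out of the system log is sketched below. The function name and the default of 20 lines are assumptions; the case-insensitive match on "healthmon" also catches "healthmond".

```shell
#!/bin/sh
# Hypothetical helper: print the last N lines mentioning healthmon from a
# syslog file. $1 defaults to /var/log/messages, $2 (line count) to 20.
healthmon_messages() {
    grep -i 'healthmon' "${1:-/var/log/messages}" | tail -n "${2:-20}"
}

# Usage (reading /var/log/messages typically requires root privileges):
#   healthmon_messages
#   healthmon_messages /var/log/messages 50
```

Review the printed lines for repeated failures or restarts around the time the data stopped populating.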

 
