Pivotal Knowledge Base

GPPERFMON database's log_alert_history table is empty

Environment

Pivotal Greenplum: 4.3.x

OS: RHEL 6.x

Symptom

The user has never truncated data in the gpperfmon database, yet the log_alert_history table contains no records:

gpperfmon=# select * from log_alert_history; 
logtime | loguser | logdatabase | logpid | logthread | loghost | logport | logsessiontime | logtransaction | logsession | logcmdcount | logsegment | logslice | logdistxact | loglocalxact | logsubxact | logseverity | logstate | logmessa 
---------+---------+-------------+--------+-----------+---------+---------+----------------+----------------+------------+-------------+------------+----------+-------------+--------------+------------+-------------+----------+--------- 
(0 rows)

Time: 7214.505 ms

To diagnose this, it helps to understand how the log_alert_history table is populated.

The Greenplum Database system logger writes alert logs in the $MASTER_DATA_DIRECTORY/gpperfmon/logs directory.

The agent process (gpmmon) performs the following steps to consolidate log files and load them into the gpperfmon database: 

  1. Gather all of the gpdb-alert-* files in the logs directory (except the latest, which the syslogger has open and is writing to) into a single file, alert_log_stage
  2. Load the alert_log_stage file into the log_alert_history table in the gpperfmon database
  3. Truncate the alert_log_stage file
  4. Remove all of the gpdb-alert-* files, except the latest
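
As a quick way to see which files the steps above would pick up, the following is a minimal sketch rather than the actual gpmmon logic. The function name pending_alert_files is hypothetical, and the sketch assumes that the file names embed a timestamp (as in gpdb-alert-2017-04-25_000000.csv), so that sorting by name also sorts by time. It lists every gpdb-alert-* file in a directory except the newest one.

```shell
# Hypothetical helper (not part of Greenplum): list the gpdb-alert-* files
# in a logs directory that are eligible for consolidation, i.e. every file
# except the newest one, which the logger still has open.
pending_alert_files() {
    logs_dir="$1"
    for f in "$logs_dir"/gpdb-alert-*; do
        # Skip the unexpanded pattern when no file matches.
        if [ -e "$f" ]; then
            printf '%s\n' "$f"
        fi
    done | LC_ALL=C sort | sed '$d'   # sort by name, then drop the last (newest) entry
}

# Example call, assuming MASTER_DATA_DIRECTORY is set in the environment:
# pending_alert_files "$MASTER_DATA_DIRECTORY/gpperfmon/logs"
```

If the directory contains no gpdb-alert-* files at all, the function prints nothing, which by itself suggests that the logger process is not writing alert logs.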

Thus, the logger process should be writing to a gpdb-alert-* file. Running lsof against the logger process on the master should show that file open:

[gpadmin@gpdb-sandbox logs]$ ps -ef | grep 13388
gpadmin  13388 13387  0 Apr16 ?        00:00:21 postgres: port  5432, master logger process
gpadmin  17808 17732  0 09:59 pts/0    00:00:00 grep 13388

[gpadmin@gpdb-sandbox logs]$ lsof -p 13388
COMMAND    PID    USER   FD   TYPE DEVICE SIZE/OFF   NODE NAME
postgres 13388 gpadmin  cwd    DIR  253,0     4096 521220 /gpdata/master/gpseg-1
...
...
postgres 13388 gpadmin    1w   CHR    1,3      0t0   3845 /dev/null
postgres 13388 gpadmin    2w   CHR    1,3      0t0   3845 /dev/null
postgres 13388 gpadmin    3w   REG  253,0   164222 531527 /gpdata/master/gpseg-1/gpperfmon/logs/gpdb-alert-2017-04-25_000000.csv   <---- On an affected cluster, the process may not have this .csv file open
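
The check above can be scripted. The following is a small sketch, not an official tool: the function name has_open_alert_log is hypothetical, and it only looks for the gpperfmon/logs/gpdb-alert- path fragment in lsof output supplied on standard input.

```shell
# Hypothetical helper: read `lsof -p <pid>` output on stdin and succeed
# (exit status 0) if any line refers to a gpdb-alert-* file under
# gpperfmon/logs; fail (non-zero status) otherwise.
has_open_alert_log() {
    grep -q 'gpperfmon/logs/gpdb-alert-'
}

# Example call, using the logger PID from the ps output above (13388):
# if lsof -p 13388 | has_open_alert_log; then
#     echo "logger has an alert log open"
# else
#     echo "logger is NOT writing to a gpdb-alert-* file"
# fi
```

The PID has to be replaced with that of the "master logger process" on the cluster being checked.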

Resolution

In situations like the one above, the usual cause is a configuration problem that prevents the logger process from writing to the gpdb-alert-* files. Verify that your configuration matches the output below, correct it if it does not, and then restart Greenplum (for example, with gpstop -r). This should resolve the issue:

[gpadmin@hitw-gc-uat-dm-mdw-1 ~]$ gpconfig -s gp_enable_gpperfmon
Values on all segments are consistent
GUC          : gp_enable_gpperfmon
Master  value: on
Segment value: on

[gpadmin@gpdb-sandbox gpseg-1]$ gpconfig -s gpperfmon_log_alert_level
Values on all segments are consistent
GUC          : gpperfmon_log_alert_level
Master  value: warning
Segment value: warning

Note that in some versions of Greenplum, gpconfig -s may report gp_enable_gpperfmon as off. As long as postgresql.conf sets the value to on, alert logging works as expected.
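
To read the value straight from postgresql.conf, something like the following sketch can be used. The function name guc_from_conf is hypothetical, and the sketch assumes the setting is written in the name = value form. It ignores commented-out lines and prints the last active value, since a later entry in the file overrides an earlier one.

```shell
# Hypothetical helper: print the last uncommented value of a setting in a
# postgresql.conf-style file. Assumes the "name = value" form; any quotes
# around the value are printed as they appear in the file.
guc_from_conf() {
    conf_file="$1"
    guc="$2"
    sed -n "s/^[[:space:]]*${guc}[[:space:]]*=[[:space:]]*\([^#[:space:]]*\).*/\1/p" "$conf_file" | tail -n 1
}

# Example call, assuming MASTER_DATA_DIRECTORY is set in the environment:
# guc_from_conf "$MASTER_DATA_DIRECTORY/postgresql.conf" gp_enable_gpperfmon
```

If the function prints nothing, the setting is either absent or commented out in that file.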
