- PHD 188.8.131.52
/var/lib/pgsql disk space usage is 10GB and growing on the Pivotal Command Center node
/var/lib/psql - 11 G
PCC will continuously collect hadoop and system metrics from the installed clusters. Depending on the size of the cluster regular maintenance may be necessary keep the database size under control
- Connect to the database and identify what tables are using the most space
[gpadmin@pccadmin ~]$ psql -U postgres gphdmgr -p 10432 gphdmgr=# SELECT tablename ,pg_size_pretty(pg_total_relation_size(tablename::text)) AS size FROM pg_tables WHERE schemaname = 'public' order by 1 DESC; tablename | size ----------------------------------+------------ data_node_report | 10 GB system_history_agg_all_hosts_4h | 240 kB system_history_agg_all_hosts_30m | 1560 kB system_history_agg_all_hosts_1m | 2656 kB system_history_agg_all_hosts_1h | 816 kB system_history_agg_all_hosts_1d | 88 kB system_history | 14 MB stack_properties | 32 kB stack | 32 kB
- In this case data_node_report table is 10GB and is used for collecting HDFS usage statistics. This table is safe to truncate and will not impact icm or PCC operations. This will simply purge the historical statistics from the database. The following list of tables are safe execute steps 3-6. Any table not in this list should never be modified, deleted, or changed without consulting pivotal support.
app Yarn Application details app_attempt Yarn Container details for yarn application data_node_report HDFS data usage statistics hbase_metrics Hbase Operation metrics job Mapreduce History stats for completed applications job_attempt Mapreduce Containers stats job_tracker_metrics Resource Manager statistics namenode_metrics Namenode RPC and load statistics queue_metrics Yarn Scheduler statistics system_history CPU, Mem, IO stats for each host
NOTE: Please take caution when proceeding through steps 3 and 6. Truncating the wrong table could result in disabling icm_client operations. In addition truncating the tables in the above list will result in lost historical cluster statistics
- First backup the database to a safe location incase any mistakes are made during this maintenance activity
[root@pccadmin ~]# service commander stop Stopping PHD Manager Services... nodeagent is not running Stopping Jetty [ OK ] Stopping httpd: [ OK ] Stopping Pivotal Command Center HTTPS [ OK ] Stopping Pivotal Command Center Background Worker [ OK ] [root@pccadmin ~]# service postgresql stop Stopping postgresql service: [ OK ] [root@pccadmin ~]# tar -cf /backup/pgsql.backup.12-01-2014.tar /var/lib/pgsql
- After backup is completed start the database and truncate the large tables
[root@pccadmin ~]# service postgresql start Starting postgresql service: [ OK ] [root@pccadmin ~]# psql -U postgres -p 10432 gphdmgr gphdmgr=# TRUNCATE TABLE data_node_report; TRUNCATE TABLE
- Vacuum the database. Please DO NOT CANCEL this operation once triggered.
gphdmgr=# VACUUM FULL; VACUUM
- Remember to REINDEX the database to clean up related index tables and any toast tables
gphdmgr=# REINDEX DATABASE gphdmgr; NOTICE: table "pg_class" was reindexed NOTICE: table "pg_type" was reindexed . . REINDEX gphdmgr=#
- Confirm database usage is decreased
gphdmgr=# SELECT 'data_node_report' ,pg_size_pretty(pg_total_relation_size('data_node_report')) AS size; ?column? | size ------------------+------- data_node_report | 16 kB
- Start command center services
[root@pccadmin ~]# service commander start Starting PHD Manager Services... Starting httpd: [ OK ] Preparing nodeagent: [ OK ] Starting nodeagent: [ OK ] Starting Jetty [ OK ] Waiting for Jetty to mount applications [ OK ] Starting Pivotal Command Center HTTPS [ OK ] Starting Pivotal Command Center Background Worker [ OK ]
- During this procedure we disabled postgresql and that might have caused the nmon process to shutdown on the pivotal command center node. To ensure nmon is still running run these steps from the pivotal command center node
[root@pccadmin ~]# service nmon start [root@pccadmin ~]# massh <hostfile> verbose "service nmon restart" [root@pccadmin ~]# massh <hostfile> verbose "service nmon status"