Pivotal Greenplum Database (GPDB) 4.2 and 4.3
GPDB 4.2.x to 4.3.x upgrade involves converting all Append Only (AO) tables to Append Optimized tables. This includes creating some subobjects for each AO table. This process takes time depending on the number of AO objects.
The documented time estimation for this process is described here: http://gpdb.docs.pivotal.io/tools/pdf/estimate_42_to_43_migrate_time.pdf
The actual function that does the estimating is linked here: http://gpdb.docs.pivotal.io/tools/estimate_42_to_43_migrate_time.zip
This function needs to be installed and run individually on every database and the total sum is the total upgrade estimate. See here for details about the function: https://support.pivotal.io/hc/en-us/articles/205511208-Estimating-AO-Object-Migration-Time-from-GPDB-4-2-x-to-GPDB-4-3-x
- The estimate provided is pretty accurate if the following details are met:
- The number of primary segments is 600 or lower.
- The catalog is VACUUM and ANALYZED before the upgrade (the catalog bloat does not affect the estimate of the function but affects the actual upgrade timeline)
- The estimate provided is not very accurate for clusters with more than 600 segments. In these situations, the timeline increases exponentially. Therefore in those cases please use the alternative formula for estimation
An alternative formula for the large clusters:
Estimated time in seconds = N * t * O + F
- N -> number of AO tables on the cluster
- T -> average time for create table operation (simple heap table) in seconds
- O -> number of operations (constant = 7)
- F -> fixed cost of 1 hour (3600 seconds)
Number of AO tables: 15289
Time per create-table: 1.64 sec
Num of operations per AO Table: 7
Fixed cost: 1 hour
15289 * 1.64 * 7 + 3600= 179117.72 sec = 49.75 hours (about 50 hours)
Run the following Pre-upgrade checks:
- Upgrade latest DCA ISO before upgrading the database. If it is not possible to upgrade DCA ISO, at least, apply the hotfix.
- Run a gpcheckcat and ensure there are no catalog issues prior to an upgrade.
- Take a full backup of the cluster.
- Run dcacheck prior to the upgrade to ensure there are no hardware issues.
- It is recommended to run vacuum analyze on the catalog prior to the upgrade.
- Also, update pg_hba.conf file to not allow any other connections other than gpadmin.
Hotfix for DCA V2: https://support.emc.com/
Hotfix for DCA V1: https://support.emc.com/
During upgrade after the checks gets completed the database will be stopped and a check (stat) and copy process will be done on for $MASTER_DATA_DIRECTORY. During "stat" no log messages will be there and it seems like the process is in a hung situation. You can check the "strace" on the "python" process running gpmigrator and also check the "$MASTER_DATA_DIRECTORY /gpmigratorstatecheck" file to get the current state.
The strace will show the stat on files. If there are large numbers of files in $MASTER_DATA_DIRECTORY then stat will take time and add on to the upgrade time.
Also, if the customer will use ORCA its recommended to run analyze on the root partitions post upgrade.