Pivotal Knowledge Base

Upgrading Pivotal Greenplum (out-of-family): 4.2.x to 4.3.x

Environment

Product Version
Pivotal Greenplum (GPDB) 4.2.x and 4.3.x

Purpose

This document provides the steps to upgrade Pivotal Greenplum Database from version 4.2.x to 4.3.x.

Prerequisites and Planning

  1. Check upgrade time: This major upgrade adds append-optimized tables and involves many architectural changes. Most of the upgrade time is spent modifying append-only relation metadata. Before planning the upgrade, estimate the time needed for this activity. The steps at estimate_time can be used to calculate the upgrade time.
  2. Hardware check: If GPDB is on an appliance (that is, a DCA), check the hardware using dcacheck. For non-DCA systems, ask your System Administrator to verify that all disks are in a good state. As part of the major upgrade, all mirrors are marked down and the modifications are made on the primaries, so during the process the cluster runs on a single copy of the segment instances. The down mirrors are needed for rollback if the upgrade fails. If any server has disk issues, the RAID could become corrupted and the cluster might not be recoverable.
  3. Catalog check: Because this major upgrade modifies the catalogs, it is mandatory to check all databases for inconsistencies. You can follow the steps here. Follow that article step by step and check all the databases using the -A flag (without using -O).
  4. Binary availability: Ensure that the new binaries are available on the master server. You can download them from www.pivotal.io for your system type (DCA / Non-DCA).
  5. Free space: Before you begin the upgrade, make sure the master and all segments (data directories and filespaces) have at least 2GB of free space.
  6. Pre-upgrade check: Before upgrading your database, Pivotal recommends that you verify that your database is healthy by running gpmigrator_mirror --check-only (refer to the "Upgrade Execution" section below or to gpmigrator_mirror --help).
  7. Run VACUUM on all databases. Because the upgrade modifies the catalog, it runs faster on a vacuumed catalog.
  8. Optional but strongly recommended: Back up all databases in your Pivotal Greenplum Database system using gpcrondump. See the Pivotal Greenplum Database Administrator Guide for more information on how to back up using gpcrondump. Make sure to secure your backup files in a location outside of your Greenplum data directories.
  9. Important: If you are upgrading to 4.3.6.1 or above, gp_interconnect_type must be set to UDPIFC, because UDP is deprecated as of 4.3.6.1. Versions lower than 4.2.4.0 must first be upgraded to 4.2.4.0 or above, and UDPIFC must be set before the upgrade.
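
The free-space and version items above lend themselves to a scripted preflight check. The following is a minimal sketch under stated assumptions, not an official Pivotal tool: the helper names has_min_free_kb and version_ge are hypothetical, the 2GB figure and the version numbers are the ones quoted in this list, and the usage lines are only illustrations.

```shell
# Hypothetical preflight helpers; the names are illustrative and not part of Greenplum.

# Succeeds when the filesystem holding directory $1 has at least $2 KB available.
has_min_free_kb() (
    avail=$(df -Pk "$1" | awk 'NR == 2 { print $4 }')
    [ "$avail" -ge "$2" ]
)

# Succeeds when dotted numeric version $1 is >= $2 (compares up to four fields).
version_ge() (
    a=$1 b=$2
    for i in 1 2 3 4; do
        x=${a%%.*} y=${b%%.*}
        [ "$x" -gt "$y" ] && exit 0
        [ "$x" -lt "$y" ] && exit 1
        # Drop the leading field; a version with no fields left counts as 0.
        case $a in *.*) a=${a#*.} ;; *) a=0 ;; esac
        case $b in *.*) b=${b#*.} ;; *) b=0 ;; esac
    done
    exit 0
)

# Example usage (repeat the space check on every host and data directory):
#   has_min_free_kb "$MASTER_DATA_DIRECTORY" $((2 * 1024 * 1024)) || echo "less than 2GB free"
#   version_ge 4.2.6.3 4.2.4.0 || echo "upgrade to 4.2.4.0 or above first"
#   version_ge 4.3.6.1 4.3.6.1 && gpconfig -s gp_interconnect_type
```

The last usage line only displays the current gp_interconnect_type value; changing it is left to the documented gpconfig procedure for your version.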

Checklist

This checklist provides a quick overview of all the steps required for an upgrade from 4.2.x.x to 4.3.x. Detailed upgrade instructions are provided further below.

Pre-Upgrade Preparation (on your current system)

With the 4.2.x.x system up and available, log on to your master host as the gpadmin user (Pivotal Greenplum superuser).

  1. Check the hardware
  2. Check the Catalog
  3. Download the binary
  4. Optional step: Run VACUUM on all databases.
  5. Optional step: Remove old server log files from pg_log in your master and segment data directories.
  6. Check and recover any failed segments (gpstate, gprecoverseg).
  7. Copy or preserve any additional folders or files (such as the backup folders).
  8. Inform all database users of the upgrade and lockout time frame.
  9. Make sure the following GUCs are set to default:  
         gp_autostats_mode [Default : ON_NO_STATS]
         gp_autostats_on_change_threshold [Default: 2147483647]
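
The GUC check in item 9 can be scripted. The sketch below is illustrative only: guc_is_default is a hypothetical helper, the defaults are the two listed above, and the usage lines assume that psql can connect to the template1 database on the master as gpadmin. SHOW reports the value for that session, so database-level or role-level overrides elsewhere would need a separate check.

```shell
# Hypothetical helper: succeeds when value $2 matches the default this checklist
# gives for GUC $1. The comparison ignores letter case.
guc_is_default() (
    case $1 in
        gp_autostats_mode) want=on_no_stats ;;
        gp_autostats_on_change_threshold) want=2147483647 ;;
        *) exit 2 ;;   # GUC not covered by this checklist
    esac
    got=$(printf '%s' "$2" | tr '[:upper:]' '[:lower:]')
    [ "$got" = "$want" ]
)

# Example usage on the master as gpadmin:
#   for guc in gp_autostats_mode gp_autostats_on_change_threshold; do
#       guc_is_default "$guc" "$(psql -d template1 -Atc "SHOW $guc")" \
#           || echo "$guc is not at its default"
#   done
```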

Upgrade Execution

The system will be locked down to all user activity during the upgrade process. Back up your current database.

  1. Remove the standby master (gpinitstandby -r).
  2. Do a clean shutdown of your current system (gpstop).
  3. Install the new binary.
  4. Update your environment to source the new Pivotal Greenplum Database 4.3.x installation.
  5. Run the upgrade utility (gpmigrator_mirror if you have mirrors, gpmigrator if you do not).
  6. After the upgrade process finishes successfully, the 4.3.x system will be up and running.

Post-Upgrade (on your 4.3 system)

The 4.3.x.x system is up. Ensure the following steps are executed.

  1. Reinitialize your standby master host (gpinitstandby).
  2. Upgrade gpfdist on all of your ETL hosts.
  3. Rebuild any custom modules against your 4.3.x installation.
  4. Download and install all Pivotal Greenplum Database extensions.
  5. Optional step: Install the latest Command Center Console and update your environment to point to the latest Command Center binaries.
  6. Inform all database users of the completed upgrade.

Upgrade Execution

Note: During the upgrade, all client connections to the master are locked out. Inform all database users of the upgrade and lockout time frame. From this point onward, users should not be allowed on the system until the upgrade is complete. Also, if you are rerunning the upgrade after a failure during the catalog upgrade step, move the gpmigrator folder and the mirror_upgrade_state file out of $MASTER_DATA_DIRECTORY before starting the next upgrade.
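
For the rerun case described in the note above, the cleanup could look like the following sketch. The function name stash_upgrade_state and the destination path are hypothetical; only the gpmigrator folder and mirror_upgrade_state file names come from this article.

```shell
# Hypothetical helper: moves leftover upgrade state out of the master data
# directory $1 into the directory $2 (created if missing).
stash_upgrade_state() (
    mdd=$1 dest=$2
    mkdir -p "$dest" || exit 1
    for item in gpmigrator mirror_upgrade_state; do
        # Move each item only if a previous failed run left it behind.
        if [ -e "$mdd/$item" ]; then
            mv "$mdd/$item" "$dest/" || exit 1
        fi
    done
)

# Example usage as gpadmin (the destination is a placeholder):
#   stash_upgrade_state "$MASTER_DATA_DIRECTORY" "/home/gpadmin/failed_upgrade_$(date +%Y%m%d%H%M%S)"
```

Moving the items rather than deleting them keeps the failed run's state available if Pivotal Support asks for it.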

  1. As gpadmin, source the path file from your old 4.2.x.x installation. For example:
$ source /usr/local/greenplum-db-4.2.6.3/greenplum_path.sh 

On a DCA system, the path to the file might be similar to /usr/local/GP-4.2.6.3/greenplum_path.sh, depending on the installed version.

  2. Optional but strongly recommended: Back up all databases in your Pivotal Greenplum Database system using gpcrondump. See the Pivotal Greenplum Database Administrator Guide for more information on how to back up using gpcrondump. Make sure to secure your backup files in a location outside of your Greenplum data directories.
  3. If your system has a standby master host configured, remove the standby master from your system configuration. For example:
$ gpinitstandby -r -M fast
  4. Perform a clean shutdown of your current Pivotal Greenplum Database 4.2.x.x system. This example uses the -a option to disable confirmation prompts:
$ gpstop -af
  5. Install the new binaries.

Non-DCA: Install the Pivotal Greenplum Database 4.3 Software Binaries

Important: If you are installing Pivotal Greenplum Database 4.3 on a Pivotal DCA system, see "DCA: Install the Pivotal Greenplum Database 4.3 Software Binaries" below. This section is for installing Pivotal Greenplum Database 4.3 on non-DCA systems.

  • Download the installer file from www.pivotal.io, or copy it, to the Pivotal Greenplum Database master host. It is good practice to place it in the same location as the previous installation (by default, /usr/local/).
  • Unzip the installer file and make it executable. For example: 
# unzip greenplum-db-4.3.5.3-PLATFORM.zip
# chmod +x greenplum-db-4.3.5.3-PLATFORM.bin
  • Launch the installer using bash. For example:
# /bin/bash greenplum-db-4.3.5.3-PLATFORM.bin
  • The installer will prompt you to accept the Pivotal Greenplum Database license agreement. Type 'yes' to accept the license agreement.
  • The installer will prompt you to provide an installation path. Press ENTER to accept the default installation path (for example, /usr/local/greenplum-db-4.3.5.3), or enter an absolute path to an install location. You must have write permission to the location you specify.
  • The installer installs the Pivotal Greenplum Database software and creates a greenplum-db symbolic link one directory level above your version-specific Pivotal Greenplum installation directory. The symbolic link is used to facilitate patch maintenance and upgrades between versions. The installed location is referred to as $GPHOME.
  • Source the path file from your new 4.3.x.x installation. This example changes to the gpadmin user before sourcing the file:
# su - gpadmin 
$ source /usr/local/greenplum-db-4.3.5.3/greenplum_path.sh 
  • Run the gpseginstall utility to install the 4.3.5.3 binaries on all the segment hosts specified in the hostfile. For example: 
$ gpseginstall -f hostfile

DCA: Install the Pivotal Greenplum Database 4.3 Software Binaries

Important Note:

Skip this section if you are not installing Pivotal Greenplum Database 4.3 on a DCA system. This section is only for installing Pivotal Greenplum Database 4.3 on DCA systems. For installing the Greenplum binaries on non-DCA systems, refer to the section above.

  • Download or copy the installer file to the Pivotal Greenplum Database master host.
  • As root, run the Pivotal DCA installer for 4.3.x.x on the Pivotal Greenplum Database master host and specify the file hostfile that lists all hosts in the cluster. If necessary, copy hostfile to the directory containing the installer before running the installer. This example command runs the installer for Pivotal Greenplum Database 4.3.5.3 after giving execute permissions:
# chmod +x greenplum-db-appliance-4.3.5.3-build-1-RHEL5-x86_64.bin
# ./greenplum-db-appliance-4.3.5.3-build-1-RHEL5-x86_64.bin hostfile

The file hostfile is a text file that lists all hosts in the cluster, one host name per line.

  • Verify that the symbolic link (/usr/local/greenplum-db) points to the new installation on all hosts (master, standby, and segment hosts). If it does, go to the next step. If not, update the Pivotal Greenplum Database environment so that it references your new 4.3.x.x installation.

a. Update the greenplum-db symbolic link on the master and standby master to point to the new 4.3.5.3 installation directory. For example (as root):

# rm -rf /usr/local/greenplum-db
# ln -s /usr/local/greenplum-db-4.3.5.3 /usr/local/greenplum-db
# chown -R gpadmin /usr/local/greenplum-db

On a DCA system, the ln command would specify the install directory created by the DCA installer. For example:

# ln -s /usr/local/GP-4.3.5.3 /usr/local/greenplum-db

b. Using gpssh, also update the greenplum-db symbolic link on all of your segment hosts. For example (as root):

# gpssh -f segment_hosts_file
=> rm -rf /usr/local/greenplum-db
=> ln -s /usr/local/greenplum-db-4.3.5.3 /usr/local/greenplum-db
=> chown -R gpadmin /usr/local/greenplum-db
=> exit

On a DCA system, the ln command would specify the install directory created by the DCA installer. For example: 
=> ln -s /usr/local/GP-4.3.5.3 /usr/local/greenplum-db
  • Source the path file from your new 4.3.x.x installation. For example:
$ source /usr/local/greenplum-db-4.3.5.3/greenplum_path.sh 

On a DCA system, the path to the file would be similar to /usr/local/GP-4.3.5.3/greenplum_path.sh.
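
The symbolic link verification described above can be scripted along these lines. This is a sketch with a hypothetical helper name, link_points_to. It compares the literal link target, so a relative link will not match an absolute path even when it resolves to the same directory. The hostfile in the second usage line is the same one-host-per-line file used earlier.

```shell
# Hypothetical helper: succeeds when $1 is a symbolic link whose literal
# target is exactly $2.
link_points_to() (
    [ -L "$1" ] && [ "$(readlink "$1")" = "$2" ]
)

# Example usage on one host:
#   link_points_to /usr/local/greenplum-db /usr/local/greenplum-db-4.3.5.3 \
#       || echo "greenplum-db link needs updating"
# Example across all hosts (prints each host's link target for manual review):
#   gpssh -f hostfile -e 'readlink /usr/local/greenplum-db'
```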

  6. Optional but recommended: Before running the migration, perform a pre-upgrade check to verify that your database is healthy by executing the 4.3.5 version of the migration utility with the --check-only option. This example runs the gpmigrator_mirror utility as gpadmin:
$ gpmigrator_mirror --check-only /usr/local/greenplum-db-4.2.6.3 /usr/local/greenplum-db-4.3.5.3 

On a DCA system, the old GPHOME location might be similar to /usr/local/GP-4.2.6.3 (depending on the old installed version) and the new GPHOME location would be similar to /usr/local/GP-4.3.5.3.

  7. As gpadmin, run the 4.3.x.x version of the migration utility, specifying your old and new GPHOME locations. If your system has mirrors, use gpmigrator_mirror. If your system does not have mirrors, use gpmigrator. This starts the upgrade process. It is good practice to run the command with nohup so that it continues if the client-server network connection drops. For example, on a system with mirrors:
$ nohup gpmigrator_mirror /usr/local/greenplum-db-4.2.6.3 /usr/local/greenplum-db-4.3.5.3 -a & 

On a DCA system, the old GPHOME location might be similar to /usr/local/GP-4.2.6.3 (depending on the old installed version) and the new GPHOME location would be similar to /usr/local/GP-4.3.5.3.
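
The choice between gpmigrator_mirror and gpmigrator in the step above depends only on whether the cluster has mirror segments. The sketch below shows one way to make that choice. pick_migrator is a hypothetical helper, and the catalog query in the usage lines is an assumption about gp_segment_configuration (role 'm' with content >= 0 identifying mirror segments). The query can only run while the 4.2.x.x system is still up, that is, before the shutdown step.

```shell
# Hypothetical helper: prints the migration utility to use for a cluster
# that has $1 mirror segments.
pick_migrator() {
    if [ "$1" -gt 0 ]; then
        echo gpmigrator_mirror
    else
        echo gpmigrator
    fi
}

# Example usage (run before gpstop, as gpadmin on the master):
#   mirrors=$(psql -d template1 -Atc "SELECT count(*) FROM gp_segment_configuration WHERE role = 'm' AND content >= 0")
#   echo "Use: $(pick_migrator "$mirrors")"
```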

If the database is shut down and there is no activity in the logs, the $MASTER_DATA_DIRECTORY may be large and still being copied offline. The upgrade process resumes once the copy completes.

Note: If you experience issues during the migration process and have active entitlements for Pivotal Greenplum Database that were purchased through Pivotal, contact Pivotal Support. Information for contacting Pivotal Support is at https://support.pivotal.io. Be prepared to provide the following information:

  • A completed Upgrade Procedure.
  • Log output from gpmigrator and gpcheckcat (located in ~/gpAdminLogs)
  8. The migration can take a while to complete. After the migration utility has completed successfully, the Pivotal Greenplum Database 4.3.x.x system will be running and accepting connections.

Note: After the migration utility has completed, the resynchronization of the mirror segments with the primary segments continues. Even though the system is running, the mirrors are not active until the resynchronization is complete. You can check the status and ETA on recovery completion using the following command:

$ gpstate -e 
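
If you would rather poll for the end of resynchronization than rerun gpstate -e by hand, a loop such as the following can be used. This is a sketch only: wait_until_zero and count_unsynced are hypothetical names, and the probe query assumes the 4.3 gp_segment_configuration catalog, where mode 's' means synchronized and status 'u' means up. Treat gpstate -e as the authoritative check.

```shell
# Hypothetical helper: runs the probe command given after the first two
# arguments every $1 seconds, up to $2 times, until it prints exactly "0".
wait_until_zero() (
    interval=$1 tries=$2
    shift 2
    while [ "$tries" -gt 0 ]; do
        # "$@" is the probe; it should print the number of unsynchronized segments.
        [ "$("$@")" = "0" ] && exit 0
        tries=$((tries - 1))
        [ "$tries" -gt 0 ] && sleep "$interval"
    done
    exit 1
)

# Example probe and usage (as gpadmin on the master):
#   count_unsynced() {
#       psql -d template1 -Atc "SELECT count(*) FROM gp_segment_configuration WHERE mode <> 's' OR status <> 'u'"
#   }
#   wait_until_zero 60 120 count_unsynced && echo "mirrors are synchronized"
```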

Post-Upgrade Steps (on your 4.3.x.x system)

  1. If your system had a standby master host configured, reinitialize your standby master using gpinitstandby:
$ gpinitstandby -s standby_hostname
  2. If your system uses external tables with gpfdist, stop all gpfdist processes on your ETL servers and reinstall gpfdist using the compatible Pivotal Greenplum Database 4.3.x Load Tools package. Application Packages are available at Pivotal Network. For information about gpfdist, see the Pivotal Greenplum Database 4.3 Administrator Guide here.
  3. Rebuild any custom modules against your 4.3.5.3 installation (for example, any shared library files for user-defined functions in $GPHOME/lib). See your operating system documentation and consult your System Administrator for information about rebuilding and compiling modules such as shared libraries.
  4. Use the Greenplum Database gppkg utility to install Pivotal Greenplum Database extensions. If you were previously using any Pivotal Greenplum Database extensions such as pgcrypto, PL/R, PL/Java, PL/Perl, and PostGIS, download the corresponding packages from Pivotal Network and install them using this utility. See the Pivotal Greenplum Database 4.3 Utility Guide here for gppkg usage details.
  5. If you want to use the Pivotal Greenplum Command Center management tool, install the latest Command Center Console and update your environment variable to point to the latest Command Center binaries (source the gpperfmon_path.sh file from your new installation). See the Greenplum Command Center documentation for information about installing and configuring Greenplum Command Center.

Note: The Greenplum Command Center management tool replaces Pivotal Greenplum Performance Monitor. Command Center Console packages are available in Pivotal Network.

  6. Optional: Check the status of Pivotal Greenplum Database. For example, you can run the Greenplum Database utility gpstate to display status information for the running Pivotal Greenplum Database system.
$ gpstate
  7. Inform all database users of the completed upgrade. Ask users to update their environment to source the Pivotal Greenplum Database 4.3.x.x installation (if necessary).
