Pivotal Knowledge Base


How to run recovery for failed segments on Greenplum Database or HAWQ

Goal

This article covers the steps required to recover segments that are currently marked down.

Solution

For details on how to analyze why the segments went down, refer to the article: Analysis before recovering segments.

Note: All of the commands below must be run from the master server as the gpadmin user.

  1. Identify the segments that are currently marked down:
gpstate -e
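The same check can also be made directly against the Greenplum catalog. A minimal sketch follows; gp_segment_configuration and status = 'd' (down) are standard Greenplum catalog details, while the column selection and variable name are illustrative:

```shell
# Sketch: list down segments straight from the catalog instead of gpstate -e.
# gp_segment_configuration and status = 'd' (down) are real Greenplum catalog
# details; run the query via psql as gpadmin on the master.
DOWN_SEGMENTS_SQL="SELECT dbid, content, role, hostname, port
  FROM gp_segment_configuration
 WHERE status = 'd'
 ORDER BY content;"

# On a live cluster:
#   psql -d postgres -c "$DOWN_SEGMENTS_SQL"
echo "$DOWN_SEGMENTS_SQL"
```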
  2. Initiate an incremental or a full recovery, depending on the requirement. Prefer running an incremental recovery first; if it fails, review the segment logs to understand the cause of the failure and decide whether a full recovery is required.
gprecoverseg -a   (incremental recovery)

OR

gprecoverseg -aF  (full recovery)
  3. The above command returns control to the user, usually within 10-15 minutes or less. However, if there is a large amount of change tracking logs, or a huge number of CREATE / DROP / ALTER statements have been run on the database since the mirror went down, it can take longer.
  4. Once control is returned, you can view the progress of the resynchronization activity using:
gpstate -e
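The polling above can be scripted. The sketch below is illustrative, not a Pivotal utility: wait_for_resync is a hypothetical helper, and the grep pattern is an assumption that gpstate -e mentions "resynchroniz" (as in "Resynchronizing") while a mirror is still catching up; adjust it to your GPDB version's actual output.

```shell
# Sketch: poll a status command until its output no longer mentions
# resynchronization. wait_for_resync is a hypothetical helper, not a Greenplum
# utility; the grep pattern is an assumption about gpstate -e output.
wait_for_resync() {
    status_cmd=$1     # command printing segment status, e.g. "gpstate -e"
    interval=$2       # seconds to sleep between polls
    while $status_cmd | grep -qi "resynchroniz"; do
        sleep "$interval"
    done
    echo "resynchronization complete"
}

# On a live cluster, as gpadmin on the master:
#   wait_for_resync "gpstate -e" 60
```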
  5. (Optional) If gpstate -e reports that there are segments which switched mirror roles, you will need to run a rebalance operation to bring the segments back to their preferred roles:
gprecoverseg -r

OR

gpstop -raf

Note: HAWQ does not have mirrors, so this step is only applicable for GPDB.
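The role check behind this optional step can also be done in SQL. In the sketch below, role and preferred_role are real gp_segment_configuration columns; the variable name and wrapper are illustrative:

```shell
# Sketch: count segments acting outside their preferred role. A non-zero count
# means a rebalance (gprecoverseg -r) would change something. role and
# preferred_role are real catalog columns; the variable name is illustrative.
NEEDS_REBALANCE_SQL="SELECT count(*)
  FROM gp_segment_configuration
 WHERE role <> preferred_role;"

# On a live cluster, as gpadmin on the master:
#   n=$(psql -d postgres -A -t -c "$NEEDS_REBALANCE_SQL")
#   [ "$n" -gt 0 ] && gprecoverseg -r
echo "$NEEDS_REBALANCE_SQL"
```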

Comments

  • Steve Jones

    I've been told by support that gprecoverseg -r is not safe, and that a restart should be used. Has this issue been resolved?

  • Faisal Ali

    Hello Steve,

    I guess you are referring to the issue mentioned in the article here

    https://support.pivotal.io/hc/en-us/articles/202459726-Persistent-table-corruption-during-incremental-recovery

    Yes, that was resolved long ago. If you are on an affected version, then a restart is best to avoid any catalog corruption.

    PLEASE NOTE: During a rebalance, existing running queries will fail, since there is no transaction failover yet on Greenplum. It is therefore recommended to rebalance when the system is quiet or not running any critical jobs.
