Pivotal Knowledge Base

Follow

HowTo - Use gpdbrestore tool with gp_dump created dump file and vice versa.

Goal

This guide would give you steps on how you can use the dump created from gp_dump to be used with gpdbrestore and vice versa.

Solution

As you are aware from the admin guide. gp_dump created backup file uses gp_restore to restore it, gpcrondump created backup files uses gpdbrestore to restore the contents. gpcrondump internally uses gp_dump to take its backup, So theoretically both gpdbrestore and gp_restore can be used alternatively since the format should be the same , below are the steps on how to make a gpdbrestore restore the dump created from gp_dump..

-- Step 1:

  • Create a dump file using gp_dump.
    gpadmin:Fullrack@mdw $ gp_dump --gp-c test
    20130515:03:06:41|gp_dump-[INFO]:-Read params:<empty>
    20130515:03:06:41|gp_dump-[INFO]:-Command line options analyzed.
    20130515:03:06:41|gp_dump-[INFO]:-Connecting to master database on host localhost port 8532 database test.
    20130515:03:06:41|gp_dump-[INFO]:-Reading Greenplum Database configuration info from master database.
    20130515:03:06:41|gp_dump-[INFO]:-Preparing to dump the following segments:
    20130515:03:06:41|gp_dump-[INFO]:-Segment 23 (dbid 25)
    ........
    ..........
    ..........
    20130515:03:06:43|gp_dump-[INFO]:-backup succeeded for dbid 5 on host sdw10
    20130515:03:06:43|gp_dump-[INFO]:-backup succeeded for dbid 7 on host sdw10
    20130515:03:06:43|gp_dump-[INFO]:-backup succeeded for dbid 23 on host sdw9
    20130515:03:06:43|gp_dump-[INFO]:-backup succeeded for dbid 22 on host sdw9
    20130515:03:06:43|gp_dump-[INFO]:-All remote gp_dump_agent programs are finished.
    20130515:03:06:43|gp_dump-[INFO]:-Report results also written to /data/master_fai4240_3/gpseg-1/gp_dump_20130515030641.rpt.
  • So as you can see the dump files are created at /data/master_fai4240_3/gpseg-1 at master.
  • Similarly on the primary segments the location should be  <primary home location>
  • You can override the default location of the backup using --gp-d while running the gp_dump backup .

-- Step 2:

  • Now since the gpdbrestore identify the location at $MASTER_DATA_DIRECTORY/db_dumps/<timestamp> to restore , the above location created by gp_dump will not work. if you attempt to restore using gpdbrestore , you might encounter
    gpadmin:Fullrack@mdw $ gpdbrestore -t 20130515030641
    20130515:03:38:21:025699 gpdbrestore:mdw:gpadmin-[INFO]:-Starting gpdbrestore with args: -t 20130515030641
    20130515:03:38:21:025699 gpdbrestore:mdw:gpadmin-[ERROR]:-gpdbrestore error: Dump file gp_cdatabase_1_1_20130515030641 does not exist on Master
    gpadmin:Fullrack@mdw $
  • So now we will have to tweak the location a little bit to ensure that our gpdbrestore script can identify the dump . First ensure that gp_dump has created the below three files at the master location.
    gpadmin:Fullrack@mdw $ ls -ltr | grep 20130515030641
    -rw------- 1 gpadmin root   110 May 15 03:06 gp_cdatabase_1_1_20130515030641
    -rw------- 1 gpadmin root   421 May 15 03:06 gp_dump_1_1_20130515030641.gz
    -rw------- 1 gpadmin root   190 May 15 03:06 gp_dump_1_1_20130515030641_post_data.gz

      where

           gp_cdatabase_1_1_20130515030641 - contain the script to recreate the database

           gp_dump_1_1_20130515030641.gz - Contain the schema dump

           gp_dump_1_1_20130515030641_post_data.gz - contain the other objects ( like index etc ) related to the table to be created after restoring the data .

  • If the above files are not created by gp_dump then use the below command to create them at the master. NOTE : In this case ( where the above 3 files are missing ) you will need to create the database manually and restore the schema definition from somewhere using previous dump or the table definition should be available in the database being restored.
    touch gp_cdatabase_1_1_<timestamp_of_dump_at_primary_location>
    touch gp_dump_1_1__<timestamp_of_dump_at_primary_location>
    touch gp_dump_1_1_<timestamp_of_dump_at_primary_location>_post_data
  • If at the primary location the dumps are zipped then you will need to zip the two created files ( below ) as well.
    gzip  gp_dump_1_1__<timestamp_of_dump_at_primary_location>
    gzip  gp_dump_1_1_<timestamp_of_dump_at_primary_location>_post_data
  • Now login to the database using psql and run the below query .
    select
        'ssh '|| hostname || ' ln -s ' ||  fselocation || '/' ||
        fselocation || '/db_dumps/20130515'
    from
        gp_segment_configuration gpseg ,
        pg_filespace_entry fse
    where
        gpseg.dbid=fse.fsedbid
    and gpseg.role='p'
    and gpseg.content != -1;

where the first "fselocation" is the location where the dump are located , if its somewhere else in your case , please modify the above query and " /db_dumps/<date> " is the destination to where the gpdbrestore will look for the files , so make sure you enter the correct date of the dump Once the script is run at you end , you will end up with something like this.

    ssh sdw10 ln -s /data1/primary_fai4240_3/gpseg0/ /data1/primary_fai4240_3/gpseg0/db_dumps/20130515
    ssh sdw11 ln -s /data1/primary_fai4240_3/gpseg6/ /data1/primary_fai4240_3/gpseg6/db_dumps/20130515
    ssh sdw12 ln -s /data1/primary_fai4240_3/gpseg12/ /data1/primary_fai4240_3/gpseg12/db_dumps/20130515
    ....
    .....
  • Create a sh file and run them from master
  • on master also run" ln -s /data/master_fai4240_3/gpseg-1 /data/master_fai4240_3/gpseg-1/db_dumps/20130515
  • I have used 20130515 as the date folder , since my command is going to restore for the timestamp 20130515030641 , so that initial 8 numbers tells you the date under which gpdbrestore will be look for the dump.

-- Step 3:

  • So once we have the setup ready , run the command to restore using the gpdbrestore.
    gpadmin:Fullrack@mdw $ gpdbrestore -t 20130515030641
    20130515:03:20:32:021103 gpdbrestore:mdw:gpadmin-[INFO]:-Starting gpdbrestore with args: -t 20130515030641
    20130515:03:20:33:021103 gpdbrestore:mdw:gpadmin-[INFO]:-------------------------------------------
    20130515:03:20:33:021103 gpdbrestore:mdw:gpadmin-[INFO]:-Greenplum database restore parameters
    20130515:03:20:33:021103 gpdbrestore:mdw:gpadmin-[INFO]:-------------------------------------------
    20130515:03:20:33:021103 gpdbrestore:mdw:gpadmin-[INFO]:-Restore type               = Full Database
    20130515:03:20:33:021103 gpdbrestore:mdw:gpadmin-[INFO]:-Database to be restored    = test
    20130515:03:20:33:021103 gpdbrestore:mdw:gpadmin-[INFO]:-Drop and re-create db      = Off
    20130515:03:20:33:021103 gpdbrestore:mdw:gpadmin-[INFO]:-Restore method             = Restore specific timestamp
    20130515:03:20:33:021103 gpdbrestore:mdw:gpadmin-[INFO]:-Restore timestamp          = 20130515030641
    20130515:03:20:33:021103 gpdbrestore:mdw:gpadmin-[INFO]:-Restore compressed dump    = On
    20130515:03:20:33:021103 gpdbrestore:mdw:gpadmin-[INFO]:-Restore global objects     = Off
    20130515:03:20:33:021103 gpdbrestore:mdw:gpadmin-[INFO]:-Array fault tolerance      = n
     
    Continue with Greenplum restore Yy|Nn (default=N):
    > y
    20130515:03:20:38:021103 gpdbrestore:mdw:gpadmin-[INFO]:-gp_restore -i -h mdw -p 8532 -U gpadmin --gp-d=db_dumps/20130515 --gp-i --gp-k=20130515030641 --gp-r=db_dumps/20130515 --gp-l=p --gp-c -d test
    20130515:03:20:44:021103 gpdbrestore:mdw:gpadmin-[INFO]:-Commencing analyze of test database, please wait
    20130515:03:20:53:021103 gpdbrestore:mdw:gpadmin-[INFO]:-Analyze of test completed without error  

Please ensure you use the " -t " option , since the above tweak cannot be used with option " -s " , since when gpdbrestore is used with " -s "it will be looking for the previous generated backup with gpcrondump.

Similarly you can use the gpcrondump dump file with gp_restore with reversing the above steps to the location where gp_restore understands and finds its dump .

Comments

Powered by Zendesk