Pivotal Knowledge Base

Follow

How to replicate table from a database/cluster to different database/cluster

Goal

Is to provide some easy method to move table within clusters or database without the hassle of taking backup and restore , it very useful if any developers wants a replica of the original tables for their development and testing on different/another database.

Note : Please verify the script on a test cluster , before running it on production .

Solution

The script that you would need to replicate the table quickly to another database is

pg_dump -t <schema-name.source-table-name> -h <source-host-name> <source-database-name> | psql -h <destination-host-name> -U <destination-user> -p <destination-port> <destination-database>

NOTE: If you wish you replicate the entire database , would recommend to have a look at the link here

Example

Let take a simple example to understand the above example.

  • Firstly Creating database / table / loading data into it / generating source database.

-- Source database

gpadmin=# create database d1;
CREATE DATABASE
Time: 8855.715 ms
gpadmin=# \c d1
You are now connected to database "d1" as user "gpadmin".

-- Creating the table to replicate and loading data to it.

d1=# create table t1 as select * from pg_attribute;
NOTICE:  Table doesn't have 'DISTRIBUTED BY' clause -- Using column(s) named 'attrelid' as the Greenplum Database data distribution key for this table.
HINT:  The 'DISTRIBUTED BY' clause determines the distribution of data. Make sure column(s) chosen are the optimal data distribution key to minimize skew.
SELECT 2998
Time: 1705.794 ms

-- Creating the destination database.

d1=# create database d2;
CREATE DATABASE
Time: 10753.638 ms
d1=# \q
  • Running the command to backup / load data simultaneously

As you can see the command took the backup and loaded them simultaneously

gpadmin:Fullrack@mdw $ pg_dump -C -t t1 d1 | psql d2
Timing is on.
Pager usage is off.
SET
Time: 184.239 ms
SET
Time: 0.740 ms
SET
Time: 0.677 ms
SET
Time: 0.652 ms
SET
Time: 0.598 ms
SET
Time: 0.635 ms
SET
Time: 1.565 ms
SET
Time: 0.640 ms
SET
Time: 0.601 ms
CREATE TABLE
Time: 66.210 ms
ALTER TABLE
Time: 18.538 ms
Time: 1247.149 ms

PLEASE NOTE: I'm loading this to different database on the same cluster so many parameter mentioned above was removed as not needed here, if you wish to replicate them on a different cluster ensure you have the pg_hba.conf updated in the source location with the correct information so that the psql can connect to the source host and database without any hassle.

  • Cross verification

Cross verification shows everything is good and the table is loaded on the destination database.

gpadmin:Fullrack@mdw $ psql d2
Timing is on.
Pager usage is off.
psql (8.2.15)
Type "help" for help.
d2=# \d
             List of relations
 Schema | Name | Type  |  Owner  | Storage
--------+------+-------+---------+---------
 public | t1   | table | gpadmin | heap
(1 row)
d2=# select count(*) from t1;
 count
-------
  2998
(1 row)
Time: 82.246 ms

TIP

  • If the source table has many grant permission then the pg_dump also will run them on the destination destination , to eliminate the grant ( permission part) you can use the command such as below.
pg_dump -t <schema-name.source-table-name> -h <source-host-name> <source-database-name> | egrep -v "GRANT|REVOKE" | psql -h <destination-host-name> -U <destination-user> -p <destination-port> <destination-database>
  • To load just the data use pg_dump with -a option.
pg_dump -a -t <schema-name.source-table-name> -h <source-host-name> <source-database-name> | psql -h <destination-host-name> -U <destination-user> -p <destination-port> <destination-database>
  • For more option on what you can use along with the above mentioned command you can refer to the help from pg_dump
pg_dump --help

Alternative Method

If the data set is very huge, check gptransfer utility, here is an example on how to use it.

You can also use psql to do the same task , make sure the DDL of the table is available on the destination.

psql -c "copy (select * from <source-table>) to stdout" <source-database> | psql -c "copy <destination-table> from stdin" <destination database>

Related articles

Comments

Powered by Zendesk