Pivotal Knowledge Base


How to Replicate Table from a Database or Cluster to different Database or Cluster



Is to provide some easy method to move table within clusters or database without the hassle of taking backup and restore, it very useful if any developers want a replica of the original tables for their development and testing on different/another database.

Note- Please verify the script on a test cluster , before running it on production .


The script that you would need to replicate the table quickly to another database is

pg_dump -t <schema-name.source-table-name> -h <source-host-name> <source-database-name> | psql -h <destination-host-name> -U <destination-user> -p <destination-port> <destination-database>

Note- If you wish you replicate the entire database , would recommend to have a look at the link here


Let take a simple example to understand the above example.

  • Firstly Creating database / table / loading data into it / generating source database.

-- Source database

gpadmin=# create database d1;
Time: 8855.715 ms
gpadmin=# \c d1
You are now connected to database "d1" as user "gpadmin".

-- Creating the table to replicate and loading data to it.

d1=# create table t1 as select * from pg_attribute;
NOTICE:  Table doesn't have 'DISTRIBUTED BY' clause -- Using column(s) named 'attrelid' as the Greenplum Database data distribution key for this table.
HINT:  The 'DISTRIBUTED BY' clause determines the distribution of data. Make sure column(s) chosen are the optimal data distribution key to minimize skew.
Time: 1705.794 ms

-- Creating the destination database.

d1=# create database d2;
Time: 10753.638 ms
d1=# \q
  • Running the command to backup / load data simultaneously

As you can see the command took the backup and loaded them simultaneously

gpadmin:Fullrack@mdw $ pg_dump -C -t t1 d1 | psql d2
Timing is on.
Pager usage is off.
Time: 184.239 ms
Time: 0.740 ms
Time: 0.677 ms
Time: 0.652 ms
Time: 0.598 ms
Time: 0.635 ms
Time: 1.565 ms
Time: 0.640 ms
Time: 0.601 ms
Time: 66.210 ms
Time: 18.538 ms
Time: 1247.149 ms

Note- This is loaded to a different database on the same cluster. So, many parameters that are mentioned above were removed as they are not needed here. If you wish to replicate them on a different cluster ensure you have the pg_hba.conf updated in the source location with the correct information so that the psql can connect to the source host and database without any hassle.

  • Cross verification

Cross verification shows everything is good and the table is loaded on the destination database.

gpadmin:Fullrack@mdw $ psql d2
Timing is on.
Pager usage is off.
psql (8.2.15)
Type "help" for help.
d2=# \d
             List of relations
 Schema | Name | Type  |  Owner  | Storage
 public | t1   | table | gpadmin | heap
(1 row)
d2=# select count(*) from t1;
(1 row)
Time: 82.246 ms


  • If the source table has many grant permission then the pg_dump also will run them on the destination destination , to eliminate the grant ( permission part) you can use the command such as below.
pg_dump -t <schema-name.source-table-name> -h <source-host-name> <source-database-name> | egrep -v "GRANT|REVOKE" | psql -h <destination-host-name> -U <destination-user> -p <destination-port> <destination-database>
  • To load just the data use pg_dump with -a option.
pg_dump -a -t <schema-name.source-table-name> -h <source-host-name> <source-database-name> | psql -h <destination-host-name> -U <destination-user> -p <destination-port> <destination-database>
  • For more option on what you can use along with the above mentioned command you can refer to the help from pg_dump
pg_dump --help

Additional Information

If the data set is very huge, check gptransfer utility, here is an example on how to use it.

You can also use psql to do the same task , make sure the DDL of the table is available on the destination.

psql -c "copy (select * from <source-table>) to stdout" <source-database> | psql -c "copy <destination-table> from stdin" <destination database>

Related articles


Powered by Zendesk