Pivotal Knowledge Base


GPDBRESTORE Failed to Restore Some Table(s): "ERROR , 42701 , Duplicate Column" in Master Logs


 Product  Version
 Pivotal Greenplum or higher
 Other  DCA V1/V2


After restoring a Greenplum Database, the user might find that there are one or more tables missing from the restored database. 

The master logs for these tables indicate that the table failed to create due to "ERROR,""42701","duplicate column:

2014-04-15 21:06:07.026452 CDT,"gpadmin","xxxxxxx",p561596,th198993664,"","25843",2014-04-15 21:04:11 CDT,411289,con88846,cmd7705,seg-1,,dx316034,x411289,sx1,"ERROR","42701","duplicate column ""acct_id"" in DISTRIBUTED BY clause",,,,,,"CREATE TABLE pl_acct_mtd_vol (    comments character(25)) DISTRIBUTED BY (prd_end_date ,acct_id ,acct_port_id ,acct_id ,contr_date ,record_seq);",0,,"postgres.c",1543,


Database restore on 4.2.1.x and higher versions fail to restore the database from backup when distribution columns in the dump are duplicated. This is because the dump in version of Greenplum does not check for the dumped content or distribution policy, however, when the same dump is used to restore the database, CREATE TABLE executed by restore validates and disqualifies the CREATE TABLE statement which has duplicated distribution columns. 

This can occur when a customer has upgraded from as this version allows for a user to create a table with duplicated distribution columns. For example:

gpadmin=# \d rexall
Table "public.rexall"
Column | Type | Modifiers
col1 | integer |
col2 | money |
col3 | character varying |
Distributed by: (col1,col1)


  • Identify the table with duplicated columns by reviewing the master log file for errors like the following:
"ERROR","42701","duplicate column "
  • Modify the source distribution key by redistributing the table:
ALTER TABLE rexall SET redistributed by (col1);
  • Perform the dump again for only this and similar tables and then restore to target.


Powered by Zendesk