Pivotal Knowledge Base

Follow

UNIQUE btree index lost after restoring Greenplum from an older version

Environment

Pivotal Greenplum: 4.3.x

OS: RHEL 6.x

Symptom

When migrating the Greenplum system (release 4.3.3.1) from one platform to a new platform (installed latest Greenplum software release 4.3.13.0), it's found that the unique btree index for one table disappeared after migration. 

Detailed Messages:

Table mcc_mart.dim_issuer_instalments has one primary key index and one UNIQUE btree index before migration.

uatmccdwh001=# \d mcc_mart.dim_issuer_instalments
Table "mcc_mart.dim_issuer_instalments"
Column | Type | Modifiers
--------------------------------+-----------------------------+-----------
instalment_id | integer | not null
instalment_bkey | character(30) | not null
data_source_id | integer | not null
......
Indexes:
"dim_issuer_instalments_pkey" PRIMARY KEY, btree (instalment_id)
"dim_issuer_instalments_instalment_bkey_key" UNIQUE, btree (instalment_bkey, data_source_id)
......
Distributed by: (instalment_bkey, data_source_id) uatmccdwh001=# \di mcc_mart.dim_issuer*
List of relations
Schema | Name | Type | Owner | Storage | Table
----------+--------------------------------------------+-------+---------+---------+------------------------
mcc_mart | dim_issuer_instalments_instalment_bkey_key | index | gpadmin | heap | dim_issuer_instalments
mcc_mart | dim_issuer_instalments_pkey | index | gpadmin | heap | dim_issuer_instalments
(2 rows)

After migration, it has only one primary key index

prdmccdwh001=# \d mcc_mart.dim_issuer_instalments
Table "mcc_mart.dim_issuer_instalments"
Column | Type | Modifiers
--------------------------------+-----------------------------+-----------
instalment_id | integer | not null
instalment_bkey | character(30) | not null
data_source_id | integer | not null
......
Indexes:
"dim_issuer_instalments_pkey" PRIMARY KEY, btree (instalment_id)
......
Distributed by: (instalment_id) prdmccdwh001=# \di mcc_mart.dim_issuer*
List of relations
Schema | Name | Type | Owner | Storage | Table
----------+-----------------------------+-------+---------+---------+------------------------
mcc_mart | dim_issuer_instalments_pkey | index | gpadmin | heap | dim_issuer_instalments
(1 row)

Cause  

Since Greenplum database release 4.3.5.0 enhancement was introduced to enforce primary/distribution key restriction, the primary key and the distribution key must be the same.

For this particular case, when table mcc_mart.dim_issuer_instalments was created on 4.3.3.1, it is still possible to define primary key that is different from distribution key. However, when the DDL of this table was exported with pq_dump/gpcrondump and imported to the release 4.3.5.0 onwards, this table won't be created exactly as it was before due to the primary/distribution key restriction. 

Following is the exported DDL of the table mcc_mart.dim_issuer_instalments

CREATE TABLE dim_issuer_instalments (
instalment_id integer NOT NULL,
instalment_bkey character(30) NOT NULL,
data_source_id integer NOT NULL,
......
) DISTRIBUTED BY (instalment_bkey ,data_source_id); ALTER TABLE mcc_mart.dim_issuer_instalments OWNER TO gpadmin; --
-- Name: dim_issuer_instalments_pkey; Type: CONSTRAINT; Schema: mcc_mart; Owner: gpadmin; Tablespace:
-- ALTER TABLE ONLY dim_issuer_instalments
ADD CONSTRAINT dim_issuer_instalments_pkey PRIMARY KEY (instalment_id);
--
-- Name: instalments_unique_key; Type: CONSTRAINT; Schema: mcc_mart; Owner: gpadmin; Tablespace:
-- ALTER TABLE ONLY dim_issuer_instalments
ADD CONSTRAINT instalments_unique_key UNIQUE (instalment_bkey, data_source_id);

Messages for importing the DDL

CREATE TABLE
ALTER TABLE
psql:3.sql:64: NOTICE:  updating distribution policy to match new primary key
psql:3.sql:64: NOTICE:  ALTER TABLE / ADD PRIMARY KEY will create implicit index "dim_issuer_instalments_pkey" for table "dim_issuer_instalments"
ALTER TABLE
psql:3.sql:72: ERROR:  UNIQUE index must contain all columns in the distribution key of relation "dim_issuer_instalments"

As shown in the above messages, distribution key will be forcefully updated to match the primary key. And the UNIQUE index will fail to be created due to a conflict with the new distribution key.

Resolution

Options to resolve this issue:

  • Due to the primary/distribution key restriction, the table's DDL needs to be rectified first after being exported with pg_dump/gpcrondump to make sure the restriction is conformed. Then it could be imported to the Greenplum system
  • Or, if the DDL in the old system must be used, then the following workaround might be considered. Please keep in mind that the same issue will happen again with the next backup/restore operation once this workaround is applied. 
    • Restore system on the new platform with backup from the old one
    • Stop the database and downgrade the Greenplum binary back to the previous release (older than 4.3.5.0) after the restore completes on a new platform
    • Start up the database and alter the distribution policy and add UNIQUE index for the table to make it same as before. NOTE: NO other changes should be done when the system is on an old release
    • Stop the database and change the Greenplum binary back to the new release
    • Start up the database

 

Comments

Powered by Zendesk