Pivotal Knowledge Base


Non-Deterministic Updates are not Supported


Pivotal Greenplum Database (GPDB) 4.3.x


ERROR: Multiple updates to a row by a single query is not supported


  1. A question you might ask yourself is Why was this implemented in GPDB Pivotal Query Optimizer? All other databases that I work with allow this option (including postgres and Greenplum prior to Pivotal Query Optimizer)

    Imagine this scenario, we want to join two tables:

    CREATE TABLE T1 ( a int, b int );
    CREATE TABLE T2 ( c int, d int );

    Then populate these tables with the following input:

    INSERT INTO T1 VALUES ( 1, 2 );
    INSERT INTO T2 VALUES ( 1, 7 );
    INSERT INTO T2 VALUES ( 1, 4 );

    The following query will try to update the value T1.b with the joining tuples of T2:

    UPDATE T1 SET  b = d  FROM T2 WHERE T1.a = T2.c ;

    What should the value of the only joining tuple in T1 be? (1,7) or (1,4)?  Given this, we decided to disallow queries that have a non-deterministic output.

     It is true that some systems (including our previous Postgres-based semantics) allow performing updates under such circumstances. However, we decided not to support anymore these types of queries for their ambiguous nature (especially in an MPP setting). In addition, when we extended the DML framework to support updates on the distribution key and the partitioning key, we concluded that allowing the above non-deterministic behavior is error-prone, as it may introduce duplicates that are expensive to detect and prevent.

  2. What is the scope of the error? Because it seems to work for the GPLOAD and psql interactive and not during a psql with file option.

    The enforcement of the semantics is extensible to all UPDATE statements that run under the new Pivotal Query Optimizer (e.g. psql or JDBC). If the semantics are not enforced under a particular scenario (please review the state of Optimizer GUC in a non-working scenario), please report this incidence to us.


Powered by Zendesk