- Pivotal Greenplum Database (GPDB) 4.3.x
- Operating System- Red Hat Enterprise Linux 6.x
When attempting to run a non-deterministic update on a table with the Pivotal Optimizer (SET optimizer=on), an error occurs.
Note- These entries will come directly from the client tool in use (eg. psql):
ERROR: multiple updates to a row by the same query is not allowed
The Pivotal Query Optimizer, unlike the legacy optimizer, does not allow non-deterministic updates. This ensures that only a specific unique row is updated and not an undetermined one when there are multiple matches.
Consider the following:
gpadmin=# create table table01 ( a int, b int) distributed by (a); CREATE TABLE gpadmin=# create table table02 ( x int, y int) distributed by (x); CREATE TABLE gpadmin=# INSERT INTO table01 values (1, 1),(2,1),(3,1); INSERT 0 3 gpadmin=# INSERT INTO table02 values (1, 10),(1,11),(2,20),(3,30),(1,12); INSERT 0 5 gpadmin=# select * from table01 order by 1; a | b ---+--- 1 | 1 2 | 1 3 | 1 (3 rows) gpadmin=# select * from table02 order by 1; x | y ---+---- 1 | 11 1 | 10 1 | 12 2 | 20 3 | 30 (5 rows) gpadmin=# set optimizer=on; SET gpadmin=# UPDATE table01 t1 SET b=t2.y from table02 t2 where t1.a = t2.x; ERROR: multiple updates to a row by the same query is not allowed (seg0 gpdb:40000 pid=79372)
Since table03.x has multiple "1" values in it, when the update occurs which row would be updated? Since this is a non-deterministic update the error is produced.
However, the legacy optimizer will proceed to update a value but in this scenario, it could update table01 with any value from table02 where x=1:
gpadmin=# set optimizer=off; SET gpadmin=# UPDATE table01 t1 SET b=t2.y from table02 t2 where t1.a = t2.x; UPDATE 3 test4352=# select * from table01 order by 1; a | b ---+---- 1 | 10 2 | 20 3 | 30 (3 rows)
This error and activity are documented behavior for the Pivotal Optimizer.
See http://gpdb.docs.pivotal.io/4350/admin_guide/query/topics/query-piv-opt-changed.html for more information.Ensure uniqueness in your tables to make sure that there is not a non-deterministic update attempted. Building on the example above, one could check for uniqueness as below and can see that this kind of update has 3 potential rows that could be updated:
gpadmin=# SELECT x,count(*) FROM table02 a WHERE EXISTS (SELECT * FROM table01 b WHERE a.x=b.a) GROUP BY x HAVING count(*)>1; x | count ---+------- 1 | 3 (1 row)
Use a different predicate/WHERE clause to ensure only individual rows are updated.
Attempt using the legacy optimizer as a workaround.