Pivotal Knowledge Base


Dynamic Partition Elimination (DPE) with Pivotal Query Optimizer (GPORCA)

Environment

  • Pivotal Greenplum Database (GPDB) 4.3.x
  • Operating System: Red Hat Enterprise Linux 6.x

Introduction

The purpose of this document is to give insight into how Dynamic Partition Elimination works with Pivotal Query Optimizer (GPORCA) compared to Legacy Optimizer Partition Elimination.

Description

Consider the following query and resulting GPORCA plan:

mydb=# set optimizer=on;
SET
mydb=# EXPLAIN ANALYZE select * FROM my_schema.myfactsview AS myFacts LEFT OUTER JOIN schema_one.calendar_table AS myCalendarData ON myFacts.invoice_date = myCalendarData.end_dte WHERE myCalendarData.end_dte = '2017-05-31';
                                                              QUERY PLAN
------------------------------------------------------------------------------------------------------------------------------
 Hash Join  (cost=0.00..1181129.89 rows=315984024 width=330)
   Hash Cond: calendar_table.end_dte = myfactstable.invoice_date
   Rows out:  0 rows with 11 ms to end, start offset by 48 ms.
   ->  Gather Motion 3:1  (slice1; segments: 3)  (cost=0.00..431.44 rows=102 width=39)
         Rows out:  (No row requested) 0 rows at destination with 0 ms to end.
         ->  Table Scan on calendar_table  (cost=0.00..431.43 rows=34 width=39)
               Filter: end_dte = '2017-05-31'::date
               Rows out:  0 rows (seg0) with 1.680 ms to end, start offset by 62 ms.
   ->  Hash  (cost=13765.86..13765.86 rows=3111049 width=291)
         Rows in:  0 rows with 0.014 ms to end, start offset by 60 ms.
         ->  Gather Motion 3:1  (slice2; segments: 3)  (cost=0.00..13765.86 rows=9333146 width=291)
               Rows out:  0 rows at destination with 0.012 ms to end, start offset by 60 ms.
               ->  Sequence  (cost=0.00..3644.43 rows=3111049 width=291)
                     Rows out:  0 rows (seg0) with 2.794 ms to end, start offset by 60 ms.
                     ->  Partition Selector for myfactstable (dynamic scan id: 1)  (cost=10.00..100.00 rows=34 width=4)
                           Filter: myfactstable.invoice_date = '2017-05-31'::date
                           Partitions selected:  1 (out of 60)
                           Rows out:  0 rows (seg0) with 0.010 ms to end, start offset by 60 ms.
                     ->  Dynamic Table Scan on myfactstable (dynamic scan id: 1)  (cost=0.00..3644.43 rows=3111049 width=291)
                           Filter: invoice_date = '2017-05-31'::date
                           Rows out:  0 rows (seg0) with 2.784 ms to end, start offset by 60 ms.
                           Partitions scanned:  Avg 1.0 (out of 60) x 3 workers.  Max 1 parts (seg0).
 Slice statistics:
   (slice0)    Executor memory: 16797K bytes.
   (slice1)    Executor memory: 257K bytes avg x 3 workers, 257K bytes max (seg0).
   (slice2)    Executor memory: 3698K bytes avg x 3 workers, 3698K bytes max (seg0).
 Statement statistics:
   Memory used: 1048576K bytes
 Settings:  effective_cache_size=512MB; gp_cte_sharing=on; gp_enable_relsize_collection=on; optimizer=on
 Optimizer status: PQO version 2.39.0
 Total runtime: 59.773 ms
(31 rows)

Regarding Dynamic Partition Elimination shown above:

  • From the join condition myFacts.invoice_date = myCalendarData.end_dte combined with the filter myCalendarData.end_dte = '2017-05-31', GPORCA derives an additional predicate myFacts.invoice_date = '2017-05-31'
  • Partition elimination still takes place, but it happens at run time; this is known as Dynamic Partition Elimination (DPE). It is driven by the join condition myFacts.invoice_date = myCalendarData.end_dte
  • A "Dynamic Table Scan" in the EXPLAIN ANALYZE output indicates that dynamic partition elimination is taking place. Here, each segment scanned an average of 1.0 out of 60 partitions:
Dynamic Table Scan on myfactstable (dynamic scan id: 1)
(cost=0.00..3644.43 rows=3111049 width=291)
Filter: invoice_date = '2017-05-31'::date
Rows out: 0 rows (seg0) with 2.784 ms to end, start offset by 60 ms.
Partitions scanned: Avg 1.0 (out of 60) x 3 workers. Max 1 parts (seg0).
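
For reference, a minimal setup that produces this behavior might look like the following sketch (the table, distribution key, and partition ranges are illustrative, not the actual objects behind the plan above):

```sql
-- Hypothetical fact table, range-partitioned by month on invoice_date.
CREATE TABLE sales_facts (
    id           bigint,
    invoice_date date,
    amount       numeric
)
DISTRIBUTED BY (id)
PARTITION BY RANGE (invoice_date)
    (START (date '2017-01-01') INCLUSIVE
     END   (date '2022-01-01') EXCLUSIVE
     EVERY (INTERVAL '1 month'));

-- The partition key is constrained only through the join, so with
-- optimizer=on GPORCA can prune partitions at run time (DPE).
SET optimizer = on;
EXPLAIN ANALYZE
SELECT *
FROM sales_facts f
LEFT OUTER JOIN calendar c ON f.invoice_date = c.end_dte
WHERE c.end_dte = '2017-05-31';
```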

Notice:

"Partitions scanned:  Avg 1.0 (out of 60) x 3 workers.  Max 1 parts (seg0)."

How Dynamic Partition Elimination works:

For every Hash Join operation, the hash side is executed first. In the subtree below, we first scan calendar_table and apply the filter end_dte = '2017-05-31'::date. The resulting tuples from calendar_table are then fed to a Partition Selector operator, which decides which partitions of myfactstable need to be scanned:

Hash (cost=13765.86..13765.86 rows=3111049 width=291)
Partition Selector for myfactstable (dynamic scan id: 1) (cost=10.00..100.00 rows=34 width=4)
Filter: myfactstable.invoice_date = '2017-05-31'::date
Partitions selected: 1 (out of 60)
Rows out: 0 rows (seg0) with 0.010 ms to end, start offset by 60 ms.

Difference between Legacy Optimizer and GPORCA Optimizer Plans:

  • For this query, the Legacy Query Optimizer chooses a Nested Loop Join, while GPORCA chooses a Hash Join

  • The Legacy Query Optimizer performs "static" partition elimination, while GPORCA performs "dynamic" partition elimination
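
The static vs. dynamic distinction comes down to the query shape. As an illustrative sketch (assuming myfactstable is partitioned on invoice_date; the fiscal_year column is hypothetical):

```sql
-- Static partition elimination: the partition key is compared to a
-- literal, so partitions can be pruned at plan time.
SELECT *
FROM myfactstable
WHERE invoice_date = '2017-05-31';

-- Dynamic partition elimination: the qualifying invoice_date values are
-- known only after the calendar_table side of the join has executed, so
-- GPORCA defers pruning to run time via a Partition Selector operator.
SELECT *
FROM myfactstable f
JOIN calendar_table c ON f.invoice_date = c.end_dte
WHERE c.fiscal_year = 2017;  -- fiscal_year is a hypothetical column
```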

The same query when run with the Legacy Optimizer:

mydb=# set optimizer=off;
SET
mydb=# EXPLAIN ANALYZE select * FROM my_schema.myfactsview AS myFacts LEFT OUTER JOIN schema_one.calendar_table AS myCalendarData ON myFacts.invoice_date = myCalendarData.end_dte WHERE myCalendarData.end_dte = '2017-05-31';
                                                       QUERY PLAN
------------------------------------------------------------------------------------------------------------------------
 Gather Motion 3:1  (slice2; segments: 3)  (cost=0.00..334.76 rows=6 width=1585)
   Rows out:  0 rows at destination with 6.980 ms to end, start offset by 22 ms.
   ->  Nested Loop  (cost=0.00..334.76 rows=2 width=1585)
         Rows out:  0 rows (seg0) with 4.179 ms to end, start offset by 31 ms.
         ->  Broadcast Motion 3:3  (slice1; segments: 3)  (cost=0.00..334.59 rows=6 width=39)
               Rows out:  0 rows at destination (seg0) with 4.175 ms to end, start offset by 31 ms.
               ->  Seq Scan on calendar_table mycalendardata  (cost=0.00..334.36 rows=2 width=39)
                     Filter: end_dte = '2017-05-31'::date AND '2017-05-31'::date = end_dte
                     Rows out:  0 rows (seg0) with 0.053 ms to end, start offset by 32 ms.
         ->  Append  (cost=0.00..0.00 rows=1 width=1546)
               Rows out:  (No row requested) 0 rows (seg0) with 0 ms to end.
               ->  Append-only Columnar Scan on myfactstable_1_prt_17 myfactstable  (cost=0.00..0.00 rows=1 width=1546)
                     Filter: '2017-05-31'::date = invoice_date AND invoice_date = '2017-05-31'::date
                     Rows out:  (No row requested) 0 rows (seg0) with 0 ms to end.
 Slice statistics:
   (slice0)    Executor memory: 769K bytes.
   (slice1)    Executor memory: 279K bytes avg x 3 workers, 279K bytes max (seg0).
   (slice2)    Executor memory: 251K bytes avg x 3 workers, 251K bytes max (seg0).
 Statement statistics:
   Memory used: 1048576K bytes
 Settings:  effective_cache_size=512MB; gp_cte_sharing=on; gp_enable_relsize_collection=on; optimizer=off
 Optimizer status: legacy query optimizer
 Total runtime: 29.706 ms
(23 rows)
mydb=#

Additional Information

To see which partitions are actually being scanned, set the following in your session:

set client_min_messages = 'log';
set gp_partitioning_dynamic_selection_log = on;
