- Pivotal Greenplum (GPDB) 4.3.5.x and onward
- OS RHEL 6.x
The purpose of this document is to give you a quick insight into how the Pivotal Query Optimizer (PQO) works. We also discuss its strengths and the cases where PQO should not be used.
Pivotal Query Optimizer (PQO) Further Explained
Pivotal’s Query Optimizer (PQO) is designed to find the optimal way to execute user queries in distributed environments such as Pivotal’s Greenplum Database and HAWQ. The open source version of PQO is called GPORCA. To generate the fastest plan, GPORCA considers thousands of alternative query execution plans and makes a cost-based decision.
The legacy planner is a derivative of the original PostgreSQL planner that was initially adapted to the Greenplum code base. The PostgreSQL planner was built for single-node PostgreSQL and optimized for OLTP (Online Transaction Processing) queries, whereas an MPP engine must handle long-running OLAP (Online Analytical Processing) queries; the PostgreSQL planner was simply not designed with an MPP database in mind. While features like join ordering were carefully thought out, its architecture and design choices make maintenance and the addition of new features increasingly difficult.
What makes GPORCA particularly useful is its ability to generate efficient plans for some of the complex situations that commonly arise in analytic data warehouses (DWHs). These include:
- Smarter partition elimination
- Subquery unnesting
- Common table expressions (CTE)
- Multi-level partitioning
- Improved join ordering
- Join aggregate re-ordering
- Sort order optimization
- Skew awareness
Previously, the Legacy Query Optimizer was the default, but as of Greenplum 5.0, GPORCA is the default query optimizer. This default can be changed at the instance, database, user, or session level by setting the GUC "optimizer = on." When enabling GPORCA, users or DBAs should ensure that statistics have been collected on the root partition of each partitioned table. Unlike the Legacy Planner, GPORCA uses the statistics at the root partition rather than the statistics of the individual leaf partitions.
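Root-partition statistics can be collected with ANALYZE ROOTPARTITION. A minimal sketch, assuming a partitioned table named sales (the table name is illustrative only):

```sql
-- Sample the table and store statistics against the root partition,
-- which is what GPORCA reads when planning queries on partitioned tables.
ANALYZE ROOTPARTITION sales;
```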
The examples below show how to change the default at the Greenplum instance, database, user, and session levels:

Instance level (followed by a configuration reload):

gpconfig -c optimizer -v on
gpstop -u

Database level:

ALTER DATABASE corrupt SET optimizer = on;

User level:

ALTER USER test SET optimizer = on;

Session level:

SET optimizer = on;
Let’s look at an example:
CREATE TABLE part (
    p_partkey     integer NOT NULL,
    p_name        character varying(55) NOT NULL,
    p_mfgr        character(25) NOT NULL,
    p_brand       character(10) NOT NULL,
    p_type        character varying(25) NOT NULL,
    p_size        integer NOT NULL,
    p_container   character(10) NOT NULL,
    p_retailprice numeric(15,2) NOT NULL,
    p_comment     character varying(23) NOT NULL
) DISTRIBUTED BY (p_partkey);
Consider the following correlated query that fetches all parts with sizes greater than 40 or retail price greater than the average price of all parts that have the same brand:
explain select *
from part p1
where p_size > 40
   or p_retailprice > (select avg(p_retailprice)
                       from part p2
                       where p2.p_brand = p1.p_brand);
As shown in the following explain plan produced by GPORCA, the optimizer status denotes the version of GPORCA (a.k.a PQO) used to generate the plan:
                                                       QUERY PLAN
-------------------------------------------------------------------------------------------------------------------------
 Gather Motion 2:1  (slice3; segments: 2)  (cost=0.00..862.00 rows=1 width=64)
   ->  Result  (cost=0.00..862.00 rows=1 width=64)
         Filter: public.part.p_size > 40 OR public.part.p_retailprice > avg
         ->  Hash Left Join  (cost=0.00..862.00 rows=1 width=72)
               Hash Cond: public.part.p_brand = public.part.p_brand
               ->  Redistribute Motion 2:2  (slice1; segments: 2)  (cost=0.00..431.00 rows=1 width=64)
                     Hash Key: public.part.p_brand
                     ->  Table Scan on part  (cost=0.00..431.00 rows=1 width=64)
               ->  Hash  (cost=431.00..431.00 rows=1 width=16)
                     ->  GroupAggregate  (cost=0.00..431.00 rows=1 width=16)
                           Group By: public.part.p_brand
                           ->  Sort  (cost=0.00..431.00 rows=1 width=16)
                                 Sort Key: public.part.p_brand
                                 ->  Redistribute Motion 2:2  (slice2; segments: 2)  (cost=0.00..431.00 rows=1 width=16)
                                       Hash Key: public.part.p_brand
                                       ->  Result  (cost=0.00..431.00 rows=1 width=16)
                                             ->  GroupAggregate  (cost=0.00..431.00 rows=1 width=16)
                                                   Group By: public.part.p_brand
                                                   ->  Sort  (cost=0.00..431.00 rows=1 width=16)
                                                         Sort Key: public.part.p_brand
                                                         ->  Table Scan on part  (cost=0.00..431.00 rows=1 width=16)
 Settings:  optimizer=on
 Optimizer status: PQO version 2.1
(23 rows)
In comparison, the snippet below shows a Legacy Query Optimizer plan that employs a correlated execution strategy:
                                                 QUERY PLAN
------------------------------------------------------------------------------------------------------------------
 Gather Motion 2:1  (slice2; segments: 2)  (cost=0.00..1153607.20 rows=3556 width=476)
   ->  Seq Scan on part p1  (cost=0.00..1153607.20 rows=1778 width=476)
         Filter: p_size > 40 OR p_retailprice > ((subplan))
         SubPlan 1
           ->  Aggregate  (cost=180.21..180.22 rows=1 width=32)
                 ->  Result  (cost=180.01..180.07 rows=4 width=16)
                       Filter: p2.p_brand = $0
                       ->  Materialize  (cost=180.01..180.07 rows=4 width=16)
                             ->  Broadcast Motion 2:2  (slice1; segments: 2)  (cost=0.00..180.00 rows=4 width=16)
                                   ->  Seq Scan on part p2  (cost=0.00..180.00 rows=4 width=16)
 Optimizer status: legacy query optimizer
(11 rows)
Note: The cost models used by the two optimizers are different. For instance, the top node of the GPORCA plan has a cost of 862.00, while that of the Legacy Query Optimizer plan is 1153607.20. These numbers are meaningful within a particular optimizer, but they are not comparable between the two different optimizers.
GPORCA excels on partitioned tables. By comparison, the Legacy Query Optimizer can only eliminate partitions statically. For example, if a table is partitioned by date, a WHERE clause that limits the date range eliminates any partitions in which that range cannot occur. However, the Legacy Query Optimizer cannot handle dynamic conditions, where the WHERE clause contains a subquery that determines the range. Furthermore, many large fact tables in a DWH have a very large number of partitions, and the Legacy Planner can encounter Out Of Memory (OOM) errors in cases where GPORCA does not.
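A minimal sketch of the difference, assuming a hypothetical fact table sales partitioned by sale_date and a hypothetical dim_dates lookup table (neither is defined in this article):

```sql
-- Static elimination: the range is a literal, so unneeded partitions
-- can be pruned at plan time by either optimizer.
SELECT count(*)
FROM sales
WHERE sale_date BETWEEN '2017-01-01' AND '2017-03-31';

-- Dynamic elimination: the range comes from a subquery and is only
-- known at execution time; GPORCA can still prune partitions here,
-- while the Legacy Query Optimizer scans all of them.
SELECT count(*)
FROM sales
WHERE sale_date > (SELECT max(start_date) FROM dim_dates WHERE is_current);
```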
Consider the example above that fetches parts with size > 40 or retail price greater than the average price of all parts of the same brand. In the plan generated by the Legacy Query Optimizer, for each tuple in the outer part table p1, the plan executes a subplan that computes the average price of all parts having the same brand as that tuple. This intermediate value is then used to determine whether the tuple from p1 belongs in the query result. Because the subplan is re-executed for every tuple in p1, the plan is considered a correlated execution. Such a correlated plan is suboptimal because it does extraneous work that could be avoided: in the worst case, if all parts belong to the same brand, the same average price is recomputed once per row instead of just once.
In contrast, GPORCA generates a de-correlated plan: it first computes the average price for each brand, only once, and then joins the intermediate results with the part table to produce the list of parts that meet the user's criteria.
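The de-correlation GPORCA performs internally is equivalent to rewriting the query by hand along these lines (a sketch for illustration only; GPORCA produces this plan shape automatically from the original query):

```sql
-- Compute each brand's average retail price exactly once, then join
-- the result back to part instead of running a correlated subplan
-- per row. This mirrors the Hash Left Join over a GroupAggregate
-- seen in the GPORCA plan above.
SELECT p1.*
FROM part p1
LEFT JOIN (SELECT p_brand, avg(p_retailprice) AS avg_price
           FROM part
           GROUP BY p_brand) b
       ON p1.p_brand = b.p_brand
WHERE p1.p_size > 40
   OR p1.p_retailprice > b.avg_price;
```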
As more capabilities are added to GPORCA over time, cases where the Legacy Query Optimizer produces a better plan should become increasingly rare.
For now, use the Legacy Planner when performing single-row inserts or simple, short-running queries (those that take only a couple of seconds to run).
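For such short queries, the optimizer can be switched off for just the current session using the same GUC shown earlier (the inserted values below are illustrative only):

```sql
-- Fall back to the Legacy Planner for this session; for a single-row
-- insert, planning overhead matters more than plan quality.
SET optimizer = off;
INSERT INTO part VALUES (1, 'green copper', 'Mfgr#1', 'Brand#13',
                         'PROMO BURNISHED COPPER', 7, 'JUMBO PKG',
                         901.00, 'final deposits');
SET optimizer = on;
```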
When does GPORCA fall back to the Legacy Planner?
- PERCENTILE window function
- External parameters
- SortMergeJoin (SMJ)
- CUBE operator
- Multiple grouping sets
- Utility and DDL commands (e.g. Alter Table)
- A small subset of scalar functions, such as row comparison
- Catalog queries
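You can tell whether a fallback occurred by checking the Optimizer status line in the EXPLAIN output, as shown in the plans above:

```sql
-- With optimizer = on, a footer of "Optimizer status: PQO version ..."
-- means GPORCA planned the query; "Optimizer status: legacy query
-- optimizer" means it fell back for one of the reasons listed above.
SET optimizer = on;
EXPLAIN SELECT * FROM part WHERE p_size > 40;
```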