Pivotal Knowledge Base

Follow

Resolve ORDER of NULL values with FIRST_VALUE window function

Environment

Product Version
Pivotal Greenplum (GPDB) All versions

Symptom

Window functions provide the ability to perform calculations across sets of rows that are related to the current query row. If you want to use FIRST_VALUE window function ORDER BY some column with NULLs, then NULLs are always LAST when we expect the NULL as FIRST.

Resolution

In Greenplum (GPDB), when FIRST_VALUE window function is used, then it will not take NULL as FIRST value.

For example;

The source data is like below

[source data]
Table Name: Employees
dept_id salary
-------------
10 1000
10 1000
10 2000
10 3000
20 5000
20 6000
20 NULL


a1=# SELECT dept_id,
salary,
FIRST_VALUE(salary) OVER(ORDER BY salary) first_sal
FROM Employees;

DEPT_ID SALARY FIRST_SAL
20 NULL NULL 
10 1000 NULL
10 1000 NULL
10 2000 NULL
10 3000 NULL
20 5000 NULL
20 6000 NULL

a1=# SELECT dept_id,
salary,
FIRST_VALUE(salary) OVER(PARTITION BY dept_id 
ORDER BY salary) group_first
FROM Employees;
DEPT_ID SALARY GROUP_FIRST
10 1000 1000
10 1000 1000
10 2000 1000
10 3000 1000
20 NULL NULL
20 5000 NULL
20 6000 NULL

In Greenplum, we will have to modify the query as below:

a1=# SELECT dept_id,
salary,
FIRST_VALUE(salary) OVER(ORDER BY (salary is not null),salary) first_sal
FROM Employees;
dept_id | salary | first_sal 
---------+--------+-----------
20 | | 
10 | 1000 | 
10 | 1000 | 
10 | 2000 | 
10 | 3000 | 
20 | 5000 | 
20 | 6000 | 
(7 rows)

a1=# SELECT dept_id,
salary , 
FIRST_VALUE(salary) OVER(PARTITION BY dept_id 
ORDER BY (salary is not null),salary) group_first
FROM Employees;
dept_id | salary | group_first 
---------+--------+-------------
10 | 1000 | 1000
10 | 1000 | 1000
10 | 2000 | 1000
10 | 3000 | 1000
20 | 	  | 
20 | 5000 | 
20 | 6000 | 
(7 rows)

If you are concerned about displaying NULL in the output, modify the query using COALESCE function. Below is an example to get the same output:

a1=# SELECT dept_id,
coalesce(salary::text, 'NULL'),
coalesce((FIRST_VALUE(salary) OVER(ORDER BY (salary is not null),salary))::text, 'NULL') as first_sal
FROM Employees;
dept_id | coalesce | first_sal 
---------+----------+-----------
20 | NULL | NULL
10 | 1000 | NULL
10 | 1000 | NULL
10 | 2000 | NULL
10 | 3000 | NULL
20 | 5000 | NULL
20 | 6000 | NULL
(7 rows)

a1=# SELECT dept_id,
coalesce(salary::text,'NULL') , 
coalesce((FIRST_VALUE(salary) OVER(PARTITION BY dept_id 
ORDER BY (salary is not null),salary))::text, 'NULL') group_first
FROM Employees;
dept_id | coalesce | group_first 
---------+----------+-------------
20 | NULL | NULL
20 | 5000 | NULL
20 | 6000 | NULL
10 | 1000 | 1000
10 | 1000 | 1000
10 | 2000 | 1000
10 | 3000 | 1000
(7 rows)

Comments

Powered by Zendesk