Pivotal Knowledge Base

Follow

How to Aggregate Multiple Columns of a Table into One Column of a Two-Dimensional Array

Environment

Pivotal Greenplum Database (GPDB) all version

Purpose

The embedded function array_agg() in Pivotal Greenplum database can only do aggregation to a one-dimensional array. This article introduces a way to aggregate data of several columns of a table into one column was a type of two-dimensional array.

Procedure

In order to aggregate multiple columns of one table to one column of the two-dimensional array, a new AGGREGATE type need be defined to realize aggregation the at array level.

CREATE AGGREGATE array_agg_mult (anyarray) (
SFUNC = array_cat
,STYPE = anyarray
,INITCOND ='{}');

Then use array_agg_mult(array[[col,...]]) in SQL statement to aggregate data of multiple columns to a 2-dimensional array.

Note- Please test it thoroughly before implementing this solution on the production system.

Example

This example is to demonstrate how to do 2-dimensional array aggregation with user-defined AGGREGATE type. 

Create table t1 with several columns and insert some rows of data into t1. Keep the value of first column "id" identical among rows for aggregation test.

apj_4400=# create table t1(id int, c1 text, c2 text, c3 text) distributed by (id);
CREATE TABLE

apj_4400=# \d t1
Table "public.t1"
Column | Type | Modifiers
--------+---------+-----------
id | integer |
c1 | text |
c2 | text |
c3 | text |
Distributed by: (id)

apj_4400=# select * from t1;
id | c1 | c2 | c3
----+--------+--------+--------
1 | col1-3 | col2-3 | col3-3
1 | col1-2 | col2-2 | col3-2
1 | col1-1 | col2-1 | col3-1
(3 rows)

Create another table t2 as following one. Column "carray" is defined as a type of 2-dimensional array text

apj_4400=# create table t2(id int, cArray text[][]) distributed by (id);
CREATE TABLE
apj_4400=# \d t2
Table "public.t2"
Column | Type | Modifiers
--------+---------+-----------
id | integer |
carray | text[] |
Distributed by: (id)

Define AGGREGATE as mentioned above.

apj_4400=# CREATE AGGREGATE array_agg_mult (anyarray) ( 
SFUNC = array_cat ,
STYPE = anyarray ,
INITCOND ='{}');
CREATE AGGREGATE;

Insert data to table t2 by selecting data from t1 with array_agg_mult([[...]]) to aggregate columns "c1, c2, c3" of t1 to "carray" of t2 with 2-dimensional array.

apj_4400=# insert into t2 select id ,array_agg_mult(array[[c1, c2, c3]]) from t1 group by id;
INSERT 0 1

apj_4400=# select * from t2;
id | carray
----+------------------------------------------------------------------------
1 | {{col1-3,col2-3,col3-3},{col1-2,col2-2,col3-2},{col1-1,col2-1,col3-1}}
(1 row)

apj_4400=# select carray[1][2] from t2;
carray
--------
col2-3
(1 row)

Additional Information

http://stackoverflow.com/questions/11762398/selecting-data-into-a-postgres-array-format/11763245#11763245

http://stackoverflow.com/questions/16771850/postgres-rows-to-2-dimensional-array

Comments

Powered by Zendesk