Pivotal Knowledge Base

Follow

How to aggregate multiple columns of a table into one column of 2-dimensional array

Goal

The embedded function array_agg() in Pivotal Greenplum database can only do aggregation to 1-dimensional array. This article introduces a way to aggregate data of several columns of a table into one column with type of 2-dimensional array. 

Solution

In order to aggregate multiple columns of one table to one column of 2-dimensional array, a new AGGREGATE type need be defined to realize aggregation at array level.

CREATE AGGREGATE array_agg_mult (anyarray) (
SFUNC = array_cat
,STYPE = anyarray
,INITCOND ='{}');

Then use array_agg_mult(array[[col,...]]) in SQL statement to aggregate data of multiple columns to a 2-dimensional array.

NOTE: please test it thoroughly before implement this solution on production system.

Example

This example is to demonstrate how to do 2-dimensional array aggregation with user-defined AGGREGATE type. 

Create table t1 with several columns and insert some rows of data into t1. Keep value of first column "id" identical among rows for aggregation test.

apj_4400=# create table t1(id int, c1 text, c2 text, c3 text) distributed by (id);
CREATE TABLE

apj_4400=# \d t1
Table "public.t1"
Column | Type | Modifiers
--------+---------+-----------
id | integer |
c1 | text |
c2 | text |
c3 | text |
Distributed by: (id)

apj_4400=# select * from t1;
id | c1 | c2 | c3
----+--------+--------+--------
1 | col1-3 | col2-3 | col3-3
1 | col1-2 | col2-2 | col3-2
1 | col1-1 | col2-1 | col3-1
(3 rows)

Create another table t2 as following one. Column "carray" is defined as type of 2-dimensional array text

apj_4400=# create table t2(id int, cArray text[][]) distributed by (id);
CREATE TABLE
apj_4400=# \d t2
Table "public.t2"
Column | Type | Modifiers
--------+---------+-----------
id | integer |
carray | text[] |
Distributed by: (id)

Define AGGREGATE as mentioned above.

apj_4400=# CREATE AGGREGATE array_agg_mult (anyarray) ( 
SFUNC = array_cat ,
STYPE = anyarray ,
INITCOND ='{}');
CREATE AGGREGATE;

Insert data to table t2 by selecting data from t1 with array_agg_mult([[...]]) to aggregate columns "c1, c2, c3" of t1 to "carray" of t2 with 2-dimensional array.

apj_4400=# insert into t2 select id ,array_agg_mult(array[[c1, c2, c3]]) from t1 group by id;
INSERT 0 1

apj_4400=# select * from t2;
id | carray
----+------------------------------------------------------------------------
1 | {{col1-3,col2-3,col3-3},{col1-2,col2-2,col3-2},{col1-1,col2-1,col3-1}}
(1 row)

apj_4400=# select carray[1][2] from t2;
carray
--------
col2-3
(1 row)

Reference

http://stackoverflow.com/questions/11762398/selecting-data-into-a-postgres-array-format/11763245#11763245

http://stackoverflow.com/questions/16771850/postgres-rows-to-2-dimensional-array

Comments

Powered by Zendesk