Pivotal Knowledge Base

Follow

How To Convert XML Data into Text Format

 

Environment

 Product  Version
 Pivotal Greenplum  All Versions

Purpose

Sometimes, we need to convert the XML data into text format to manage the data using SQL, but the xpath data won't allow us to use the cast function as postgresql to change the format directly. When it comes to this situation, we need to use the text function as a workaround to meet our requirements.

Cause

For example, we can use the xpath to create a new table using the xpath's internal cast function as below, but the type of the function would still be XML:

Note: The '<foo>bar</foo>'::xml is the XML source data, and the //foo/text() will ask xpath to convert the xml raw data to the actual value.

gpadmin=# create TABLE test11 as select (xpath('//foo/text()'::text, '<foo>bar</foo>'::xml))[1];NOTICE: Table doesn't have 'DISTRIBUTED BY' clause -- Using column(s) named '' as the Greenplum Database data distribution key for this table.
HINT: The 'DISTRIBUTED BY' clause determines the distribution of data. Make sure column(s) chosen are the optimal data distribution key to minimize skew.
SELECT 1
gpadmin=# \d+ test11;
Table "public.test11"
Column | Type | Modifiers | Storage | Description
--------+------+-----------+----------+-------------
xpath | xml | | extended |
Has OIDs: no
Distributed randomly 

Solution

We can use the text() function to convert all the xpath output to text format as shown below: 

gpadmin=# create TABLE test222 as select text((xpath('//foo/text()'::text, '<foo>bar</foo>'::xml))[1]);
NOTICE:  Table doesn't have 'DISTRIBUTED BY' clause -- Using column(s) named 'text' as the Greenplum Database data distribution key for this table.
HINT:  The 'DISTRIBUTED BY' clause determines the distribution of data. Make sure column(s) chosen are the optimal data distribution key to minimize skew.
SELECT 1
gpadmin=#
gpadmin=#
gpadmin=# \d+ test222;
               Table "public.test222"
 Column | Type | Modifiers | Storage  | Description
--------+------+-----------+----------+-------------
 text   | text |           | extended |
Has OIDs: no
Distributed by: (text)

gpadmin=# SELECT * from test222;
 text
------
 bar
(1 row)

Comments

Powered by Zendesk