Pivotal Knowledge Base

Follow

How to check for spill files in HAWQ 2.x

Environment

Pivotal HDB: 2.x

OS: RHEL 6.x

Question

How can I check for spill files in HAWQ 2.x? 

Resolution

There is no easy method to check for spill files in HAWQ 2.x. There are discussions about getting this into Ambari, but until then you could choose to create a view that could check for this.

Follow the steps below to create the view that can check for spill files.

  • Psql into your Hawq database of choice
[gpadmin@hdp3 ~]$ psql gpadmin
psql (8.2.15)
Type "help" for help.

gpadmin=#
  • Create the view using the below command and make sure that you replace the <TEMP_PATHS_HERE_REPLACE_ME> with your Hawq temp directories. You can find this information by the following Hawq command "hawq config -s hawq_segment_temp_directory":

  • Create view command

DROP VIEW IF EXISTS spill_files;
DROP EXTERNAL TABLE IF EXISTS spills ;

CREATE EXTERNAL WEB TABLE spills (size text, path text) 
  EXECUTE 'du -sh <TEMP_PATHS_HERE_REPLACE_ME>'
  ON <number of segment hosts>
  FORMAT 'TEXT'
    (DELIMITER E'\t') ;

CREATE VIEW spill_files
AS select gp_segment_id AS seg_id,* from spills order by 1;
  • How to find your Hawq temp directories
[gpadmin@hdp3 ~]$ hawq config -s hawq_segment_temp_directory
GUC     : hawq_segment_temp_directory
Value       : /data/hawq/tmp/segment,/data/hawq/temp/segment
  • The modified version of the Create view command
DROP VIEW IF EXISTS spill_files;
DROP EXTERNAL TABLE IF EXISTS spills ;

CREATE EXTERNAL WEB TABLE spills (size text, path text) 
  EXECUTE 'du -sh /data/hawq/tmp/segment /data/hawq/temp/segment'
  ON 3
  FORMAT 'TEXT'
    (DELIMITER E'\t') ;

CREATE VIEW spill_files
AS select gp_segment_id AS seg_id,* from spills order by 1;
  • Create view command in action
gpadmin=# DROP EXTERNAL TABLE IF EXISTS spills ;
DROP EXTERNAL TABLE
gpadmin=#
gpadmin=# CREATE EXTERNAL WEB TABLE spills (size text, path text)
gpadmin-#   EXECUTE 'du -sh /data/hawq/tmp/segment /data/hawq/temp/segment'
gpadmin-# ON 3 gpadmin-# FORMAT 'TEXT'
gpadmin-# (DELIMITER E'\t') ;
CREATE EXTERNAL TABLE
gpadmin=#
gpadmin=# CREATE VIEW spill_files
gpadmin-# AS select gp_segment_id AS seg_id,* from spills order by 1;
CREATE VIEW
  • Check that you created the view
gpadmin=# \dv spill_files
                List of relations
 Schema |    Name     | Type |  Owner  | Storage
--------+-------------+------+---------+---------
 public | spill_files | view | gpadmin | none
(1 row)
  • Now, let's check for spill files
gpadmin=# select * from public.spill_files;
seg_id | size | path
--------+------+-------------------------
0 | 4.0K | /data/hawq/temp/segment
0 | 4.0K | /data/hawq/tmp/segment
1 | 4.0K | /data/hawq/temp/segment
1 | 4.0K | /data/hawq/tmp/segment
2 | 4.0K | /data/hawq/temp/segment
2 | 4.0K | /data/hawq/tmp/segment
(6 rows)

NOTE

"ON ALL/HOST" option is deprecated when creating a readable external table, as HAWQ cannot guarantee to schedule executors on a specific host. Instead, use ON MASTER, ON number, or SEGMENT virtual_segment to specify which segment instances will execute the command.

Please refer to the below article link for more details regarding the deprecation of the "ON ALL/HOST" option for HAWQ external tables:

https://hdb.docs.pivotal.io/200/hawq/reference/sql/CREATE-EXTERNAL-TABLE.html

Comments

Powered by Zendesk