Pivotal Knowledge Base


How to change the location of workfiles or spillfiles in HAWQ

Environment

Product                 Version
HAWQ                    1.3.x
Pivotal Hadoop (PHD)    3.0
Ambari                  1.7.1 / 2.1.2

Purpose

By default, HAWQ writes its workfiles (spillfiles) to /data/hawq/segments/gpseg0/base/1/pgsql_tmp.

After installation, it may be necessary to move pgsql_tmp to a dedicated disk or directory.

Ambari provides the option hawq.temp.directory. However, this option is used only during the installation of HAWQ and has no effect afterwards. This article describes the steps required to move pgsql_tmp after installation.

Important Note: Each segment should write its workfiles to its own subdirectory. Sharing one directory between segments can cause performance issues due to bloated filesystems and metadata overhead.

Procedure

This is the procedure to move pgsql_tmp to a different location. 

All commands should be run as user "gpadmin" on every master and segment host.

In this example, the new directory for workfiles or spillfiles will be /hawq/tmp/segment0/. To help avoid performance issues, each master and segment should have its own subdirectory, for example /hawq/tmp/segment1/ and /hawq/tmp/master/.

     1. Create the parent directory for pgsql_tmp (there is no need to create pgsql_tmp):

mkdir /hawq/tmp/segment0/

     2. The parent directory should be owned by gpadmin and have full permissions for the gpadmin user only:

chown gpadmin:gpadmin /hawq/tmp/segment0
chmod 700 /hawq/tmp/segment0/

     3. Obtain a list of all the segment and master directories on a host and locate the segment that will write its workfiles to /hawq/tmp/segment0/:

[gpadmin@dn1 ~]$ ps -eaf | grep silent | grep -v grep | awk '{print $10}'
/data/hawq/segments/gpseg0
/data/hawq/master/gpseg-1
[gpadmin@dn1 ~]$

     4. In this case, settings will be edited for the segment whose data directory is /data/hawq/segments/gpseg0.

Create a file /data/hawq/segments/gpseg0/gp_temporary_files_directories; the file should contain just one line, the path to the new location of the pgsql_tmp directory:

[root@dn1 gpseg0]# cat /data/hawq/segments/gpseg0/gp_temporary_files_directories
/hawq/tmp/segment0/
[root@dn1 gpseg0]#

     5. Repeat steps 1 through 4 for every master and segment on every host, making sure that each one uses a different directory.

     6. Restart HAWQ via Ambari for the change to take effect.
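Steps 1, 2 and 4 can be combined into a small shell function that is called once per master or segment data directory. This is only a minimal sketch, not a HAWQ utility: the function name configure_workfile_dir is hypothetical, and it assumes it is run as gpadmin so that a newly created directory is already owned by gpadmin. If the directory already exists under another owner, the chown from step 2 is still needed.

```shell
#!/bin/sh
# Sketch only: configure_workfile_dir is a hypothetical helper, not part of HAWQ.
# Assumption: run as gpadmin, so a newly created directory is owned by gpadmin.
configure_workfile_dir() {
    data_dir=$1   # segment or master data directory, e.g. /data/hawq/segments/gpseg0
    temp_dir=$2   # new parent directory for pgsql_tmp, e.g. /hawq/tmp/segment0/

    # Refuse to continue if the data directory does not exist.
    if [ ! -d "$data_dir" ]; then
        echo "data directory not found: $data_dir" >&2
        return 1
    fi

    # Steps 1 and 2: create the parent directory and restrict it to its owner.
    mkdir -p "$temp_dir" || return 1
    chmod 700 "$temp_dir" || return 1

    # Step 4: the file contains exactly one line, the new location.
    printf '%s\n' "$temp_dir" > "$data_dir/gp_temporary_files_directories"
}
```

With the directories from step 3, the calls would be configure_workfile_dir /data/hawq/segments/gpseg0 /hawq/tmp/segment0/ and configure_workfile_dir /data/hawq/master/gpseg-1 /hawq/tmp/master/. HAWQ still needs to be restarted via Ambari afterwards, as in step 6.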

 
