Pivotal Knowledge Base


One-time HDFS Protocol Installation for GPHDFS Access to an HDP 2.x Cluster


  • Pivotal Greenplum 4.3.x
  • Operating System: Red Hat Enterprise Linux 6.x
  • Hadoop (HDP) 2.x


This article describes the standard settings for GPHDFS access to an HDP 2.x Hadoop cluster.


An incorrect gp_hadoop_home setting causes an error when executing a query against a GPHDFS external table, for example:

gpadmin=# select count(*) from tmp_parq;
ERROR:  external table gphdfs protocol command ended with error. /usr/local/greenplum-db/./lib//hadoop/hadoop_env.sh: line 125: /bin/java: No such file or directory  (seg0 slice1 admin.hadoop.local:50000 pid=2818)
Exception in thread "main" java.lang.NoClassDefFoundError: org/apache/hadoop/mapreduce/TaskAttemptContext
    at java.lang.Class.getDeclaredMethods0(Native Method)
    at java.lang.Class.privateGetDeclaredMethods(Class.java:2570)
    at java.lang.Class.getMethod0(Class.java:2813)
    at sun.launcher.LauncherHelper.getMainMethod(LauncherHelper.java:494)
    at sun.launch
Command: 'gphdfs://hdm1/tmp/sample/*.parquet' External table tmp_parq
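
To see what the cluster is currently using, the gpconfig utility can report the master and segment values of both parameters (run as gpadmin from the master host; this assumes the Greenplum environment has already been sourced):

$ gpconfig -s gp_hadoop_home
$ gpconfig -s gp_hadoop_target_version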


This procedure assumes that the HDP 2.x packages are already installed on all hosts using the standard HDP installation.

1. On each segment node, add the following two entries to /home/gpadmin/.bashrc:

export JAVA_HOME=/usr/lib/jvm/java-1.7.0-openjdk-
export HADOOP_HOME=/usr/hdp/current/hadoop-client/client
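
After adding the entries, source the file on a segment host and confirm that both paths resolve; an empty JAVA_HOME typically produces the "/bin/java: No such file or directory" message shown above (adjust the paths below if your JDK or HDP layout differs):

$ source /home/gpadmin/.bashrc
$ $JAVA_HOME/bin/java -version     # should print the OpenJDK version
$ ls $HADOOP_HOME                  # should list the HDP client libraries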

2. From the Greenplum (GPDB) master node:

$ gpconfig -c gp_hadoop_home -v "'/usr/hdp/current/hadoop-client/client'"
$ gpconfig -c gp_hadoop_target_version -v "'hdp2'"
$ gpstop -u
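
After the reload, the failing query shown in the error above should succeed. A quick end-to-end check against the same external table (the database name gpadmin is taken from the prompt in the error message; substitute your own):

$ psql -d gpadmin -c "SELECT count(*) FROM tmp_parq;"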

Additional Information 

For general information on HDFS protocol installation, please review the Greenplum Database (GPDB) documentation.

For installation with a Pivotal HD (PHD) 3.x cluster, the following settings can be used:

$ gpconfig -c gp_hadoop_home -v "'/usr/phd/current/hadoop-client/client'"
export HADOOP_HOME=/usr/phd/current/hadoop-client/client
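
As with the HDP 2.x settings above, reload the server configuration after changing the parameter:

$ gpstop -u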


