Pivotal Knowledge Base

UnknownHostException seen when using gphdfs and the HDFS NameNode is configured for HA

Environment

Product        Version
Pivotal HDP    2.4
Pivotal HDB    1.x / 2.x

Symptom

When accessing an external table in Pivotal HDB via the gphdfs protocol, the query fails with an UnknownHostException error.

This happens when the HDFS NameNode is configured for High Availability (HA).

Error Message:

gpadmin=# select * from testhdfs10;
ERROR:  external table gphdfs protocol command ended with error. log4j:WARN No appenders could be found for logger (org.apache.hadoop.util.Shell).  (seg0 slice1 gpdb-sandbox.localdomain:40000 pid=4322)
DETAIL:

log4j:WARN Please initialize the log4j system properly.
log4j:WARN See http://logging.apache.org/log4j/1.2/faq.html#noconfig for more info.
Exception in thread "main" java.lang.IllegalArgumentException: java.net.UnknownHostException: ns
    at org.apache.hadoop.security.SecurityUtil.buildTokenService(SecurityUtil.java:411)
    at org.apache.hadoop.hdfs.NameNodeProxies.createNonHAProxy(NameNodeProxies.java:311)
    at org.apache
Command: 'gphdfs://ns/tmp/testhdfs.txt'
External table testhdfs10, file gphdfs://ns/tmp/testhdfs.txt
gpadmin=#

Cause

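When the HAWQ GUC gp_hadoop_home is set incorrectly, the jar necessary for HA operations (hadoop-hdfs.jar) cannot be found. As a result, the HDFS client takes the non-HA code path (createNonHAProxy in the stack trace) and tries to resolve the name in the URI, ns in this example, as a hostname.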
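
To confirm that the name in the error is an HDFS nameservice rather than a real host, it can be compared against the dfs.nameservices setting. The following is a minimal sketch, not part of the original procedure: is_nameservice is a hypothetical helper, and the usage comment assumes the hdfs client command is available on the node.

```shell
# is_nameservice NAME LIST  (hypothetical helper, for illustration only)
# Succeeds when NAME appears in LIST, a comma-separated string such as
# the value of dfs.nameservices.
is_nameservice() {
  case ",$2," in
    *",$1,"*) return 0 ;;
    *)        return 1 ;;
  esac
}

# Example usage on a node with the HDFS client installed (assumption):
#   is_nameservice ns "$(hdfs getconf -confKey dfs.nameservices)" \
#     && echo "ns is a logical nameservice, not a hostname"
```

If the name is a nameservice, the UnknownHostException points at the client configuration (here, gp_hadoop_home) rather than at DNS.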

Resolution

When following the one-time HDFS protocol installation procedure, make sure to use the correct value for gp_hadoop_home.

The value of gp_hadoop_home should be /usr/hdp/<VERSION NUMBER> and NOT /usr/hdp/current/. For example, set the following:

1. In /home/gpadmin/.bashrc or /home/gpadmin/.bash_profile: 

 export HADOOP_HOME=/usr/hdp/2.4.2.0-258

2. From the HAWQ (HDB) master node, as user gpadmin:

$ gpconfig -c gp_hadoop_home -v "'/usr/hdp/2.4.2.0-258'"
$ gpconfig -c gp_hadoop_target_version -v "'hdp2'"
$ gpstop -u
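
Before and after applying the settings above, the path can be sanity-checked. This is a hedged sketch only: check_gp_hadoop_home is a hypothetical helper, not part of HDB, and it encodes just the rule stated in this article (a versioned directory under /usr/hdp, not /usr/hdp/current).

```shell
# check_gp_hadoop_home PATH  (hypothetical helper, for illustration only)
# Prints a verdict and returns 0 only for a versioned /usr/hdp directory.
check_gp_hadoop_home() {
  case "$1" in
    /usr/hdp/current|/usr/hdp/current/*)
      echo "invalid: use /usr/hdp/<VERSION NUMBER>, not /usr/hdp/current"
      return 1 ;;
    /usr/hdp/[0-9]*)
      echo "ok"
      return 0 ;;
    *)
      echo "unexpected: not under /usr/hdp"
      return 2 ;;
  esac
}

# Example usage as gpadmin on the master (assumes HADOOP_HOME is exported):
#   check_gp_hadoop_home "$HADOOP_HOME"
#   gpconfig -s gp_hadoop_home             # show the value currently set
#   gpconfig -s gp_hadoop_target_version
```

After gpstop -u reloads the configuration, re-running the failing query against the external table is the real test of the fix.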

 
