Pivotal Knowledge Base

Queries in Pivotal HDB fail after enabling NameNode HA

Environment

Product Version
Pivotal Hadoop Database (Pivotal HDB) 2.x

Symptom

After enabling NameNode HA and failing over to the standby NameNode, queries fail in Pivotal HDB with one of the following errors:

gpadmin=# select * from test;
ERROR: cannot fetch block locations
DETAIL: Operation category READ is not supported in state standby
at org.apache.hadoop.hdfs.server.namenode.ha.StandbyState.checkOperation(StandbyState.java:87)
at org.apache.hadoop.hdfs.server.namenode.NameNode$NameNodeHAContext.checkOperation(NameNode.java:1932)
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkOperation(FSNamesystem.java:1313)
at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.getStats(NameNodeRpcServer.java:1128)
at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.getFsStats(ClientNamenodeProtocolServerSideTranslatorPB.java:695)
at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:616)
at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:969)
at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2206)
at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2202)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1709)
at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2200)
gpadmin=# \q

gpadmin=# create table test2(id int);
WARNING: could not remove relation directory 16385/16543/16549: Input/output error
CONTEXT: Dropping file-system object -- Relation Directory: '16385/16543/16549'
ERROR: could not create relation directory hdfs://hdm1.hdp.local:8020/hawq_default/16385/16543/16549: Input/output error
gpadmin=# \q

gpadmin=# select * from test;
ERROR: Append-Only Storage Read could not open segment file 'hdfs://hdm1.hdp.local:8020/hawq_default/16385/16543/16544/1' for relation 'test' (seg0 hdw2.hdp.local:40000 pid=6231)
DETAIL: Hdfs::HdfsRpcException: HdfsFailoverException: Failed to invoke RPC call "getFsStats" on server "hdm1.hdp.local:8020" Caused by: HdfsNetworkConnectException: Connect to "hdm1.hdp.local:8020" failed: (errno: 111) Connection refused
gpadmin=#

Cause

Pivotal HDB is configured incorrectly for NameNode HA: the filespace location points to a single NameNode host rather than to the HDFS nameservice defined for HA. To confirm this, follow these steps:

1. Confirm that there is more than one NameNode by going to Ambari > HDFS > Configs and checking the NameNode components; with HA enabled, an active and a standby NameNode should be listed.

2. In Ambari, under HDFS > Configs > Advanced > Advanced core-site, locate the value of the fs.defaultFS configuration property. With NameNode HA enabled, this value should be a nameservice URI (hdfs://<nameservice>) rather than a single host and port.
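Equivalently, fs.defaultFS can be checked from the command line. On a live cluster, `hdfs getconf -confKey fs.defaultFS` reports the effective value; the sketch below instead parses a sample core-site.xml fragment, where the hdpcluster nameservice name is a hypothetical example and not taken from this environment:

```shell
# Hypothetical core-site.xml fragment for illustration only.
# On a real cluster, read /etc/hadoop/conf/core-site.xml or run:
#   hdfs getconf -confKey fs.defaultFS
cat > /tmp/core-site-sample.xml <<'EOF'
<configuration>
  <property>
    <name>fs.defaultFS</name>
    <value>hdfs://hdpcluster</value>
  </property>
</configuration>
EOF

# Print the value that follows the fs.defaultFS property name
awk '/<name>fs.defaultFS<\/name>/{getline; gsub(/.*<value>|<\/value>.*/,""); print}' \
    /tmp/core-site-sample.xml
```

If the printed value is a hostname:port pair instead of a nameservice, HA is not reflected in the client configuration.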

3. Determine the filespace location stored in the Pivotal HDB catalog; in this case, it is hdfs://hdm1.hdp.local:8020/hawq_default:

SELECT
    fsname, fsedbid, fselocation
FROM
    pg_filespace as sp, pg_filespace_entry as entry, pg_filesystem as fs
WHERE
    sp.fsfsys = fs.oid and fs.fsysname = 'hdfs' and sp.oid = entry.fsefsoid
ORDER BY
    entry.fsedbid;

fsname | fsedbid | fselocation
------------+---------+-----------------------------------------
dfs_system | 0 | hdfs://hdm1.hdp.local:8020/hawq_default
(1 row)

4. As shown above, fselocation points to a single NameNode (hdm1.hdp.local:8020) and differs from fs.defaultFS, which is what causes the failures after failover.

Resolution

Review the Pivotal HDB documentation on configuring Pivotal HDB for NameNode HA. This involves a number of manual steps on the command line as well as some steps in Ambari; in particular, the filespace location in the Pivotal HDB catalog must be updated to point to the HDFS nameservice instead of a single NameNode.
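When NameNode HA is enabled, the HDFS client configuration used by Pivotal HDB must describe the nameservice so that the filespace URL can reference it instead of a single host. The fragment below is an illustrative sketch only: the nameservice name (hdpcluster), the second NameNode host (hdm2.hdp.local), and the ports are hypothetical and must match your cluster's actual HA settings from Ambari:

```xml
<!-- Illustrative HA client settings (hypothetical names; verify against Ambari) -->
<property>
  <name>dfs.nameservices</name>
  <value>hdpcluster</value>
</property>
<property>
  <name>dfs.ha.namenodes.hdpcluster</name>
  <value>nn1,nn2</value>
</property>
<property>
  <name>dfs.namenode.rpc-address.hdpcluster.nn1</name>
  <value>hdm1.hdp.local:8020</value>
</property>
<property>
  <name>dfs.namenode.rpc-address.hdpcluster.nn2</name>
  <value>hdm2.hdp.local:8020</value>
</property>
<property>
  <name>dfs.client.failover.proxy.provider.hdpcluster</name>
  <value>org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider</value>
</property>
```

With settings like these in place, the filespace location in the catalog should reference the nameservice (for example hdfs://hdpcluster/hawq_default) rather than hdfs://hdm1.hdp.local:8020/hawq_default, so the client can fail over transparently between NameNodes.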

