PHD - HBase
The HBase shell list command throws the exception "ZK is null on connection event". The command still completed, but only after a long delay, and the exception below was printed along with the command output.
hbase(main):001:0> list
TABLE
14/09/30 12:22:07 ERROR zookeeper.ZooKeeperWatcher: ZK is null on connection event -- see stack trace for the stack trace when constructor was called on this zkw
java.lang.Exception: ZKW CONSTRUCTOR STACK TRACE FOR DEBUGGING
        at org.apache.hadoop.hbase.zookeeper.ZooKeeperWatcher.<init>(ZooKeeperWatcher.java:143)
        at org.apache.hadoop.hbase.zookeeper.ZooKeeperWatcher.<init>(ZooKeeperWatcher.java:127)
14/09/30 12:22:07 ERROR zookeeper.ClientCnxn: Error while calling watcher
java.lang.NullPointerException: ZK is null
        at org.apache.hadoop.hbase.zookeeper.ZooKeeperWatcher.connectionEvent(ZooKeeperWatcher.java:366)
        at org.apache.hadoop.hbase.zookeeper.ZooKeeperWatcher.process(ZooKeeperWatcher.java:303)
        at org.apache.zookeeper.ClientCnxn$EventThread.processEvent(ClientCnxn.java:519)
        at org.apache.zookeeper.ClientCnxn$EventThread.run(ClientCnxn.java:495)
This exception can occur when resolving the ZooKeeper quorum nodes, using the addresses listed in hbase-site.xml, takes a long time. If the addresses are reachable but resolution is slow, the DNS configuration is a likely cause.
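To check whether name resolution is the slow step, one option is to time the lookup of each quorum host. The following is a minimal sketch, assuming the quorum is set through the hbase.zookeeper.quorum property in hbase-site.xml. The helper names quorum_hosts and time_lookup are hypothetical and not part of HBase.

```python
import socket
import time
import xml.etree.ElementTree as ET


def quorum_hosts(hbase_site_xml):
    """Return the hosts listed in hbase.zookeeper.quorum, with any port stripped."""
    root = ET.fromstring(hbase_site_xml)
    for prop in root.iter("property"):
        if prop.findtext("name", "").strip() == "hbase.zookeeper.quorum":
            value = prop.findtext("value", "")
            return [h.strip().split(":")[0] for h in value.split(",") if h.strip()]
    return []


def time_lookup(host, resolver=socket.getaddrinfo):
    """Return (seconds_taken, error_or_None) for a single name lookup.

    The resolver argument exists only so the lookup can be replaced in tests;
    by default the system resolver (and therefore /etc/resolv.conf) is used.
    """
    start = time.monotonic()
    error = None
    try:
        resolver(host, None)
    except OSError as exc:  # socket.gaierror is a subclass of OSError
        error = exc
    return time.monotonic() - start, error
```

On an affected node, passing the contents of hbase-site.xml to quorum_hosts and calling time_lookup on each returned host should show lookups that take several seconds instead of a few milliseconds.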
In this case, /etc/resolv.conf had three nameserver entries. Nameservers are tried in the order listed. The first one was not responding, and the resolver took too long to time out and move on to the second. This delay caused the exception above.
nameserver 172.20.0.1
nameserver 188.8.131.52
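Because the order of the entries matters, it can help to print the nameservers in the order the resolver will try them. This is a minimal sketch, assuming standard resolv.conf syntax where a line with # or ; in the first column is a comment. The function name list_nameservers is hypothetical, and the sample text reuses the entries shown above.

```python
def list_nameservers(resolv_conf_text):
    """Return nameserver addresses in the order they appear in the file."""
    servers = []
    for line in resolv_conf_text.splitlines():
        # Lines with '#' or ';' in the first column are comments.
        if line.startswith(("#", ";")):
            continue
        fields = line.split()
        if len(fields) >= 2 and fields[0] == "nameserver":
            servers.append(fields[1])
    return servers


sample = "nameserver 172.20.0.1\nnameserver 188.8.131.52\n"
print(list_nameservers(sample))  # ['172.20.0.1', '188.8.131.52']
```

On a real node, the text would be read from /etc/resolv.conf instead of the sample string.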
As a quick workaround, move the unresponsive nameserver to the end of the list, or remove its entry if it is not needed to resolve the quorum nodes.
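The reordering can be sketched as a small text transformation. This is an illustration only: the function name move_nameserver_last is hypothetical, and the sample treats 172.20.0.1 as the unresponsive server purely for the example. It returns the new file content and does not write anything, so the result can be reviewed before /etc/resolv.conf is changed.

```python
def move_nameserver_last(resolv_conf_text, bad_ip):
    """Return resolv.conf text with the 'nameserver bad_ip' line moved to the end."""
    kept, moved = [], []
    for line in resolv_conf_text.splitlines():
        fields = line.split()
        is_bad = (
            len(fields) >= 2
            and fields[0] == "nameserver"
            and fields[1] == bad_ip
        )
        (moved if is_bad else kept).append(line)
    # All other lines keep their relative order; only the matching entry moves.
    return "\n".join(kept + moved) + "\n"


sample = "nameserver 172.20.0.1\nnameserver 188.8.131.52\n"
print(move_nameserver_last(sample, "172.20.0.1"), end="")
# nameserver 188.8.131.52
# nameserver 172.20.0.1
```

To drop the entry instead of moving it, the same loop can be used with the moved lines discarded.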
If that nameserver is required to resolve the quorum nodes, you must identify the cause of the slow resolution and fix it.
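One way to narrow down which nameserver is slow is to query each one directly and time the answer. The following is a rough sketch using only the Python standard library. It sends a single A-record query over UDP to port 53 and waits for any reply. The function names build_a_query and time_nameserver are hypothetical, and the sketch assumes an IPv4 nameserver address and does not validate the reply.

```python
import socket
import struct
import time


def build_a_query(hostname, query_id=0x1234):
    """Build a minimal DNS query packet asking for the A record of hostname."""
    # Header: id, flags (recursion desired), 1 question, 0 other records.
    header = struct.pack("!HHHHHH", query_id, 0x0100, 1, 0, 0, 0)
    # Question name: each label prefixed by its length, ended by a zero byte.
    qname = b"".join(
        bytes([len(label)]) + label.encode("ascii")
        for label in hostname.rstrip(".").split(".")
    ) + b"\x00"
    # Query type 1 (A) and class 1 (IN).
    return header + qname + struct.pack("!HH", 1, 1)


def time_nameserver(server_ip, hostname, timeout=5.0):
    """Return seconds until server_ip sends a reply, or None on timeout or error."""
    query = build_a_query(hostname)
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)
        start = time.monotonic()
        try:
            sock.sendto(query, (server_ip, 53))
            sock.recvfrom(512)  # the reply content is not checked in this sketch
        except OSError:  # includes socket.timeout
            return None
        return time.monotonic() - start
```

Calling time_nameserver once per entry in /etc/resolv.conf, with one of the quorum hostnames, should show which server returns None or an unusually large value.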