Pivotal Knowledge Base

Queries hang in Pivotal HDB after failing over to standby YARN Resource Manager

Environment

Product                          Version
Pivotal HDB (Hadoop Database)    2.0.x
Hortonworks HDP                  2.4

Symptom

With YARN High Availability (HA) enabled, the cluster has an active and a standby Resource Manager. After a failover to the standby Resource Manager, queries hang because Pivotal HDB cannot request resources from it due to a permissions issue.

Error Message:

When a Pivotal HDB query hangs, the following messages appear:

In the Pivotal HDB master log:

2016-06-16 13:01:19.821414 CEST,,,p638931,th186476672,,,,0,con4,,seg-10000,,,,,"WARNING","01000","YARN mode resource broker failed to create application in YARN resource manager. LibYarnClient::createJob, catch exception:AccessControlException: can not register application to YARN",,,,,,,0,,"resourcebroker_LIBYARN_proc.c",1321,
2016-06-16 13:01:19.821443 CEST,,,p638931,th186476672,,,,0,con4,,seg-10000,,,,,"WARNING","01000","YARN mode resource broker failed to register YARN application. Resource broker will retry soon.",,,,,,,0,,"resourcebroker_LIBYARN_proc.c",206,
2016-06-16 13:01:19.821453 CEST,,,p638931,th186476672,,,,0,con4,,seg-10000,,,,,"WARNING","01000","YARN mode resource broker failed to process request. Message id = 4, result = 712.",,,,,,,0,,"resourcebroker_LIBYARN_proc.c",273,

In the YARN Resource Manager log:

2016-06-16 13:01:19,823 INFO resourcemanager.RMAuditLogger (RMAuditLogger.java:logSuccess(148)) - USER=postgres IP=10.79.16.4 OPERATION=Submit Application Request TARGET=ClientRMService RESULT=SUCCESS APPID=application_1466072773496_0099
2016-06-16 13:01:19,837 WARN security.DelegationTokenRenewer (DelegationTokenRenewer.java:handleDTRenewerAppSubmitEvent(851)) - Unable to add the application to the delegation token renewer.
org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.security.authorize.AuthorizationException): Unauthorized connection for super-user: rm/resource_manager_hostname@USERNAME from IP 10.10.10.101
at org.apache.hadoop.ipc.Client.call(Client.java:1427)
at org.apache.hadoop.ipc.Client.call(Client.java:1358)
at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:229)
at com.sun.proxy.$Proxy88.getDelegationToken(Unknown Source)
at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.getDelegationToken(ClientNamenodeProtocolTranslatorPB.java:933)
at sun.reflect.GeneratedMethodAccessor133.invoke(Unknown Source)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:187)
at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:102)
at com.sun.proxy.$Proxy89.getDelegationToken(Unknown Source)
at org.apache.hadoop.hdfs.DFSClient.getDelegationToken(DFSClient.java:1043)
at org.apache.hadoop.hdfs.DistributedFileSystem.getDelegationToken(DistributedFileSystem.java:1542)
at org.apache.hadoop.fs.FileSystem.collectDelegationTokens(FileSystem.java:530)
at org.apache.hadoop.fs.FileSystem.addDelegationTokens(FileSystem.java:508)
at org.apache.hadoop.hdfs.DistributedFileSystem.addDelegationTokens(DistributedFileSystem.java:2228)
at org.apache.hadoop.yarn.server.resourcemanager.security.DelegationTokenRenewer$2.run(DelegationTokenRenewer.java:645)
at org.apache.hadoop.yarn.server.resourcemanager.security.DelegationTokenRenewer$2.run(DelegationTokenRenewer.java:640)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:415)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657)
at org.apache.hadoop.yarn.server.resourcemanager.security.DelegationTokenRenewer.obtainSystemTokensForUser(DelegationTokenRenewer.java:639)
at org.apache.hadoop.yarn.server.resourcemanager.security.DelegationTokenRenewer.requestNewHdfsDelegationToken(DelegationTokenRenewer.java:603)
at org.apache.hadoop.yarn.server.resourcemanager.security.DelegationTokenRenewer.handleAppSubmitEvent(DelegationTokenRenewer.java:465)
at org.apache.hadoop.yarn.server.resourcemanager.security.DelegationTokenRenewer.access$700(DelegationTokenRenewer.java:78)
at org.apache.hadoop.yarn.server.resourcemanager.security.DelegationTokenRenewer$DelegationTokenRenewerRunnable.handleDTRenewerAppSubmitEvent(DelegationTokenRenewer.java:847)
at org.apache.hadoop.yarn.server.resourcemanager.security.DelegationTokenRenewer$DelegationTokenRenewerRunnable.run(DelegationTokenRenewer.java:828)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:745)
2016-06-16 13:01:19,837 INFO rmapp.RMAppImpl (RMAppImpl.java:rememberTargetTransitionsAndStoreState(1051)) - Updating application application_1466072773496_0099 with final state: FAILED
2016-06-16 13:01:19,837 INFO rmapp.RMAppImpl (RMAppImpl.java:handle(768)) - application_1466072773496_0099 State change from NEW to FINAL_SAVING
2016-06-16 13:01:19,837 INFO recovery.RMStateStore (RMStateStore.java:transition(200)) - Updating info for app: application_1466072773496_0099
2016-06-16 13:01:19,839 INFO rmapp.RMAppImpl (RMAppImpl.java:handle(768)) - application_1466072773496_0099 State change from FINAL_SAVING to FAILED
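
To confirm that a hung query matches this article, both logs can be searched for the messages shown above. The following is a minimal Python sketch, not a supported tool: the function name find_failover_symptoms is hypothetical, and the two signature strings are copied from the log excerpts in this article.

```python
# Signature strings copied from the log excerpts in this article.
HDB_SIGNATURE = "AccessControlException: can not register application to YARN"
RM_SIGNATURE = "Unauthorized connection for super-user"


def find_failover_symptoms(hdb_log_lines, rm_log_lines):
    """Report whether each log contains the signature of this issue.

    Both arguments are iterables of log lines (for example, an open file).
    """
    return {
        "hdb_master": any(HDB_SIGNATURE in line for line in hdb_log_lines),
        "resource_manager": any(RM_SIGNATURE in line for line in rm_log_lines),
    }
```

If both values are True, the symptoms match the ones described here.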

Cause

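The error messages indicate that the Resource Manager service user (rm) is not authorized as a proxy super-user when it connects from the standby Resource Manager host. The request for an HDFS delegation token is rejected, the YARN application moves to the FAILED state, and Pivotal HDB cannot request resources from the cluster.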

Resolution

1. Correct the permissions in Ambari by adding or modifying the following properties under Ambari / HDFS Services / Configs / Advanced / Custom core-site.xml:

hadoop.proxyuser.yarn.hosts = Confirm that both Resource Manager hosts are listed here, separated by a comma.
hadoop.proxyuser.rm.hosts = *
hadoop.proxyuser.rm.groups = *

2. Restart any services requested by Ambari. 

3. Try the query again.
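
As an illustration of step 1, the sketch below shows how the three properties might look as a core-site.xml fragment and checks them with a small Python script. The hostnames rm1.example.com and rm2.example.com and the helper names load_properties and check_proxyuser are hypothetical, and the host list is assumed to use the comma-separated format that Hadoop proxy-user settings normally take. Adapt the values to your cluster.

```python
import xml.etree.ElementTree as ET

# Hypothetical core-site.xml fragment; the hostnames are placeholders.
CORE_SITE = """
<configuration>
  <property>
    <name>hadoop.proxyuser.yarn.hosts</name>
    <value>rm1.example.com,rm2.example.com</value>
  </property>
  <property>
    <name>hadoop.proxyuser.rm.hosts</name>
    <value>*</value>
  </property>
  <property>
    <name>hadoop.proxyuser.rm.groups</name>
    <value>*</value>
  </property>
</configuration>
"""


def load_properties(xml_text):
    """Return a {name: value} dict from Hadoop-style configuration XML."""
    root = ET.fromstring(xml_text)
    props = {}
    for prop in root.findall("property"):
        name = prop.findtext("name")
        if name is not None:
            props[name.strip()] = (prop.findtext("value") or "").strip()
    return props


def check_proxyuser(props, rm_hosts):
    """Return a list of problems; an empty list means the settings look right."""
    problems = []
    raw = props.get("hadoop.proxyuser.yarn.hosts", "")
    yarn_hosts = {h.strip() for h in raw.split(",") if h.strip()}
    # A wildcard entry covers every host, so only check names without it.
    if "*" not in yarn_hosts:
        for host in rm_hosts:
            if host not in yarn_hosts:
                problems.append("hadoop.proxyuser.yarn.hosts is missing " + host)
    for key in ("hadoop.proxyuser.rm.hosts", "hadoop.proxyuser.rm.groups"):
        if props.get(key) != "*":
            problems.append(key + " is not set to *")
    return problems


if __name__ == "__main__":
    props = load_properties(CORE_SITE)
    for problem in check_proxyuser(props, ["rm1.example.com", "rm2.example.com"]):
        print(problem)
```

Run against a copy of the cluster's core-site.xml, the script prints one line per missing or mismatched setting and prints nothing when all three properties match.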
