Pivotal Knowledge Base

Hadoop daemons in a secured cluster fail to start with "Unable to obtain password from user"

Environment

Product Version
 Pivotal HD  
 Hadoop Cluster  

Symptom

In a secured Hadoop cluster, Hadoop daemons (NameNode, DataNode, etc.) may fail to start because of Kerberos authentication issues.

The daemon logs can help you narrow down the problem.

Error Message:

"Unable to obtain password from user"

A snippet from the NameNode log is shown below:

2014-03-27 17:57:57,904 FATAL org.apache.hadoop.hdfs.server.namenode.NameNode: Exception in namenode join
java.io.IOException: Login failure for hdfs/dev6ha@SATURN.LOCAL from keytab /etc/security/phd/keytab/hdfs.service.keytab
        at org.apache.hadoop.security.UserGroupInformation.loginUserFromKeytab(UserGroupInformation.java:836)
..
Caused by: javax.security.auth.login.LoginException: Unable to obtain password from user
        at com.sun.security.auth.module.Krb5LoginModule.promptForPass(Krb5LoginModule.java:789)
..
2014-03-27 18:15:33,186 INFO org.apache.hadoop.util.ExitUtil: Exiting with status 1
2014-03-27 18:15:33,188 INFO org.apache.hadoop.hdfs.server.namenode.NameNode: SHUTDOWN_MSG:
/************************************************************
SHUTDOWN_MSG: Shutting down NameNode at hdm1.saturn.local/10.246.67.243

Resolution

The pointers below can help you start the investigation. The values from the log snippet above are used as examples.

Identify the principal name:

hdfs/dev6ha@SATURN.LOCAL
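When the log is long, the principal and the keytab path can be pulled out of the "Login failure" line directly. The sketch below is one way to do that with sed; the function name parse_login_failure is a hypothetical helper introduced here for illustration, not a Hadoop or Kerberos command.

```shell
# parse_login_failure: hypothetical helper, not part of Hadoop.
# Reads log text on stdin and prints "<principal> <keytab>" for every
# "Login failure for ... from keytab ..." line it finds.
parse_login_failure() {
    sed -n 's/.*Login failure for \([^ ]*\) from keytab \([^ ]*\).*/\1 \2/p'
}

# Example using the log line from the snippet above.
line='java.io.IOException: Login failure for hdfs/dev6ha@SATURN.LOCAL from keytab /etc/security/phd/keytab/hdfs.service.keytab'
printf '%s\n' "$line" | parse_login_failure
```

To run it against a real log, redirect the NameNode log file into the function instead of the sample line.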

The Kerberos principal that the daemon uses to authenticate is specified in hdfs-site.xml. Verify that the dfs.namenode.kerberos.*.principal or dfs.datanode.kerberos.*.principal parameter, depending on the node role, has the correct principal name. If hdfs-site.xml uses the _HOST variable, ensure that hostname -f returns the fully qualified hostname and that it matches the principal name shown in the log files. If name resolution is misconfigured (for example, DNS returns one IP address but /etc/hosts lists a different IP address for the same host), _HOST is not replaced with the correct name and you may see this error.

Note: In this example, DNS returned the IP address 10.246.67.243, but /etc/hosts pointed to 10.246.67.218. Because this was a NameNode High Availability configuration, _HOST was replaced by the nameservice name (dev6ha) instead of the actual hostname.
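To see what principal a correct _HOST substitution should produce, the expansion can be reproduced by hand and compared with the principal reported in the log. The sketch below assumes the configured value is hdfs/_HOST@SATURN.LOCAL (the actual hdfs-site.xml value is not shown in this article) and that _HOST is replaced by the lowercased fully qualified hostname; expand_host_principal is a hypothetical helper name.

```shell
# expand_host_principal: hypothetical helper, not a Hadoop command.
# Replaces _HOST in a principal pattern with the lowercased FQDN,
# which is the result a healthy node is expected to produce.
expand_host_principal() {
    pattern=$1
    fqdn=$(printf '%s' "$2" | tr '[:upper:]' '[:lower:]')
    printf '%s\n' "$pattern" | sed "s|_HOST|$fqdn|"
}

# On a cluster node, pass the real hostname:
#   expand_host_principal 'hdfs/_HOST@SATURN.LOCAL' "$(hostname -f)"
# and check which address the resolver returns for it:
#   getent hosts "$(hostname -f)"

# Example with the hostname from the log snippet (pattern is an assumption).
expand_host_principal 'hdfs/_HOST@SATURN.LOCAL' 'hdm1.saturn.local'
```

If the printed principal differs from the one in the "Login failure" line, as it does here (hdm1.saturn.local versus dev6ha), the hostname resolution on the node is the first thing to check.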

Identify the keytab file used:

/etc/security/phd/keytab/hdfs.service.keytab

If the keytab file defined in hdfs-site.xml is not present, you will see this error, so verify the path and the keytab filename.
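A quick way to confirm the path is a small existence check run on the affected node. The sketch below uses a hypothetical helper name, check_keytab; the readability test is an extra check added here as an assumption and is not one of the causes listed in this article. Run it as the user that starts the daemon so that the readability result is meaningful.

```shell
# check_keytab: hypothetical helper, not part of Hadoop or Kerberos.
# Returns 0 if the keytab exists and is readable by the current user,
# 1 if it is missing, 2 if it exists but cannot be read.
check_keytab() {
    keytab=$1
    if [ ! -e "$keytab" ]; then
        echo "missing: $keytab"
        return 1
    elif [ ! -r "$keytab" ]; then
        echo "not readable by $(id -un): $keytab"
        return 2
    fi
    echo "ok: $keytab"
}

# Usage on the affected node, with the path taken from the log:
#   check_keytab /etc/security/phd/keytab/hdfs.service.keytab
```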

Verify that you can kinit using the principal name and keytab:

[root@phd11-nn keytab] kinit -kt /etc/security/phd/keytab/hdfs.service.keytab hdfs/dev6ha@SATURN.LOCAL

If kinit fails, the hostname in the keytab's principal may be inconsistent with the hostname-to-IP mapping in DNS or /etc/hosts, in which case the daemon fails with the same error.

How to verify the contents of the keytab file:

klist -ket /etc/security/phd/keytab/hdfs.service.keytab

How to regenerate the keytab file:

[root@KDC server] kadmin.local 
ktadd -norandkey -k /etc/security/keytab/hdfs-hostid.service.keytab  hdfs/host_fqdn@REALM  HTTP/host_fqdn@REALM

 

Additional Information

Identify how the hostname or IP address is resolved:

Name resolution uses DNS or /etc/hosts. Check /etc/nsswitch.conf to identify which one is consulted first. An entry like the one below indicates that the /etc/hosts file (files) is consulted before DNS (dns); the reverse order means DNS is consulted first.

hosts:      files dns
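The lookup order can also be read out with a one-line filter. The sketch below prints the sources from the hosts: entry in the order they are consulted; hosts_lookup_order is a hypothetical helper name, and it reads the file given as an argument or stdin when no argument is passed.

```shell
# hosts_lookup_order: hypothetical helper, not a system command.
# Prints the sources listed on the "hosts:" line, in lookup order.
# Commented-out lines do not match because they start with "#".
hosts_lookup_order() {
    sed -n 's/^hosts:[[:space:]]*//p' "$@"
}

# Usage on a cluster node:
#   hosts_lookup_order /etc/nsswitch.conf

# Example with the entry shown above.
printf 'hosts:      files dns\n' | hosts_lookup_order
```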

Note: We will keep updating this document as we find more reasons for the same issue.
