Pivotal Knowledge Base

Setting up a Kerberos cross-realm trust for distcp

Environment

  • java 1.7
  • PHD 1.1.1
  • Kerberos 1.10.3-15

Kerberos Realm PROD.PIVOTAL.HADOOP

pccadmin-prod.phd.local    Kerberos KDC, Kerberos kadmin, Pivotal Command Center
hdm1-prod.phd.local   Namenode, Resourcemanager, Mapred History Server
hdw1-prod.phd.local   Datanode, Nodemanager
hdw2-prod.phd.local   Datanode, Nodemanager
hdw3-prod.phd.local   Datanode, Nodemanager

Kerberos Realm DEV.PIVOTAL.HADOOP

pccadmin-dev.phd.local    Kerberos KDC, Kerberos kadmin, Pivotal Command Center
hdm1-dev.phd.local   Namenode, Resourcemanager, Mapred History Server
hdw1-dev.phd.local   Datanode, Nodemanager
hdw2-dev.phd.local   Datanode, Nodemanager
hdw3-dev.phd.local   Datanode, Nodemanager

Purpose

If you have two independent PHD clusters, each in its own Kerberos realm, and you want to learn how to distcp data between them, then this article is for you.

NOTE

There are known issues with JDK 1.6, as per Java bug ID 7061379. If your KDC is running Kerberos version 1.10 or later, you must upgrade to JDK 1.7.
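
If you are unsure which versions are in play, it is worth confirming them before proceeding. A quick check, assuming an RPM-based OS such as RHEL/CentOS:

java -version                # should report a 1.7.x JDK on all Hadoop nodes
rpm -q krb5-server           # run on the KDC hosts to confirm the Kerberos version
rpm -q krb5-workstation      # run on the cluster nodes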

Preparing the configuration files

Attached are sample configurations from both the PROD and DEV clusters. I suggest downloading them now and reviewing them as we describe the settings defined in the PROD cluster.

/var/kerberos/krb5kdc/kdc.conf 

This file does not have any settings specific to the cross-realm trust configuration, but it is important to take note of its contents on both clusters. The param that matters most is supported_enctypes. When we create the krbtgt principals to build the trust, we want to make sure the encryption types are the same for both clusters; it could cause authentication issues if, for example, PROD supports AES256 and DEV only supports AES128. Make sure these settings are the same throughout both clusters.

[kdcdefaults]
  kdc_ports = 88
  kdc_tcp_ports = 88
  supported_enctypes = aes128-cts-hmac-sha1-96:normal des3-cbc-sha1:normal des-cbc-md5:normal des-cbc-crc:normal rc4-hmac:normal

[realms]
  PROD.PIVOTAL.HADOOP = {
    acl_file = /var/kerberos/krb5kdc/kadm5.acl
    dict_file = /usr/share/dict/words
    admin_keytab = /var/kerberos/krb5kdc/kadm5.keytab
    key_stash_file = /var/kerberos/krb5kdc/.k5.PROD.PIVOTAL.HADOOP
    kadmind_port = 749
    max_renewable_life = 7d
    supported_enctypes = aes128-cts-hmac-sha1-96:normal des3-cbc-sha1:normal des-cbc-md5:normal des-cbc-crc:normal rc4-hmac:normal
  }
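
A quick way to confirm the two KDCs agree is to diff kdc.conf across the admin nodes. This is only a sketch and assumes passwordless SSH from pccadmin-prod.phd.local to pccadmin-dev.phd.local:

# run from pccadmin-prod.phd.local; no output means the files match
ssh pccadmin-dev.phd.local cat /var/kerberos/krb5kdc/kdc.conf | diff /var/kerberos/krb5kdc/kdc.conf -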

/etc/krb5.conf

This is the most important configuration file in the cluster. I have omitted some of the irrelevant params for brevity; please open the PROD krb5.conf from the attachments to follow along. Again, in this example we define the default and supported encryption types. The specific encryption types selected here are less important, as they will vary between field deployments. What is important is that the settings only include encryption types supported by the KDC, as previously defined in kdc.conf.

[libdefaults]
  udp_preference_limit = 1
  default_realm = PROD.PIVOTAL.HADOOP
  dns_lookup_realm = false
  dns_lookup_kdc = false
  ticket_lifetime = 24h
  renew_lifetime = 7d
  forwardable = true
  default_tkt_enctypes = des3-cbc-sha1 rc4-hmac des-cbc-crc des-cbc-md5
  default_tgs_enctypes = des3-cbc-sha1 rc4-hmac des-cbc-crc des-cbc-md5
  permitted_enctypes = aes128-cts-hmac-sha1-96 des-cbc-crc des-cbc-md5 des3-cbc-sha1 rc4-hmac
  allow_weak_crypto = true

In the [realms] section we have to define both the PROD and DEV realms. When configuring the PROD file, you should be able to copy the DEV realm information from DEV's krb5.conf into PROD's, then do the same in the other direction for DEV. This ensures both clusters have both realms defined.

[realms]
  PROD.PIVOTAL.HADOOP = {
    kdc = pccadmin-prod.phd.local:88
    admin_server = pccadmin-prod.phd.local:749
    default_domain = PROD.PIVOTAL.HADOOP
  }
  DEV.PIVOTAL.HADOOP = {
    kdc = pccadmin-dev.phd.local:88
    admin_server = pccadmin-dev.phd.local:749
    default_domain = DEV.PIVOTAL.HADOOP
  }

When you are editing your own cluster, the [domain_realm] section will initially include only two lines, ".prod.pivotal.hadoop" and "prod.pivotal.hadoop". The "." prefix tells Kerberos to map all hosts in the domain "prod.pivotal.hadoop" to the realm PROD.PIVOTAL.HADOOP.

In our case the hosts in PROD and DEV are in the phd.local domain, so these references for prod.pivotal.hadoop and dev.pivotal.hadoop do not really help in this lab environment. In field deployments you will often see that the domain name matches the Kerberos realm, which makes these params more helpful.

In order for the Kerberos client to understand that the host hdm1-dev.phd.local is in the DEV.PIVOTAL.HADOOP realm, we need to explicitly add a host mapping to [domain_realm]. For distcp to function, you will need the NameNode of each cluster referenced here: one entry for PROD and one for DEV.

[domain_realm]
  .prod.pivotal.hadoop = PROD.PIVOTAL.HADOOP
  prod.pivotal.hadoop = PROD.PIVOTAL.HADOOP
  .dev.pivotal.hadoop = DEV.PIVOTAL.HADOOP
  dev.pivotal.hadoop = DEV.PIVOTAL.HADOOP
  hdm1-dev.phd.local = DEV.PIVOTAL.HADOOP
  hdm1-prod.phd.local = PROD.PIVOTAL.HADOOP
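
Once krb5.conf is finalized on the admin node it must be pushed to every node in the cluster; the next section assumes this has been done. A minimal sketch, assuming root SSH access and a hypothetical hosts.txt file listing the cluster hostnames:

# distribute krb5.conf to all nodes listed in hosts.txt
for host in $(cat hosts.txt); do
  scp /etc/krb5.conf ${host}:/etc/krb5.conf
done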

Update /etc/gphd/hadoop/conf/core-site.xml

Now that we have completed our modifications to the Kerberos configuration in both PROD and DEV, and /etc/krb5.conf has been synced to all the cluster nodes, we need to add this param to core-site.xml on all nodes. These settings will also be the same on both PROD and DEV.

<property>
<name>hadoop.security.auth_to_local</name>
<value>RULE:[1:$1@$0](.*@\QPROD.PIVOTAL.HADOOP\E$)s/@\QPROD.PIVOTAL.HADOOP\E$//
RULE:[2:$1@$0](.*@\QPROD.PIVOTAL.HADOOP\E$)s/@\QPROD.PIVOTAL.HADOOP\E$//
RULE:[1:$1@$0](.*@\QDEV.PIVOTAL.HADOOP\E$)s/@\QDEV.PIVOTAL.HADOOP\E$//
RULE:[2:$1@$0](.*@\QDEV.PIVOTAL.HADOOP\E$)s/@\QDEV.PIVOTAL.HADOOP\E$//
DEFAULT</value>
</property>

The hadoop.security.auth_to_local param lets the NameNode extract the short user name from the user's Kerberos principal. The rule format is as follows:

RULE:[<principal translation>](<acceptance filter>)<short name substitution>

To test your work you can run this command from the NameNode. Given a user principal, the HadoopKerberosName class should be able to extract the gpadmin user.

[gpadmin@hdm1-prod ~]$ hadoop org.apache.hadoop.security.HadoopKerberosName gpadmin/hdm1-dev.phd.local@DEV.PIVOTAL.HADOOP
Name: gpadmin/hdm1-dev.phd.local@DEV.PIVOTAL.HADOOP to gpadmin

[gpadmin@hdm1-prod ~]$ hadoop org.apache.hadoop.security.HadoopKerberosName gpadmin/hdm1-prod.phd.local@PROD.PIVOTAL.HADOOP
Name: gpadmin/hdm1-prod.phd.local@PROD.PIVOTAL.HADOOP to gpadmin

Set up the two-way trust in both PROD and DEV

Now launch the "kadmin.local" interface on the kadmin node. By default, after you have secured your cluster, there will be a single krbtgt defined for the default realm:

kadmin.local:  listprincs krb*
krbtgt/PROD.PIVOTAL.HADOOP@PROD.PIVOTAL.HADOOP

We need to add two more krbtgt principals for the two-way trust between PROD and DEV. Please be sure to create these in both the PROD and DEV KDCs, using the same passwords and encryption types on both sides so the cross-realm keys match:

kadmin.local:  addprinc -e "aes128-cts-hmac-sha1-96:normal des3-cbc-sha1:normal arcfour-hmac-md5:normal" krbtgt/PROD.PIVOTAL.HADOOP@DEV.PIVOTAL.HADOOP  

kadmin.local: addprinc -e "aes128-cts-hmac-sha1-96:normal des3-cbc-sha1:normal arcfour-hmac-md5:normal" krbtgt/DEV.PIVOTAL.HADOOP@PROD.PIVOTAL.HADOOP

In the example above we used "-e" to explicitly define the encryption types used for this principal. This is good practice to ensure that all principals use consistent encryption types and to avoid strange errors down the road. After the principals are created you can verify the encryption types with "getprinc krbtgt/DEV.PIVOTAL.HADOOP@PROD.PIVOTAL.HADOOP".
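
Before testing HDFS itself, the trust path can be sanity checked from a PROD node with kinit and kvno; requesting a service ticket for a DEV service principal will only succeed if the cross-realm krbtgt keys match. The principal below assumes the default hdfs/<host> service principal naming used by PHD:

[gpadmin@hdm1-prod ~]$ kinit
[gpadmin@hdm1-prod ~]$ kvno hdfs/hdm1-dev.phd.local@DEV.PIVOTAL.HADOOP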

Now we're ready to test or possibly troubleshoot distcp

First, log into PROD and run kinit to give your user a TGT:

[gpadmin@hdm1-prod ~]$ kinit
Password for gpadmin@PROD.PIVOTAL.HADOOP:
[gpadmin@hdm1-prod ~]$ klist -e
Ticket cache: FILE:/tmp/krb5cc_500
Default principal: gpadmin@PROD.PIVOTAL.HADOOP

Valid starting     Expires            Service principal
05/08/14 13:19:07  05/09/14 13:19:07  krbtgt/PROD.PIVOTAL.HADOOP@PROD.PIVOTAL.HADOOP
	renew until 05/15/14 13:19:07, Etype (skey, tkt): des3-cbc-sha1, aes128-cts-hmac-sha1-96

Now test HDFS access for both PROD and DEV:

[gpadmin@hdm1-prod ~]$ hdfs dfs -ls /
Found 6 items
drwxr-xr-x   - hdfs     hadoop           0 2014-05-08 10:26 /apps
drwxr-xr-x   - postgres gpadmin          0 2014-05-08 10:40 /hawq_data
drwxr-xr-x   - mapred   hadoop           0 2014-05-08 10:25 /mapred
drwxrwxrwx   - hdfs     hadoop           0 2014-05-08 10:53 /tmp
drwxrwxrwx   - hdfs     hadoop           0 2014-05-08 10:29 /user
drwxr-xr-x   - hdfs     hadoop           0 2014-05-08 10:26 /yarn
[gpadmin@hdm1-prod ~]$ hdfs dfs -ls hdfs://hdm1-dev.phd.local:8020/
Found 5 items
drwxr-xr-x   - hdfs   hadoop          0 2014-05-06 21:35 hdfs://hdm1-dev.phd.local:8020/apps
drwxr-xr-x   - mapred hadoop          0 2014-05-06 21:34 hdfs://hdm1-dev.phd.local:8020/mapred
drwxrwxrwx   - hdfs   hadoop          0 2014-05-08 11:11 hdfs://hdm1-dev.phd.local:8020/tmp
drwxrwxrwx   - hdfs   hadoop          0 2014-05-06 23:04 hdfs://hdm1-dev.phd.local:8020/user
drwxr-xr-x   - hdfs   hadoop          0 2014-05-06 21:34 hdfs://hdm1-dev.phd.local:8020/yarn

Here is how to test WebHDFS. Run this command from the first cluster, referring to the second cluster, or vice versa.

[gpadmin@hdm1-prod ~]$ curl -L -i -v -u gpadmin --negotiate "hdm1-dev.phd.local:50070/webhdfs/v1/user?op=LISTSTATUS"
Enter host password for user 'gpadmin':
* About to connect() to hdm1-dev.phd.local port 50070 (#0)
*   Trying 172.28.16.108... connected
* Connected to hdm1-dev.phd.local (172.28.16.108) port 50070 (#0)
> GET /webhdfs/v1/user?op=LISTSTATUS HTTP/1.1
> User-Agent: curl/7.19.7 (x86_64-redhat-linux-gnu) libcurl/7.19.7 NSS/3.13.6.0 zlib/1.2.3 libidn/1.18 libssh2/1.4.2
> Host: hdm1-dev.phd.local:50070
> Accept: */*
>
< HTTP/1.1 401
HTTP/1.1 401
< Date: Thu, 08 May 2014 20:30:01 GMT
Date: Thu, 08 May 2014 20:30:01 GMT
< Pragma: no-cache
Pragma: no-cache
< Date: Thu, 08 May 2014 20:30:01 GMT
Date: Thu, 08 May 2014 20:30:01 GMT
< Pragma: no-cache
Pragma: no-cache
< WWW-Authenticate: Negotiate
WWW-Authenticate: Negotiate
< Set-Cookie: hadoop.auth=;Path=/;Expires=Thu, 01-Jan-1970 00:00:00 GMT
Set-Cookie: hadoop.auth=;Path=/;Expires=Thu, 01-Jan-1970 00:00:00 GMT
< Content-Type: text/html;charset=ISO-8859-1
Content-Type: text/html;charset=ISO-8859-1
< Cache-Control: must-revalidate,no-cache,no-store
Cache-Control: must-revalidate,no-cache,no-store
< Content-Length: 1362
Content-Length: 1362
< Server: Jetty(7.6.10.v20130312)
Server: Jetty(7.6.10.v20130312)

<
* Ignoring the response-body
* Connection #0 to host hdm1-dev.phd.local left intact
* Issue another request to this URL: 'http://hdm1-dev.phd.local:50070/webhdfs/v1/user?op=LISTSTATUS'
* Re-using existing connection! (#0) with host hdm1-dev.phd.local
* Connected to hdm1-dev.phd.local (172.28.16.108) port 50070 (#0)
* Server auth using GSS-Negotiate with user 'gpadmin'
> GET /webhdfs/v1/user?op=LISTSTATUS HTTP/1.1
> Authorization: Negotiate YIICeAYJKoZIhvcSAQICAQBuggJnMIICY6ADAgEFoQMCAQ6iBwMFAAAAAACjggF3YYIBczCCAW+gAwIBBaEUGxJERVYuUElWT1RBTC5IQURPT1CiJTAjoAMCAQOhHDAaGwRIVFRQGxJoZG0xLWRldi5waGQubG9jYWyjggEpMIIBJaADAgERoQMCAQGiggEXBIIBE2g1skqABxojKr92z//Mag1DnaGGnXZJtFcT5Yri7TdstsFENps4NPRyMaoHcCb4bBmkiR0yRdXHLusJH6odeB2O4zOibSsO1QKnJqOnbQqbXKrVC1OHQdMqNKmDY7frwqn5l4hTfU78fWDf8s/gDvVQIIXITzD9uJ3BD8kmzK6LOrK5By9THuEEjvi3ospfg6vVRYundhgUo9U0WlMTQOAh4MgpKEm4rSUJTv5O91/7eqaLMbiOtN5HdxW70iKKTFuJ/jaYmT1nIIpcsrQz/rs+XDZmNyTdmiL8Kfn4g24L2D4n6LaejJEiYgNP301QrKCI4Qq+qy/LpdXCZAvyOWQjNC/8XBdUrI3ip4+sOevv2o6CpIHSMIHPoAMCARCigccEgcQJ/+qXUGzKiLjfn9AzJrtgYBXCLlPKlnRumSMTSGqEdBnte+gdkLxTcpf11b+/CVvHrW56vVmy0pHCUgaEnFIEn3Y36t+64WmYmndGiCa2f2G08C4oeCJ1AlpLh5J/5wrmimnYzh9RRcYGdXiaUIiZs9UxKpRreLy+uZcBk/ozstEgzNtGj3VeqHsyYvaN03nVKly612xXpgtNqhdu87mJ9VskQSrkf0x7WgqJ6uQbRihF4BXfS4gJobrF8y6B/JMMx73Y
> User-Agent: curl/7.19.7 (x86_64-redhat-linux-gnu) libcurl/7.19.7 NSS/3.13.6.0 zlib/1.2.3 libidn/1.18 libssh2/1.4.2
> Host: hdm1-dev.phd.local:50070
> Accept: */*
>
< HTTP/1.1 200 OK
HTTP/1.1 200 OK
< Date: Thu, 08 May 2014 20:30:01 GMT
Date: Thu, 08 May 2014 20:30:01 GMT
< Pragma: no-cache
Pragma: no-cache
< Cache-Control: no-cache
Cache-Control: no-cache
< Date: Thu, 08 May 2014 20:30:01 GMT
Date: Thu, 08 May 2014 20:30:01 GMT
< Pragma: no-cache
Pragma: no-cache
< Set-Cookie: hadoop.auth="u=gpadmin&p=gpadmin@PROD.PIVOTAL.HADOOP&t=kerberos&e=1399617001107&s=rooZGDBCllgq+dFoQJliuc34LH4=";Path=/
Set-Cookie: hadoop.auth="u=gpadmin&p=gpadmin@PROD.PIVOTAL.HADOOP&t=kerberos&e=1399617001107&s=rooZGDBCllgq+dFoQJliuc34LH4=";Path=/
< Expires: Thu, 01 Jan 1970 00:00:00 GMT
Expires: Thu, 01 Jan 1970 00:00:00 GMT
< Content-Type: application/json
Content-Type: application/json
< Transfer-Encoding: chunked
Transfer-Encoding: chunked
< Server: Jetty(7.6.10.v20130312)
Server: Jetty(7.6.10.v20130312)

<
{"FileStatuses":{"FileStatus":[
{"accessTime":0,"blockSize":0,"group":"hadoop","length":0,"modificationTime":1399558055538,"owner":"gpadmin","pathSuffix":"gpadmin","permission":"700","replication":0,"type":"DIRECTORY"},
{"accessTime":0,"blockSize":0,"group":"hadoop","length":0,"modificationTime":1399437287682,"owner":"mapred","pathSuffix":"history","permission":"777","replication":0,"type":"DIRECTORY"}
]}}
* Connection #0 to host hdm1-dev.phd.local left intact
* Closing connection #0

Launch distcp to copy a file from PROD to DEV:

hadoop distcp hdfs://hdm1-prod.phd.local:8020/tmp/data/data.dat hdfs://hdm1-dev.phd.local:8020/tmp/data/
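
If the job completes without authentication errors, the copied file should be visible on the DEV side (the path matches the example above):

hdfs dfs -ls hdfs://hdm1-dev.phd.local:8020/tmp/data/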
