Pivotal Knowledge Base

YARN ResourceManager does not start after ICM deployment with Isilon

Environment

  • PCC 2.2.1
  • PHD 1.1.1.0
  • ISILON

Symptom

-bash-4.1$ icm_client start -l PHD_ISILON
Starting services
Starting cluster
[=================================                                                                   ] 33%

Return Code : 5000
Message : Cluster Start Error
Details :
Admin Host :
        Operation Error : Error while calling start for role yarn-resourcemanager. null
massh /tmp/tmp.eYvyVoscak verbose 'sudo /etc/init.d/hadoop-yarn-resourcemanager status || sudo /etc/init.d/hadoop-yarn-resourcemanager start'
{}
org.apache.commons.exec.ExecuteException: Process exited with an error: 1 (Exit value: 1)
ERROR_RESPONSE{"RESOLUTION": "Please check component log file for more details.", "OPERATION_CODE": "COMPONENT_START_ERROR", "LOG_FILE": "/var/log/gphd/hadoop-yarn/hadoop-yarn-resourcemanager-hvm3.emc.net.log, /var/log/gphd/hadoop-yarn/hadoop-yarn-resourcemanager-hvm3.emc.net.out", "OPERATION_ERROR": "Failed to start component yarn-resourcemanager", "FAILED_HOSTS": "hvm3.emc.net"}

        Log File : /var/log/gphd/gphdmgr/gphdmgr-webservices.log


[====================================================================================================] 100%

Starting the ResourceManager manually reports these errors:

[root@hvm3 init.d]# service hadoop-yarn-resourcemanager start
starting resourcemanager, logging to /var/log/gphd/hadoop-yarn/yarn-yarn-resourcemanager-hvm3.emc.cpt.adobe.net.out
log4j:WARN No appenders could be found for logger (org.apache.hadoop.yarn.server.resourcemanager.ResourceManager).
log4j:WARN Please initialize the log4j system properly.
log4j:WARN See http://logging.apache.org/log4j/1.2/faq.html#noconfig for more info.
                                                           [FAILED]

Some files in /etc/gphd/hadoop/conf are zero bytes after a cluster deployment or reconfigure:

[gpadmin@hvm3 ~]$ ls -l /etc/gphd/hadoop/conf/
total 60
-rwxrwxr-x 1 gpadmin gpadmin  3563 Jul 18 10:41 capacity-scheduler.xml
-rw-r--r-- 1 root    root      239 Jul 18 10:41 container-executor.cfg
-rw-r--r-- 1 root    root      861 Jul 17 14:23 core-site.xml
-rwxrwxr-x 1 gpadmin gpadmin     0 Jul 17 14:23 dfs.exclude
-rwxrwxr-x 1 gpadmin gpadmin     0 Jul 17 14:23 hadoop-env.sh
-rwxr-xr-x 1 gpadmin gpadmin     0 Jul 17 14:23 hadoop-metrics2.properties
-rwxr-xr-x 1 gpadmin gpadmin     0 Jul 17 14:23 hadoop-metrics.properties
-rwxr-xr-x 1 gpadmin gpadmin     0 Jul 17 14:23 hadoop-policy.xml
-rw-r--r-- 1 root    root     3366 Jul 18 10:41 hdfs-site.xml
-rwxr-xr-x 1 gpadmin gpadmin     0 Jul 18 10:41 log4j.properties
-rwxrwxr-x 1 gpadmin gpadmin  1376 Jul 18 10:41 mapred-env.sh
-rwxr-xr-x 1 gpadmin gpadmin  4116 Jul 18 10:41 mapred-queues.xml
-rw-r--r-- 1 root    root     2159 Jul 18 10:41 mapred-site.xml
-rwxrwxr-x 1 gpadmin gpadmin 15179 Jul 18 10:41 postex_diagnosis_tests.xml
-rw-r--r-- 1 root    root       11 Jul 17 14:23 slaves
-rwxrwxr-x 1 gpadmin gpadmin  2871 Jul 18 10:41 yarn-env.sh
-rwxrwxr-x 1 gpadmin gpadmin     0 Jul 17 14:23 yarn.exclude
-rw-r--r-- 1 root    root     2827 Jul 18 10:41 yarn-site.xml
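
The check above can also be scripted. The following is a minimal sketch, not part of PCC or ICM: the helper name find_empty_files is hypothetical, and the directory path is taken from the listing above. It only reports zero-byte regular files; some files (for example exclude lists) may legitimately be empty, so treat the output as a hint rather than a diagnosis.

```python
import os


def find_empty_files(conf_dir):
    """Return the sorted names of zero-byte regular files in conf_dir.

    Hypothetical helper for illustration; it is not shipped with PCC/ICM.
    """
    empty = []
    for name in sorted(os.listdir(conf_dir)):
        path = os.path.join(conf_dir, name)
        # Skip directories and anything that is not a regular file.
        if os.path.isfile(path) and os.path.getsize(path) == 0:
            empty.append(name)
    return empty


if __name__ == "__main__":
    # Path taken from the listing in this article.
    conf_dir = "/etc/gphd/hadoop/conf"
    if os.path.isdir(conf_dir):
        for name in find_empty_files(conf_dir):
            print(name)
```

Running it on an affected node should list files such as hadoop-env.sh and log4j.properties from the listing above.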

Cause

This is a known issue when deploying PHD from PCC 2.2.1 with Isilon. After deployment, ICM does not honor the <isi-hdfs> tag in clusterConfig.xml and skips syncing the /etc/gphd/hadoop/conf files to the cluster nodes.

Fix

On the PCC admin node, edit /usr/lib/gphd/gphdmgr/lib/server/GPHDClusterInstaller.py and add a condition for the <isi-hdfs> XML tag.

Change Line 33 from:

if (service["name"] == 'yarn') or (service["name"] == 'hdfs'):

Change Line 33 to:

if (service["name"] == 'yarn') or (service["name"] == 'hdfs') or (service["name"] == 'isi-hdfs'):
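
The logic of the changed condition can be illustrated in isolation. This is only a sketch under stated assumptions: the code surrounding line 33 of GPHDClusterInstaller.py is not shown in this article, and the names CONF_SYNC_SERVICES and needs_hadoop_conf_sync are hypothetical. It uses a membership test that is equivalent to the chained "or" comparisons in the fix.

```python
# Hypothetical names for illustration; they do not appear in GPHDClusterInstaller.py.
# Services whose presence should trigger the Hadoop conf handling after the fix.
CONF_SYNC_SERVICES = ("yarn", "hdfs", "isi-hdfs")


def needs_hadoop_conf_sync(service):
    """Return True if the service dict names one of the listed services.

    Equivalent to:
    (service["name"] == 'yarn') or (service["name"] == 'hdfs')
        or (service["name"] == 'isi-hdfs')
    """
    return service["name"] in CONF_SYNC_SERVICES
```

With the original two-service condition, a service named 'isi-hdfs' does not match, which is consistent with the behavior described under Cause.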

 

Internal JIRA reference: HD-10362
