Pivotal Knowledge Base

Ambari Reports NameNode or DataNode as "Stopped" Even When it is Working Correctly

Environment

 Product             Version
 Pivotal HD / HDP    3.0.x / 2.4, 2.5
 Ambari              1.7.1, 2.x

Symptom

Ambari reports the NameNode or DataNode as "Stopped".

However, all HDFS commands work correctly:  

[hdfs@amb171hawq ~]$ hdfs dfsadmin -report
Configured Capacity: 51636727808 (48.09 GB)
Present Capacity: 44497764352 (41.44 GB)
DFS Remaining: 44303142912 (41.26 GB)
DFS Used: 194621440 (185.61 MB)
DFS Used%: 0.44%
Under replicated blocks: 8
Blocks with corrupt replicas: 0
Missing blocks: 0

-------------------------------------------------
Live datanodes (1):

 

[hdfs@amb171hawq ~]$ hdfs dfs -ls /
Found 8 items
drwxrwxrwx - yarn hadoop 0 2016-03-10 16:10 /app-logs
drwxr-xr-x - gpadmin gpadmin 0 2016-03-10 16:08 /hawq_data
drwxr-xr-x - mapred hdfs 0 2016-03-10 16:08 /mapred
drwxr-xr-x - hdfs hdfs 0 2016-03-10 16:08 /mr-history
drwxr-xr-x - hdfs hdfs 0 2016-03-10 16:08 /phd
drwxr-xr-x - hdfs hdfs 0 2016-03-10 16:08 /system
drwxrwxrwx - hdfs hdfs 0 2016-05-20 01:49 /tmp
drwxr-xr-x - hdfs hdfs 0 2016-05-18 23:56 /user
[hdfs@amb171hawq ~]$

 

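The PID recorded in /var/run/hadoop/hdfs/hadoop-hdfs-namenode.pid or /var/run/hadoop/hdfs/hadoop-hdfs-datanode.pid differs from the actual PID of the running NameNode or DataNode process. Ambari relies on these PID files to determine component status, so a stale PID makes a healthy process appear stopped. For example, the file below records 112314 while the NameNode is actually running as 451896: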

[root@amb171hawq /]# cat /var/run/hadoop/hdfs/hadoop-hdfs-namenode.pid
112314
[root@amb171hawq /]# ps -eaf | grep namenode
hdfs 451896 1 0 13:13 ? 00:01:22 /usr/jdk64/jdk1.7.0_67/bin/java -Dproc_namenode -Xmx1024m -Dstack.name=phd -Dstack.version= -Djava.net.preferIPv4Stack=true -Dstack.name=phd -Dstack.version=
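The comparison above can be scripted. The following is a minimal sketch, not part of the original procedure: the function name pid_file_status is hypothetical, and the pgrep pattern in the usage comment assumes the process runs as the hdfs user with the -Dproc_namenode flag seen in the ps output.

```shell
#!/bin/sh
# pid_file_status PIDFILE ACTUAL_PID
# Compares the PID recorded in PIDFILE with ACTUAL_PID.
# Prints "match", "mismatch: ..." or "missing" and returns 0, 1 or 2.
# (Hypothetical helper, written for illustration only.)
pid_file_status() {
    pid_file=$1
    actual_pid=$2

    if [ ! -r "$pid_file" ]; then
        echo "missing"
        return 2
    fi

    # Strip the trailing newline and any stray whitespace from the file.
    recorded_pid=$(tr -d '[:space:]' < "$pid_file")

    if [ "$recorded_pid" = "$actual_pid" ]; then
        echo "match"
        return 0
    fi

    echo "mismatch: file has $recorded_pid, process is $actual_pid"
    return 1
}

# Example usage on the affected host (assumed pattern; adjust for DataNode):
#   pid_file_status /var/run/hadoop/hdfs/hadoop-hdfs-namenode.pid \
#       "$(pgrep -u hdfs -f proc_namenode)"
```

With the values from the example above, the function would report a mismatch between 112314 and 451896.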

Resolution

1. Find the PID of the NameNode or DataNode with the issue (451896 in the example below):

[root@amb171hawq /]# ps -eaf | grep namenode
hdfs 451896 1 0 13:13 ? 00:01:22 /usr/jdk64/jdk1.7.0_67/bin/java -Dproc_namenode -Xmx1024m -Dstack.name=phd -Dstack.version= -Djava.net.preferIPv4Stack=true -Dstack.name=phd -Dstack.version=

2. Open the file /var/run/hadoop/hdfs/hadoop-hdfs-namenode.pid or /var/run/hadoop/hdfs/hadoop-hdfs-datanode.pid in a text editor such as vi.

3. Remove the existing PID from the file and replace it with the PID found in Step 1.

4. Restart the Ambari agent:

[root@amb171hawq /]# ambari-agent restart
Restarting ambari-agent
Verifying Python version compatibility...
Using python /usr/bin/python2.6
Found ambari-agent PID: 536422
Stopping ambari-agent
Removing PID file at /var/run/ambari-agent/ambari-agent.pid
ambari-agent successfully stopped
Verifying Python version compatibility...
Using python /usr/bin/python2.6
Checking for previously running Ambari Agent...
Starting ambari-agent
Verifying ambari-agent process status...
Ambari Agent successfully started
Agent PID at: /var/run/ambari-agent/ambari-agent.pid
Agent out at: /var/log/ambari-agent/ambari-agent.out
Agent log at: /var/log/ambari-agent/ambari-agent.log
[root@amb171hawq /]#

5. Refresh the Ambari web UI. The NameNode or DataNode should no longer be reported as "Stopped".
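As an alternative to editing the PID file by hand in Steps 2 and 3, the replacement can be sketched as a small shell function. This is an illustrative sketch under assumptions: update_pid_file is a hypothetical name, and the pgrep pattern in the usage comment assumes the NameNode runs as the hdfs user with the -Dproc_namenode flag shown in Step 1. The function refuses anything that is not a single numeric PID, which guards against writing an empty value or several PIDs into the file.

```shell
#!/bin/sh
# update_pid_file PIDFILE NEW_PID
# Overwrites PIDFILE with NEW_PID after checking that NEW_PID is one
# purely numeric value. (Hypothetical helper, for illustration only.)
update_pid_file() {
    pid_file=$1
    new_pid=$2

    # Reject empty input and anything containing a non-digit
    # (including the newline produced by multiple pgrep matches).
    case $new_pid in
        ''|*[!0-9]*)
            echo "refusing to write: '$new_pid' is not a single numeric PID" >&2
            return 1
            ;;
    esac

    # Report the stale value, if any, before replacing it.
    if [ -r "$pid_file" ]; then
        echo "replacing $(tr -d '[:space:]' < "$pid_file") with $new_pid"
    fi

    # Redirecting into the existing file keeps its owner and mode.
    echo "$new_pid" > "$pid_file"
}

# Example usage as root (assumed pattern; adjust for DataNode):
#   update_pid_file /var/run/hadoop/hdfs/hadoop-hdfs-namenode.pid \
#       "$(pgrep -u hdfs -f proc_namenode)"
#   ambari-agent restart
```

The Ambari agent still needs to be restarted afterwards, as in Step 4.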

  
