Pivotal Knowledge Base

Secondary NameNode Checkpoint Error "Inconsistent Checkpoint Fields"

Environment

Product:  Pivotal HDP, Hortonworks HDP
Version:  2.x.x or higher
OS:       RHEL (any release)
Others:   May also be reported with Pivotal HDB (HAWQ) 2.x

Symptom

When the cluster is deployed without NameNode High Availability and a Secondary NameNode is used instead, Secondary NameNode checkpoints can fail with the following error:

Error Message:

java.io.IOException: Inconsistent checkpoint fields.
LV = -63 namespaceID = 713175558 cTime = 0 ; clusterId = CID-f2caf2b4-b3da-4a34-a62f-fea8badc724e ; blockpoolId = BP-716932340-192.168.1.35-1470839146699.
Expecting respectively: -63; 1785058013; 0; CID-4f13424a-2eb8-43d9-9ae1-9df54670f489; BP-1667795605-192.168.1.35-1465693123284.
        at org.apache.hadoop.hdfs.server.namenode.CheckpointSignature.validateStorageInfo(CheckpointSignature.java:134)
        at org.apache.hadoop.hdfs.server.namenode.SecondaryNameNode.doCheckpoint(SecondaryNameNode.java:531)
        at org.apache.hadoop.hdfs.server.namenode.SecondaryNameNode.doWork(SecondaryNameNode.java:395)
        at org.apache.hadoop.hdfs.server.namenode.SecondaryNameNode$1.run(SecondaryNameNode.java:361)
        at org.apache.hadoop.security.SecurityUtil.doAsLoginUserOrFatal(SecurityUtil.java:449)
        at org.apache.hadoop.hdfs.server.namenode.SecondaryNameNode.run(SecondaryNameNode.java:357)
        at java.lang.Thread.run(Thread.java:745)

Note: These errors can show up in either the Secondary NameNode's .log file or its .out file.
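
For example, on a typical HDP installation the following search locates the error in the Secondary NameNode's log directory. The directory and file-name pattern shown are the common HDP defaults and are illustrative only; adjust them to your installation:

# List the SNN .log and .out files that contain the checkpoint error:
grep -l "Inconsistent checkpoint fields" \
    /var/log/hadoop/hdfs/hadoop-hdfs-secondarynamenode-*.log \
    /var/log/hadoop/hdfs/hadoop-hdfs-secondarynamenode-*.out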

Cause

There are two common causes: a failed or improper upgrade procedure, or an incorrect ${dfs.namenode.checkpoint.dir}/current/VERSION file on the Secondary NameNode. In the second scenario, everything under the Secondary NameNode's ${dfs.namenode.checkpoint.dir} directory must be moved aside so that it is rebuilt and checkpointing works again.
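
To confirm the second cause, compare the VERSION file kept by the Secondary NameNode against the NameNode's own copy; the namespaceID, clusterID, cTime, and blockpoolID fields must match. A minimal check (the ${...} paths are placeholders for the directory values configured in hdfs-site.xml):

# On the NameNode host (the authoritative copy):
cat ${dfs.namenode.name.dir}/current/VERSION

# On the Secondary NameNode host (the copy validated at checkpoint time):
cat ${dfs.namenode.checkpoint.dir}/current/VERSION

# A mismatch in namespaceID, clusterID, or blockpoolID between the two files
# produces the "Inconsistent checkpoint fields" error shown above.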

Resolution

Follow these steps to resolve the issue (a consolidated command sketch of steps 5 through 11 follows the list):

  1. In the Ambari UI, select HDFS Service -> Configs
  2. Identify the directory values for the parameters dfs.namenode.checkpoint.dir and dfs.namenode.checkpoint.edits.dir
  3. Note the values down. NOTE: If dfs.namenode.checkpoint.edits.dir points to a different directory than dfs.namenode.checkpoint.dir, repeat steps 10 and 11 for that directory as well
  4. Stop all services except HDFS
  5. On the Primary NameNode host, put HDFS in SafeMode: hdfs dfsadmin -safemode enter
  6. On the same host, confirm SafeMode: hdfs dfsadmin -safemode get
  7. On the same host, checkpoint the namespace: hdfs dfsadmin -saveNamespace
  8. While still in SafeMode, shut down the remaining HDFS service(s)
  9. Log in to the Secondary NameNode host
  10. cd to the value of ${dfs.namenode.checkpoint.dir}
  11. mv current current.bad
  12. Start the HDFS service(s) only
  13. Wait for the HDFS services to come online
  14. Start the remaining Hadoop services
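
For reference, here is a consolidated sketch of steps 5 through 11. It assumes the dfsadmin commands are run as the hdfs superuser via sudo -u hdfs (typical on HDP, but this may differ on your cluster), and the checkpoint path is a placeholder for the value noted in step 3:

# Steps 5-7, on the Primary NameNode host:
sudo -u hdfs hdfs dfsadmin -safemode enter   # enter SafeMode
sudo -u hdfs hdfs dfsadmin -safemode get     # should report "Safe mode is ON"
sudo -u hdfs hdfs dfsadmin -saveNamespace    # checkpoint the namespace to disk

# Steps 10-11, on the Secondary NameNode host, after HDFS has been stopped:
cd ${dfs.namenode.checkpoint.dir}            # directory value noted in step 3
mv current current.bad                       # SNN rebuilds "current" at the next checkpoint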

 
