Pivotal Knowledge Base


How long does a datanode have to be offline before the data starts replicating?

Environment

Product      Version
Pivotal HD   3.x
HDFS         2.6.0

Purpose

This article explains how long a datanode has to be offline before its data blocks are re-replicated to other datanodes in the cluster. This is useful to know when replication needs to be avoided because of cluster load, capacity, or locality (for example, in the case of Pivotal HDB 1.x).

Procedure

By default, a datanode has to be unavailable (not sending a heartbeat to the namenode) for 10.5 minutes before the namenode marks it as dead and the blocks it held are re-replicated. The time, in milliseconds, before a datanode is marked as dead is:

(dfs.namenode.heartbeat.recheck-interval * 2) + (10 * 1000 * dfs.heartbeat.interval)

With the default values (a recheck interval of 300000 milliseconds and a heartbeat interval of 3 seconds), the formula gives:

(300000 * 2) + (10 * 1000 * 3) = 630000 milliseconds = 10.5 minutes
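
The arithmetic above can be checked with a short script. This is only a minimal sketch: the function name dead_node_timeout_ms is hypothetical, and the defaults are the values quoted in this article (recheck interval in milliseconds, heartbeat interval in seconds).

```python
def dead_node_timeout_ms(recheck_interval_ms=300_000, heartbeat_interval_s=3):
    """Time in milliseconds before the namenode marks a datanode as dead.

    Hypothetical helper that mirrors the formula in this article:
    (recheck interval * 2) + (10 * 1000 * heartbeat interval).
    """
    return 2 * recheck_interval_ms + 10 * 1000 * heartbeat_interval_s


timeout_ms = dead_node_timeout_ms()
# Convert milliseconds to minutes for readability.
print(timeout_ms, "ms =", timeout_ms / 60_000, "minutes")
# prints: 630000 ms = 10.5 minutes
```

Passing other values shows how each setting affects the result. Doubling the recheck interval to 600000, for example, gives 1230000 milliseconds (20.5 minutes).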

To increase the time before replication starts, set dfs.namenode.heartbeat.recheck-interval to a higher value (in milliseconds). Add the property and its value under AMBARI / HDFS / Configurations / Custom hdfs-site.
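
To choose a value for dfs.namenode.heartbeat.recheck-interval, the formula can be solved for the recheck interval given a target delay. The sketch below assumes the heartbeat interval stays at its default of 3 seconds. The function name recheck_interval_for is hypothetical, and the 30-minute target is only an illustration.

```python
import math


def recheck_interval_for(target_minutes, heartbeat_interval_s=3):
    """Recheck interval (ms) needed to reach a target dead-node timeout.

    Hypothetical helper: inverts the article's formula
    timeout = (recheck * 2) + (10 * 1000 * heartbeat).
    """
    target_ms = target_minutes * 60_000
    heartbeat_part_ms = 10 * 1000 * heartbeat_interval_s
    if target_ms <= heartbeat_part_ms:
        raise ValueError("target is too small for this heartbeat interval")
    # Round up so the resulting timeout is never shorter than the target.
    return math.ceil((target_ms - heartbeat_part_ms) / 2)


# Example: wait 30 minutes before a datanode is marked as dead.
print(recheck_interval_for(30))  # prints: 885000
```

The printed number is the value to enter for dfs.namenode.heartbeat.recheck-interval in Custom hdfs-site.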
