Pivotal Knowledge Base

"Default replication factor" & "Average block replication" in fsck output.

Environment

Product           Version
Pivotal Hadoop    2.x, 3.x
OS                RHEL 6.x

Symptom

After lowering the default replication factor (dfs.replication), the HDFS fsck output shows a discrepancy between "Default replication factor" and "Average block replication":

Status: HEALTHY
 Total size:    839729774 B
 Total dirs:    139
 Total files:   600
 Total symlinks:                0
 Total blocks (validated):      596 (avg. block size 1408942 B)
 Minimally replicated blocks:   596 (100.0 %)
 Over-replicated blocks:        0 (0.0 %)
 Under-replicated blocks:       0 (0.0 %)
 Mis-replicated blocks:         0 (0.0 %)
 Default replication factor:    2
 Average block replication:     3.0
 Corrupt blocks:                0
 Missing replicas:              0 (0.0 %)
 Number of data-nodes:          3
 Number of racks:               1

Cause

The "Default replication factor" defines the replication factor for new blocks. It does not change the replication factor for existing blocks.

When the HDFS cluster was first set up, the replication factor was 3, so every block written to HDFS was replicated three times and the "Average block replication" is close to 3. Once the "dfs.replication" parameter is changed to 2, new blocks written to HDFS are replicated only twice. As more blocks are added with the new replication factor, the average block replication gradually falls toward the dfs.replication value.
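The arithmetic behind this drift can be sketched as a weighted average. The following is a minimal illustration, not part of fsck itself: the function name and the counts of new blocks are hypothetical, and only the 596 existing blocks come from the output above.

```python
def average_replication(old_blocks, old_factor, new_blocks, new_factor):
    """Weighted average replication across old and new blocks (illustrative helper)."""
    total_replicas = old_blocks * old_factor + new_blocks * new_factor
    total_blocks = old_blocks + new_blocks
    return total_replicas / total_blocks

# 596 existing blocks at factor 3; the new block counts below are made-up values.
for new_blocks in (0, 596, 5364):
    avg = average_replication(596, 3, new_blocks, 2)
    print(new_blocks, round(avg, 2))
```

With no new blocks the average stays at 3.0; it only approaches 2 as blocks written at the new factor start to outnumber the old ones.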

Resolution

This discrepancy is expected and is not an issue.

To change the replication factor of files that already exist in HDFS (for example, with the hdfs dfs -setrep command), follow the related article on modifying the replication factor of existing files.

Once that is done, run the HDFS fsck again; the "Average block replication" should now match the "Default replication factor":

Status: HEALTHY
 Total size:    839729774 B
 Total dirs:    139
 Total files:   600
 Total symlinks:                0
 Total blocks (validated):      596 (avg. block size 1408942 B)
 Minimally replicated blocks:   596 (100.0 %)
 Over-replicated blocks:        0 (0.0 %)
 Under-replicated blocks:       0 (0.0 %)
 Mis-replicated blocks:         0 (0.0 %)
 Default replication factor:    2
 Average block replication:     2.0
 Corrupt blocks:                0
 Missing replicas:              0 (0.0 %)
 Number of data-nodes:          3
 Number of racks:               1
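To compare the two values without reading the report by eye, the summary lines can be parsed from saved fsck output. This is a rough sketch that assumes the line labels appear exactly as in the output above; the function name and the sample text are illustrative only.

```python
import re

def replication_summary(fsck_text):
    """Return (default_factor, average_replication) parsed from fsck summary text."""
    default_match = re.search(r"Default replication factor:\s*(\d+)", fsck_text)
    average_match = re.search(r"Average block replication:\s*([\d.]+)", fsck_text)
    if default_match is None or average_match is None:
        raise ValueError("fsck summary lines not found")
    return int(default_match.group(1)), float(average_match.group(1))

# Sample text copied from the fsck summary shown in the Symptom section.
sample = """\
 Default replication factor:    2
 Average block replication:     3.0
"""
default_factor, average = replication_summary(sample)
if average > default_factor:
    print("Existing blocks are still replicated above the default")
```

A gap between the two numbers is expected after lowering dfs.replication and does not by itself indicate a problem.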

Additional Information:

How to check the replication factor of the files in HDFS.
