An issue with Hadoop's dfs.replication
Today, while checking HDFS with the hadoop fsck / command, I got the following output:
.............................................
/user/hadoop/.staging/job_1381991904684_0036/libjars/zookeeper-3.4.5-cdh4.3.0.jar: Under replicated BP-2044520431-132.35.141.65-1381473011645:blk_-7907774648029476743_40033. Target Replicas is 10 but found 4 replica(s).
...................................................
Status: HEALTHY
Total size: 4583923103 B
Total dirs: 2807
Total files: 11151 (Files currently being written: 4)
Total blocks (validated): 11165 (avg. block size 410561 B)
Minimally replicated blocks: 11165 (100.0 %)
Over-replicated blocks: 0 (0.0 %)
Under-replicated blocks: 26 (0.23287058 %)
Mis-replicated blocks: 0 (0.0 %)
Default replication factor: 2
Average block replication: 2.0206
Corrupt blocks: 0
Missing replicas: 156 (0.68674064 %)
Number of data-nodes: 4
Number of racks: 2
FSCK ended at Fri Oct 18 09:58:48 CST 2013 in 1237 milliseconds
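To see exactly which files are affected before changing anything, filtering the fsck report is enough. A minimal one-liner sketch, assuming none of the reported paths themselves contain a ": " sequence:

hadoop fsck / | grep "Under replicated" | awk -F': ' '{print $1}' | sort -u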
The missing-replica rate is 0.68674064 %, and 26 blocks are below their target replica count; each of them is logged as "Target Replicas is 10 but found 4 replica(s)." Most likely this is because they are all job staging files (.staging/libjars): the MapReduce client uploads job resources with its own submit-time replication factor (mapred.submit.replication, or mapreduce.client.submit.file.replication on MRv2), which defaults to 10, and on a 4-datanode cluster a target of 10 can never be met. My cluster's default replication factor is 2, so the replication of these blocks can be changed with the following command:
hadoop fs -setrep -R 2 /user/hadoop/.staging
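If you would rather block until the replication factor actually matches, setrep also accepts a -w flag, and re-running fsck on the directory afterwards should report no under-replicated blocks. A sketch of that workflow:

hadoop fs -setrep -R -w 2 /user/hadoop/.staging
hadoop fsck /user/hadoop/.staging

To keep future jobs from triggering the same warning, the submit replication property mentioned above can also be lowered in mapred-site.xml to a value no larger than the number of datanodes.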
One thing to note: a file keeps whatever replication factor it was given when it was uploaded to HDFS. Changing dfs.replication afterwards only applies to files written from then on; it has no effect on files that are already stored, which is why setrep is needed here (see the sketch below).
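As a quick illustration of that behavior (the local file test.txt and the target path /tmp/test.txt are made up for this example), the replication factor can be set per upload with a -D generic option, and hadoop fs -ls shows the factor in its second column:

hadoop fs -D dfs.replication=3 -put test.txt /tmp/test.txt   # upload with 3 replicas
hadoop fs -ls /tmp/test.txt                                  # second column shows "3"
hadoop fs -setrep 2 /tmp/test.txt                            # the only way to change it afterwards

The first command stores test.txt with 3 replicas regardless of the cluster-wide default; lowering dfs.replication in hdfs-site.xml later would still leave it at 3, and only the final setrep actually changes it.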