HDFS sets the global default replication factor for blocks through the “dfs.replication” property in hdfs-site.xml.
However, some “hot” files are accessed by many nodes. How can you increase the replication factor (the number of copies of each block) for just these files in HDFS?
You can set the replication factor of a specific file to 10:
hdfs dfs -setrep -w 10 /path/to/file
You can also set the replication factor recursively for all files under a directory:
hdfs dfs -setrep -R -w 10 /path/to/dir/
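To confirm that the change was recorded, one option is to read the replication factor back with hdfs dfs -stat. The following is a minimal sketch, assuming the %r format specifier is supported by your Hadoop version; check_rep is a hypothetical helper name, not part of Hadoop.

```shell
# check_rep is a hypothetical helper, not a Hadoop command.
# Usage: check_rep <expected-replicas> <path>
check_rep() {
    expected="$1"
    path="$2"
    # %r asks -stat to print the replication factor recorded for the file
    actual="$(hdfs dfs -stat %r "$path")" || return 1
    if [ "$actual" = "$expected" ]; then
        echo "OK: $path has replication $actual"
    else
        echo "MISMATCH: $path has replication $actual, expected $expected"
        return 1
    fi
}
```

For example, check_rep 10 /path/to/file could be run after the setrep command above. The value reported is the factor stored in the file's metadata, so if -w was not used the blocks themselves may still be in the process of being copied.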
Usage: hdfs dfs -setrep [-R] [-w] <numReplicas> <path>
Changes the replication factor of a file. If path is a directory then the command recursively changes the replication factor of all files under the directory tree rooted at path.
The -w flag requests that the command wait for the replication to complete. This can potentially take a very long time.
The -R flag is accepted for backwards compatibility. It has no effect.
Example:
hdfs dfs -setrep -w 3 /user/hadoop/dir1
Exit code: returns 0 on success and -1 on error.
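In a script, the exit status can be used to detect a failed setrep. A shell typically reports an exit value of -1 as a non-zero status (often 255), so the sketch below tests only for zero versus non-zero rather than for a specific number. set_rep_checked is a hypothetical wrapper name used for illustration.

```shell
# set_rep_checked is a hypothetical wrapper, not a Hadoop command.
# Usage: set_rep_checked <numReplicas> <path>
set_rep_checked() {
    if hdfs dfs -setrep -w "$1" "$2"; then
        echo "replication of $2 set to $1"
    else
        # $? here still holds the exit status of the failed hdfs command
        status=$?
        echo "setrep failed for $2 (exit status $status)" >&2
        return 1
    fi
}
```

A call such as set_rep_checked 3 /user/hadoop/dir1 then either prints a confirmation or returns non-zero, so the caller can stop or retry.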