How to set the replication factor for a single file uploaded with the `hdfs dfs -put` command in HDFS?

When uploading a file to HDFS with the hdfs dfs -put command, how can I set a replication factor for that file that differs from the global one?

For example, suppose HDFS's global replication factor is 3. For some temporary files, I would like to store just one copy to speed up uploads and save disk space.

By default, files uploaded with hdfs dfs -put take their replication factor from the dfs.replication property in hdfs-site.xml.

The hdfs command lets you override configuration properties for a single invocation with the -D option.

Hence, to store a file with only one replica, you can use the following command.

hdfs dfs -Ddfs.replication=1 -put /path/to/local/file /path/to/hdfs/dir
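
If you do this often, the command can be wrapped in a small shell function. The following is only a sketch: the function name put_with_replication and the HDFS_CMD variable are hypothetical names introduced here for illustration, not part of Hadoop. HDFS_CMD defaults to the real hdfs binary and can be set to echo for a dry run.

```shell
# Hypothetical helper (not part of Hadoop): upload a local file with a
# per-file replication factor.
#   $1 = number of replicas, $2 = local source path, $3 = HDFS destination
# HDFS_CMD is an assumed override hook; it defaults to the real `hdfs` binary.
put_with_replication() {
    # -D is placed before -put, as in the command above, so that it is
    # treated as a property override rather than an argument to -put.
    "${HDFS_CMD:-hdfs}" dfs -Ddfs.replication="$1" -put "$2" "$3"
}

# Example usage (paths are placeholders):
#   put_with_replication 1 /path/to/local/file /path/to/hdfs/dir
# Dry run that only prints the arguments that would be passed to hdfs:
#   HDFS_CMD=echo put_with_replication 1 /path/to/local/file /path/to/hdfs/dir
```

After the upload, hdfs dfs -stat %r on the uploaded file should print its replication factor, and hdfs dfs -setrep can change the factor of a file that is already in HDFS.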

Eric Ma

Eric is a systems guy. Eric is interested in building high-performance and scalable distributed systems and related technologies. The views or opinions expressed here are solely Eric's own and do not necessarily represent those of any third parties.
