How to change a running HDFS cluster’s replication factor?

I have a running HDFS cluster storing lots of files, and I want to change its default replication factor.

How to change it? What will happen after it is changed?

For example, if I change it from 2 to 3, will HDFS automatically re-replicate the existing data blocks?

First, the replication factor is decided by the client that writes the file.

Second, the replication factor is a per-file setting.

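Hence, changing the default (the dfs.replication property in the client's hdfs-site.xml) only affects clients that use that configuration, and it takes effect only for files created afterwards. HDFS does not re-replicate existing files on its own when the default changes.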
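Because the value is chosen on the client side, a single upload can also carry its own replication factor without touching the cluster-wide default. The following is a minimal sketch, assuming the hdfs command is on the client's PATH; the helper name put_with_replication and the file paths are hypothetical and only for illustration. It builds the command line and prints it, without contacting a cluster.

```python
import shlex


def put_with_replication(local_path, hdfs_path, replication):
    """Build an `hdfs dfs -put` command that overrides dfs.replication
    for this one upload only (hypothetical helper, not part of Hadoop)."""
    if replication < 1:
        raise ValueError("replication must be at least 1")
    # -D sets a client-side configuration property for this invocation.
    return [
        "hdfs", "dfs",
        "-D", f"dfs.replication={replication}",
        "-put", local_path, hdfs_path,
    ]


# Example paths are placeholders; nothing is executed here.
cmd = put_with_replication("report.csv", "/data/report.csv", 3)
print(shlex.join(cmd))
```

Files written this way get 3 replicas regardless of the default in hdfs-site.xml, while files that already exist keep whatever factor they were created with.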

For existing files, you need to reset the replication factor manually, for example with hdfs dfs -setrep:

https://www.systutorials.com/qa/1225/how-to-change-number-of-replications-of-certain-files-in-hdfs
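As a rough illustration of the manual step, the sketch below wraps hdfs dfs -setrep from Python. It assumes the hdfs command is on the PATH and the cluster is reachable; the helper names setrep_command and set_replication and the /data path are hypothetical. When the path is a directory, -setrep applies to all files under it, and -w makes the command wait until the target replication is reached, which can take a long time on a large tree.

```python
import subprocess


def setrep_command(hdfs_path, replication, wait=False):
    """Build the `hdfs dfs -setrep` command line (hypothetical helper)."""
    if replication < 1:
        raise ValueError("replication must be at least 1")
    cmd = ["hdfs", "dfs", "-setrep"]
    if wait:
        # -w blocks until the files actually reach the requested factor.
        cmd.append("-w")
    cmd += [str(replication), hdfs_path]
    return cmd


def set_replication(hdfs_path, replication, wait=False):
    """Run the command; needs `hdfs` on PATH and a reachable cluster."""
    subprocess.run(setrep_command(hdfs_path, replication, wait), check=True)


# Usage (not executed here): raise everything under /data from 2 to 3 replicas.
# set_replication("/data", 3, wait=True)
```

After the factor is raised, the NameNode schedules the extra block copies in the background, so the additional replicas appear gradually rather than immediately.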

Eric Ma

Eric is a systems guy. Eric is interested in building high-performance and scalable distributed systems and related technologies. The views or opinions expressed here are solely Eric's own and do not necessarily represent those of any third parties.
