How to balance DataNode storage in HDFS?

As nodes are added to and removed from a Hadoop cluster, storage usage across DataNodes can become uneven: some DataNodes’ disks are almost full while others’ are almost empty.

How to balance data across DataNodes in HDFS?

Hadoop provides the balancer to redistribute the data.

A brief introduction to the balancer in Hadoop: balancer.

The design and discussion of the balancer in Hadoop: HADOOP-1652.

To start the balancer, run this command as the administrator: hadoop balancer
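As a rough sketch of how this might look in practice, the script below first prints per-DataNode usage and then runs the balancer. It uses the hdfs launcher, which on Hadoop 2 and later is the preferred form of the hadoop balancer command. The THRESHOLD variable name and the PATH check are illustrative assumptions, not part of the original instructions. The value 10 is the balancer's documented default threshold, in percent.

```shell
#!/bin/sh
# Assumption: the hdfs launcher is on PATH and this is run as the HDFS administrator.
# THRESHOLD is an illustrative variable name; 10 (percent) is the balancer's default.
THRESHOLD=10

if command -v hdfs >/dev/null 2>&1; then
    # Print capacity and "DFS Used%" for each DataNode to see how uneven usage is.
    hdfs dfsadmin -report

    # Move blocks until each DataNode's utilization is within THRESHOLD
    # percentage points of the cluster-wide average utilization.
    hdfs balancer -threshold "$THRESHOLD"
else
    echo "hdfs command not found; run this on a node with the Hadoop client installed" >&2
fi
```

A lower threshold gives a more even distribution but takes longer to reach. The balancer runs against a live cluster and can be stopped at any time with Ctrl-C.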

Eric Ma

