How to check the replication factor of a file in HDFS?

A related question: how do you find the replication factors of files in an HDFS cluster?

Method 1: List the file with the hdfs dfs -ls command.

The second column of the output shows the replication factor of the file (for a directory, this column shows - instead).

For example,

$ hdfs dfs -ls /usr/GroupStorage/data1/out.txt
-rw-r--r--   3 hadoop zma 11906625598 2014-10-22 18:35 /usr/GroupStorage/data1/out.txt

The replication factor of out.txt is 3.
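The same second column can be used to filter a recursive listing by replication factor. The following is a minimal sketch, not a definitive tool: the function name files_with_replication is hypothetical, and it assumes the 8-column layout shown in the example above (permissions, replication, owner, group, size, date, time, path).

```shell
# Hypothetical helper: read `hdfs dfs -ls` output on stdin and print the
# full path of every file whose replication factor equals the first argument.
# Assumes the 8-column listing layout shown above.
files_with_replication() {
    awk -v rf="$1" '
        # Keep only regular-file lines (permissions start with "-");
        # this skips directories and the "Found N items" header.
        $1 ~ /^-/ && NF >= 8 && $2 == rf {
            path = $8
            # Re-join paths that contain spaces (runs of spaces collapse to one).
            for (i = 9; i <= NF; i++) path = path " " $i
            print path
        }'
}
```

It could be used as follows, where -R makes the listing recursive:

$ hdfs dfs -ls -R /usr/GroupStorage | files_with_replication 1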

Method 2: Get the replication factor using the hdfs dfs -stat command with the %r format specifier.

Using the above file as an example:

$ hdfs dfs -stat %r /usr/GroupStorage/data1/out.txt

It will print 3.
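To check several files in one go, the stat call can be wrapped in a small loop. This is only a sketch under stated assumptions: the function name print_replication is hypothetical, and it assumes the hdfs client is on the PATH.

```shell
# Hypothetical helper: print "<replication factor> <path>" for each
# HDFS path given as an argument, using the %r format of -stat.
print_replication() {
    for hdfs_path in "$@"; do
        # Skip paths that stat cannot read (e.g. missing files).
        rf=$(hdfs dfs -stat %r "$hdfs_path") || continue
        printf '%s %s\n' "$rf" "$hdfs_path"
    done
}
```

For the file used above, the call would look like this:

$ print_replication /usr/GroupStorage/data1/out.txt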

Eric Ma

Eric is a systems guy. Eric is interested in building high-performance and scalable distributed systems and related technologies. The views or opinions expressed here are solely Eric's own and do not necessarily represent those of any third parties.

2 comments:

  1. Hello Eric,

    I want to find all the files that have a replication factor of 1 and change it to 3.
    I am unable to get the complete paths of these files and directories, so I cannot change it. Is there a way to get a list (including the complete path) of all the files with RF 1, so that I can change their replication to 3?

    Regards
    Wert.
