Adjusting HDFS Replication Factor Per File
HDFS uses the dfs.replication property in hdfs-site.xml to set the cluster-wide default replication factor, applied to files at creation time. However, you can override this on a per-file or per-directory basis using the hdfs dfs -setrep command — useful for frequently accessed “hot” files that need higher availability or read throughput.
Basic syntax
hdfs dfs -setrep [-R] [-w] <numReplicas> <path>
Setting replication for a single file
To increase replication for a specific file to 10 copies:
hdfs dfs -setrep -w 10 /path/to/file
The -w flag tells the command to wait until replication is complete. Without it, the command returns immediately and replication happens asynchronously in the background. This is important for production workloads where you need confirmation that the data is actually replicated before proceeding.
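If you skip -w, you can still confirm completion from outside by polling fsck until no blocks for the path are under-replicated — roughly the condition -w waits on. Below is a minimal Python sketch of that loop. It assumes the `hdfs` CLI is on PATH and that fsck prints its standard summary line `Under-replicated blocks: N (...)`; the helper names are hypothetical, and `poll_fn` is injectable so the loop can be exercised without a cluster.

```python
import re
import subprocess
import time

def under_replicated(path):
    """Count under-replicated blocks for `path` by parsing `hdfs fsck` output.
    Assumes the `hdfs` CLI is on PATH and fsck prints a summary line like
    'Under-replicated blocks:   3 (...)'."""
    out = subprocess.run(["hdfs", "fsck", path],
                         capture_output=True, text=True, check=True).stdout
    m = re.search(r"Under-replicated blocks:\s*(\d+)", out)
    return int(m.group(1)) if m else 0

def wait_until_replicated(path, poll_fn=under_replicated, timeout=600, interval=10):
    """Poll until no blocks are under-replicated -- roughly what -w waits for.
    `poll_fn` is injectable so the loop can be tested without a cluster."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        if poll_fn(path) == 0:
            return True
        time.sleep(interval)
    return False
```

Injecting `poll_fn` also lets you reuse the same loop against other health signals (e.g. missing replicas) without changing the waiting logic.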
Setting replication recursively for a directory
To apply a new replication factor to all files under a directory tree:
hdfs dfs -setrep -R -w 10 /path/to/dir/
When the path is a directory, setrep recursively changes the replication factor of every file under it and its subdirectories; directories themselves carry no replication factor. In recent Hadoop releases the -R flag is accepted only for backwards compatibility and has no effect.
Important considerations
Wait time with -w flag: The -w flag can take a considerable amount of time if you’re replicating large files or many files at once. The NameNode needs to send block replication commands to DataNodes, and those nodes need to copy the blocks across the network. Monitor your NameNode logs if operations hang unexpectedly.
Reducing replication: You can also lower the replication factor with setrep. When you reduce replicas, HDFS will delete excess copies, freeing up cluster storage. For example:
hdfs dfs -setrep -w 2 /path/to/file
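The storage effect of lowering the factor is simple arithmetic: physical footprint is logical size times the replication factor. A quick back-of-the-envelope helper (plain Python, no cluster needed):

```python
def raw_footprint(logical_bytes, replication):
    """Physical bytes consumed across the cluster for one file."""
    return logical_bytes * replication

def freed_by_setrep(logical_bytes, old_repl, new_repl):
    """Raw storage released (negative means extra storage consumed)."""
    return raw_footprint(logical_bytes, old_repl) - raw_footprint(logical_bytes, new_repl)

one_tib = 1 << 40
print(freed_by_setrep(one_tib, 3, 2))  # -> 1099511627776: one full logical copy released
```

So dropping a 1 TiB file from 3 to 2 replicas returns 1 TiB of raw capacity to the cluster; raising it from 3 to 5 would cost 2 TiB more.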
Global default vs. setrep: Changing dfs.replication in hdfs-site.xml only affects files created after the change — existing files keep the per-file factor stored in their metadata. By contrast, setrep updates the file’s metadata immediately, and the NameNode then schedules replica additions or deletions for the existing blocks in the background; that background work is exactly what the -w flag waits on.
Checking current replication: View the current replication factor of a file with:
hdfs dfs -stat %r /path/to/file
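In scripts it is handy to read that value programmatically. A small sketch, assuming the `hdfs` CLI is on PATH (the wrapper function names are my own, not a Hadoop API); `-stat %r` prints just the factor followed by a newline:

```python
import subprocess

def parse_stat_output(raw):
    """`hdfs dfs -stat %r` prints the factor followed by a newline, e.g. '3\n'."""
    return int(raw.strip())

def replication_factor(path):
    """Return the replication factor recorded for `path` in NameNode metadata.
    Thin wrapper over `hdfs dfs -stat %r`; assumes the CLI is on PATH."""
    out = subprocess.run(["hdfs", "dfs", "-stat", "%r", path],
                         capture_output=True, text=True, check=True)
    return parse_stat_output(out.stdout)
```

Note this reads the target factor stored in metadata, which updates as soon as setrep runs; it does not prove the physical copies exist yet — use fsck for that.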
Practical example
Suppose you have a frequently queried Hive table that’s accessed by 50+ analytic jobs daily. Increase its replication to improve read throughput:
hdfs dfs -setrep -w 5 /warehouse/tablespace/external/hive/analytics_db.db/popular_table/
Check that the replication completed:
hdfs fsck /warehouse/tablespace/external/hive/analytics_db.db/popular_table/ | grep -i "replica"
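Rather than eyeballing the grep output, you can pull the replication-related fields out of the fsck summary with a few regexes. A sketch under the assumption that the report contains Hadoop’s usual summary lines (`Default replication factor`, `Average block replication`, `Under-replicated blocks`); the sample text below is illustrative, not real cluster output:

```python
import re

def fsck_replication_summary(report):
    """Extract the replication-related fields from an `hdfs fsck` report."""
    fields = {
        "default_factor": r"Default replication factor:\s*(\d+)",
        "average": r"Average block replication:\s*([\d.]+)",
        "under_replicated": r"Under-replicated blocks:\s*(\d+)",
    }
    return {name: (float(m.group(1)) if m else None)
            for name, pattern in fields.items()
            for m in [re.search(pattern, report)]}

sample = """\
 Total blocks (validated):      8 (avg. block size 134217728 B)
 Minimally replicated blocks:   8 (100.0 %)
 Under-replicated blocks:       0 (0.0 %)
 Default replication factor:    3
 Average block replication:     5.0
"""
print(fsck_replication_summary(sample))
```

An average block replication at (or near) the target with zero under-replicated blocks confirms the setrep has fully taken effect.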
Return codes
Returns 0 on success and -1 on error (e.g., file not found, invalid replication factor, insufficient DataNodes).
If you set a replication factor higher than the number of DataNodes in your cluster, HDFS records the requested factor in metadata but can place at most one replica per node, so the blocks are reported as under-replicated until more DataNodes join.
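A sanity check before raising the factor is to cap the request at the live-node count. The sketch below parses the `Live datanodes (N):` header that `hdfs dfsadmin -report` prints; the function names are hypothetical and the `report` argument is injectable for testing:

```python
import re
import subprocess

def live_datanodes(report=None):
    """Count live DataNodes from `hdfs dfsadmin -report` output, which
    contains a header line like 'Live datanodes (4):'. Pass `report`
    directly for testing; otherwise the CLI is invoked."""
    if report is None:
        report = subprocess.run(["hdfs", "dfsadmin", "-report"],
                                capture_output=True, text=True, check=True).stdout
    m = re.search(r"Live datanodes \((\d+)\)", report)
    return int(m.group(1)) if m else 0

def effective_replication(requested, report=None):
    """HDFS places at most one replica per DataNode, so the achievable
    factor is capped by the live-node count."""
    return min(requested, live_datanodes(report))
```

Running the clamp before setrep avoids leaving the path permanently flagged as under-replicated on a small cluster.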
