
Setting Up Standalone (Local) Hadoop

By Eric Ma | Apr 6, 2011 | Apr 5, 2016

Hadoop is designed to run on [[hadoop-installation-tutorial|hundreds to thousands of computers]] in a cluster. By default, however, Hadoop is configured to run in non-distributed mode, as a single Java process. This is especially useful for debugging, since debugging a distributed system is a real nightmare. This post shows how to set up a standalone Hadoop environment.

1. Hadoop package and software installation

Follow the instructions in part “1. Install needed packages” of the [[hadoop-installation-tutorial|Hadoop Installation Tutorial]] to install the required packages. Then follow “4. Hadoop Configurations” to configure hadoop-env.sh (this file only).
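For standalone mode, the setting in hadoop-env.sh that usually matters is JAVA_HOME. A minimal sketch of the relevant fragment, assuming the file sits at conf/hadoop-env.sh (matching the conf/ directory used later in this post); the JDK path is only a placeholder, not a value from the original tutorial:

```shell
# conf/hadoop-env.sh (fragment)
# Placeholder path (assumption): replace it with the JDK location on your machine.
export JAVA_HOME=/path/to/your/jdk
```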

2. Just run Hadoop!

Simply run Hadoop jobs whose input and output are in local directories. A simple example shows how to start a Hadoop job.

The example finds and displays every match of the given regular expression. Output is written to the given output directory.

$ mkdir input
$ cp conf/*.xml input
$ bin/hadoop jar hadoop-mapred-examples-0.21.0.jar grep input output '[a-z.]+'
$ cat output/*
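As a rough cross-check of what the job computes, the same counting can be approximated with standard Unix tools. This is only a sketch: count_matches is a hypothetical helper name, not part of Hadoop, and the column layout of its output will not match the job's output exactly.

```shell
#!/bin/sh
# Rough local approximation of the Hadoop grep example, for sanity-checking results.
# count_matches is a hypothetical helper name, not part of Hadoop.
count_matches() {
    dir=$1      # directory holding the input files
    pattern=$2  # extended regular expression to search for
    # -o prints each match on its own line, -h omits file names, -E enables ERE.
    # LC_ALL=C keeps a range such as [a-z] limited to ASCII letters.
    LC_ALL=C grep -ohE -- "$pattern" "$dir"/* | sort | uniq -c | sort -rn
}

# Usage: count_matches input '[a-z.]+'
```

Each output line holds a count followed by the matched string, with the most frequent matches first.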

The name of the examples jar (hadoop-mapred-examples-0.21.0.jar in the command above) may differ depending on the version of your Hadoop distribution.
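One way to avoid hard-coding the jar name is to look it up with a glob. The sketch below assumes the examples jar sits in the given directory and matches hadoop-*examples*.jar; both the helper name find_examples_jar and the glob are illustrative assumptions, so adjust them to your distribution's layout.

```shell
#!/bin/sh
# Sketch: print the first examples jar found in a directory.
# find_examples_jar is a hypothetical helper; the glob is an assumption about jar naming.
find_examples_jar() {
    dir=${1:-.}
    for jar in "$dir"/hadoop-*examples*.jar; do
        # An unmatched glob stays literal, so check that the file really exists.
        if [ -f "$jar" ]; then
            printf '%s\n' "$jar"
            return 0
        fi
    done
    return 1
}

# Usage (run from the Hadoop directory):
#   jar=$(find_examples_jar .) || { echo "no examples jar found" >&2; exit 1; }
#   rm -rf output   # Hadoop refuses to write into an existing output directory
#   bin/hadoop jar "$jar" grep input output '[a-z.]+'
```

Removing the old output directory first matters when re-running the example, because the job fails if the output directory already exists.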

Simple, isn't it? Enjoy it, then go further and try the [[hadoop-installation-tutorial|Fully-distributed Hadoop Installation]].
