Cluster

QA

How to install and configure a MySQL cluster on CentOS/RHEL 6.3?
By Q A | Mar 24, 2018 | Oct 7, 2019

Is there a good tutorial on how to install and configure a MySQL cluster on CentOS/RHEL 6.3? Check these posts: Installing MySQL Cluster on CentOS 6.3 and Configuring the MySQL Cluster. General tutorials: MySQL Cluster Installation and CentOS 6: Install MySQL Cluster – The Simple Way.

Programming | Tutorial | Web

Quartz Implementation in Java
By Aaronjacobson | Aug 16, 2016 | Aug 30, 2020

In this post, India-based Java development experts explain the concept of Quartz. You will also learn how to set up Quartz. You can ask the experts if anything is unclear. Technology: Quartz is an open-source Java technology for scheduling background jobs. If we want to execute the task…

Tutorial

Three Methods of Executing Commands on Many Nodes in Parallel via SSH on Linux
By Eric Ma | Jan 3, 2016 | Aug 30, 2020

It is common to execute commands on many nodes/hosts via SSH for managing a cluster of Linux servers. On Linux, there are many choices for this task. Generally, to run commands on many nodes, there are two modes: serial mode and parallel mode. In serial mode, the command is executed on the node one by…
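
The two modes can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not one of the three methods from the post: the function names (`run_on_node`, `run_serial`, `run_parallel`) are hypothetical, and the sketch assumes key-based SSH login so that the `ssh` client never prompts for a password.

```python
import subprocess
from concurrent.futures import ThreadPoolExecutor


def run_on_node(node, command, timeout=30):
    """Run `command` on `node` via the ssh client; return (node, exit code, output)."""
    try:
        proc = subprocess.run(
            # BatchMode=yes makes ssh fail instead of prompting for a password.
            ["ssh", "-o", "BatchMode=yes", node, command],
            capture_output=True, text=True, timeout=timeout,
        )
    except (OSError, subprocess.TimeoutExpired) as exc:
        # ssh is missing, or the node did not answer within `timeout` seconds.
        return node, -1, str(exc)
    return node, proc.returncode, proc.stdout + proc.stderr


def run_serial(nodes, command, runner=run_on_node):
    """Serial mode: handle the nodes one after another."""
    return [runner(node, command) for node in nodes]


def run_parallel(nodes, command, runner=run_on_node, max_workers=16):
    """Parallel mode: handle nodes concurrently; results keep the input order."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(lambda node: runner(node, command), nodes))
```

For example, `run_parallel(["node1", "node2"], "uptime")` (hypothetical host names) returns one `(node, exit code, output)` tuple per node. Threads are enough here because each worker spends its time waiting on an external `ssh` process.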

Linux | Network | Tutorial

Lazy Linux Admins Going to Server Rooms Less: Forced Reboot, Auto Reboot after Kernel Panic and Email Notification after Reboot
By Eric Ma | Feb 2, 2015 | Aug 30, 2020

Having to go to the server room to reset servers is the biggest headache for admins managing a cluster of Linux servers at a remote site. Either you can ping the server but cannot ssh to it, or you cannot even ping it. There are various reasons that may cause a Linux…

Computing systems | Resource management | Storage systems | Systems | Tutorial

Hadoop Installation Tutorial (Hadoop 2.x)
By Eric Ma | Sep 14, 2014 | Dec 29, 2019

Hadoop 2, or YARN, is the new version of Hadoop. It adds the YARN resource manager to the HDFS and MapReduce components. Hadoop MapReduce is a programming model and software framework for writing applications; it is an open-source variant of the MapReduce initially designed and implemented by Google for processing and generating large data…

Linux | Network | Software | Tutorial

Improving ssh/scp Performance by Choosing Suitable Ciphers
By Eric Ma | Apr 24, 2014 | Aug 30, 2020

Update on Oct. 9, 2014: Be aware of the possible security problems of blowfish; using it is not recommended. Instead, you may consider ChaCha20, as suggested by Tony Arcieri. To use this with OpenSSH, you need to specify the Ciphers in your .ssh/config file as chacha20-poly1305@openssh.com possibly with another default…

Virtualization

Script: Shutting Down All Xen VMs on a Server
By Eric Ma | Jul 13, 2013 | Aug 30, 2020

Shutting down servers is a common operation when managing a cluster. However, if the server is configured as a Xen Dom0 and hosts Xen VMs (DomUs), the VMs should be shut down first to avoid data loss on these VMs. xm supports a -a option to shut down all VMs: # xm shutdown -a Add the -w…

Linux

Script: Checking Alive Servers from a Server List
By Eric Ma | Jul 13, 2013 | Aug 30, 2020

With a list of servers, it is common for one or more of them to be down or to have crashed. Many cluster management tools can detect whether servers are alive. However, this can easily be done with ping in a Bash script. I summarize the script that I use and share it here: check-alive-server.sh. Usage: ./check-alive-server.sh file Each…
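
The same idea can be sketched in Python. This is not the check-alive-server.sh script from the post, only an illustration of a ping-based check: the function names are hypothetical, the input format (one host per line, `#` starting a comment) is an assumption, and the `ping` flags assume the Linux iputils version.

```python
import subprocess


def is_alive(host, timeout_s=2):
    """Return True if `host` answers one ICMP echo request (Linux iputils flags)."""
    try:
        result = subprocess.run(
            # -c 1: send a single probe; -W: seconds to wait for the reply.
            ["ping", "-c", "1", "-W", str(timeout_s), host],
            stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL,
        )
    except OSError:
        return False  # the ping binary is missing or not executable
    return result.returncode == 0


def read_hosts(lines):
    """Yield host names from an iterable of lines, skipping blanks and # comments."""
    for line in lines:
        host = line.split("#", 1)[0].strip()
        if host:
            yield host


def check_hosts(hosts, probe=is_alive):
    """Split `hosts` into (alive, dead) lists according to `probe`."""
    alive, dead = [], []
    for host in hosts:
        (alive if probe(host) else dead).append(host)
    return alive, dead
```

Given a server list file, `check_hosts(read_hosts(open("servers.txt")))` (hypothetical file name) returns the reachable hosts and the unreachable ones. A host that drops ICMP traffic is reported as dead even if it is running, which is a limitation of any ping-based check.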

Linux | Network

How to Flush iptables on Fedora Linux
By Eric Ma | Jul 13, 2013 | Apr 1, 2020

iptables is a mechanism in the Linux kernel for port forwarding, NAT, firewalls, etc. In Linux distros such as Fedora, iptables is configured as a “strict” firewall that opens only a limited set of known ports, such as 22 for SSH. However, in some network environments, such as a private cluster, the nodes are trusted and…

Linux | Virtualization

Setting Up Ubuntu DomU on Xen: Ubuntu 10.10 on Fedora Xen Dom0
By Eric Ma | Jul 13, 2013 | Apr 1, 2020

This post introduces setting up an Ubuntu 10.10 DomU on top of a Fedora Xen Dom0. The process of setting up an Ubuntu 10.10 DomU is the same as in Setting Up Stable Xen DomU with Fedora: Unmodified Fedora 12 on top of Xenified Fedora 12 Dom0 with Xen 4.0. This post only shows the differences, which…

Linux

Linux Cluster Solutions
By Eric Ma | Jul 13, 2013 | Sep 27, 2014

This post summarises solutions for Linux cluster construction and management, such as unified account management, NFS home directories, and network configuration. The post is updated as new solutions are added to this site. Account and storage management: Unified Linux Login and Home Directory Using OpenLDAP and NFS/automount; Backup Linux Home Directory Using rsync…

Linux | Virtualization

Xen DomU’s I/O Performance of LVM and loopback Backed VBDs
By Eric Ma | Jul 13, 2013

This post lists benchmark results (using bonnie++) for the I/O performance of Xen LVM-backed and loopback-backed VBDs. Configuration of the machines: Dom0: VCPU: 2 (Intel(R) Xeon(R) CPU E5520 @ 2.27GHz); Memory: 2GB; Xen and Linux kernel: Xen 3.4.3 with Xenified 2.6.32.13 kernel. DomU: VCPU: 2; Memory: 2GB; Linux kernel: Fedora (2.6.32.19-163.fc12.x86_64). DomU’s profile: name="10.0.1.200" vcpus=2…

Virtualization

Unified Xen DomU configuration file
By Eric Ma | Jul 13, 2013 | Aug 23, 2020

Previously, we created a configuration file for each DomU virtual machine in our cluster. Most of the content of these configuration files is the same; the only differences are the name, memory size, and image file address. This method has several disadvantages: We must create and configure a new configuration file when creating…

Virtualization

Automatically Backing Up Xen File-backed DomU
By Eric Ma | Jul 13, 2013 | Aug 23, 2020

This post introduces a script for backing up file-backed Xen DomUs. The script can be adapted to similar platforms. In our cluster, virtual machines are stored under /lhome/xen/. The virtual machine with id vmid is stored in the directory vmvmid. The raw image disk file name can also be derived from vmid. Some more details…

Storage systems | Systems

Colossus: Successor to the Google File System (GFS)
By Eric Ma | Nov 29, 2012 | Aug 2, 2020

Colossus is the successor to the Google File System (GFS), as mentioned in the paper on Spanner at OSDI 2012. Colossus is also used by Spanner to store its tablets. Information about Colossus is scarce compared with GFS, which was published in a paper at SOSP 2003. There is still some information about Colossus…

Computing systems | Storage systems | Systems

Hadoop Installation Tutorial (Hadoop 1.x)
By Eric Ma | Oct 9, 2012 | Nov 28, 2020

Update: If you are new to Hadoop and trying to install it, please check the newer version: Hadoop Installation Tutorial (Hadoop 2.x). Hadoop mainly consists of two parts: Hadoop MapReduce and HDFS. Hadoop MapReduce is a programming model and software framework for writing applications; it is an open-source variant of MapReduce, which was initially designed…

Tutorial

Reading List for Distributed Systems and Cloud Computing
By Eric Ma | Sep 15, 2012 | Aug 30, 2020

Understanding the literature is usually the first step in doing research, and systems research on cloud computing is no exception. A reading list can help a lot for those who are just starting out in cloud computing research. Prof. Lin Gu, my PhD supervisor, compiled a reading list for systems research on cloud computing. The reading…

Tutorial

Hadoop Default Ports
By Eric Ma | Jan 15, 2012 | Mar 27, 2018

Hadoop’s namenode and datanodes expose a number of TCP ports that Hadoop’s daemons use to communicate with each other or to listen directly for users’ requests. Both Hadoop users and cluster administrators need this port information to write programs or configure firewalls/gateways accordingly. A post by Philip Zeyliger on Cloudera’s blog summarizes the…

Tutorial

Pitfalls and Lessons on Configuring and Tuning Hadoop
By Eric Ma | Apr 26, 2011 | Mar 27, 2018

This post lists pitfalls and lessons learned when configuring and tuning Hadoop. Hadoop with IPv6: Hadoop doesn’t support IPv6 currently (up to 0.20.2 and 0.21.0): Hadoop and IPv6. The performance of the cluster may suffer if IPv6 is turned on: mail archive. One good practice is to disable IPv6 on servers in the Hadoop…

Tutorial

Setting Up Standalone (Local) Hadoop
By Eric Ma | Apr 6, 2011 | Apr 5, 2016

Hadoop is designed to run on hundreds to thousands of computers inside a cluster. However, by default Hadoop is configured to run in a non-distributed mode, as a single Java process. This is especially useful for debugging, since distributed debugging is really a nightmare. This post introduces how to set up a standalone Hadoop environment…
