Which HDFS Filesystem Operations Are Atomic?
Atomicity is fundamental to distributed filesystems. When multiple processes across a cluster access the same files, operations must either complete fully or not at all—no partial states. HDFS provides specific atomic guarantees that applications can rely on for implementing distributed locks and coordinating access patterns.
What HDFS Guarantees as Atomic
As of Hadoop 3.2.1 and later releases, the Hadoop Compatible FileSystem specification defines these atomic operations:
- File creation — When `overwrite=false`, the check-and-create operation is atomic: the filesystem checks for existence and creates the file as a single indivisible operation. If two processes attempt to create the same file simultaneously, exactly one succeeds.
- File deletion — Removing a file is atomic.
- File rename — Renaming a file is atomic.
- Directory rename — Renaming a directory is atomic.
- Single directory creation — `mkdir()` for a single directory is atomic. Note that `mkdirs()` (recursive creation) is NOT atomic.
- Recursive directory deletion — HDFS offers atomic recursive deletion, though the Hadoop filesystem contract does not guarantee it; other filesystems (including the local FS) do not provide this guarantee.
All other operations carry no atomicity guarantees.
Practical Implications
The atomic file creation operation is the most commonly used for distributed coordination. Here’s a typical locking pattern:
```java
FileSystem fs = FileSystem.get(configuration);
Path lockFile = new Path("/locks/mylock");
boolean acquired = false;
try {
    fs.create(lockFile, false).close(); // Atomic create with overwrite=false
    acquired = true;                    // Lock acquired
    doWork();
} catch (FileAlreadyExistsException e) {
    // Another process holds the lock
} finally {
    if (acquired) {
        fs.delete(lockFile, false);     // Release only a lock we own
    }
}
```
With overwrite=false, the create operation fails if the file already exists, making this race-condition-free.
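The same check-and-create semantics exist in `java.nio.file` on a local filesystem, which makes the pattern easy to experiment with outside a cluster. A minimal sketch (class and method names are illustrative, not part of any HDFS API): `Files.createFile` fails with `FileAlreadyExistsException` if the path exists, just like `create(path, false)`.

```java
import java.io.IOException;
import java.nio.file.FileAlreadyExistsException;
import java.nio.file.Files;
import java.nio.file.Path;

public class CreateLock {
    // Try to acquire the lock; true on success, false if someone else holds it.
    static boolean tryLock(Path lockFile) throws IOException {
        try {
            Files.createFile(lockFile); // atomic check-and-create
            return true;
        } catch (FileAlreadyExistsException e) {
            return false;               // lost the race: lock already held
        }
    }

    public static void main(String[] args) throws IOException {
        Path lockFile = Files.createTempDirectory("locks").resolve("mylock");
        System.out.println(tryLock(lockFile)); // first caller wins: true
        System.out.println(tryLock(lockFile)); // second caller fails: false
        Files.delete(lockFile);                // release the lock
    }
}
```

Note that, unlike the HDFS version, this only coordinates processes on one machine; it is a sandbox for the pattern, not a distributed lock.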
Limitations and Workarounds
Be aware of these constraints:
- Lack of atomic read-modify-write — HDFS does not provide atomic compare-and-swap or read-modify-write operations. If your application needs to atomically update file contents, you must use external coordination (like ZooKeeper) or leverage the rename operation.
- Directory operations are limited — Only single `mkdir()` is atomic. Creating nested directories (`mkdirs()`) is not. If you need atomic nested directory creation, create each level separately and handle races.
- Rename semantics vary — HDFS renames are atomic, but the behavior when the destination exists depends on configuration. Test your Hadoop version's behavior explicitly.
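The level-by-level approach in the second point can be sketched as follows, here against the local filesystem with `java.nio` (the HDFS `FileSystem` version is analogous; the class and method names are made up for illustration). Each level is one atomic `createDirectory`, and losing a race to another process is treated as success:

```java
import java.io.IOException;
import java.nio.file.FileAlreadyExistsException;
import java.nio.file.Files;
import java.nio.file.Path;

public class SafeMkdirs {
    // Create each path level with a single atomic mkdir, tolerating races
    // where another process creates the same level first.
    static void mkdirsRaceSafe(Path dir) throws IOException {
        if (dir == null || Files.isDirectory(dir)) {
            return;                          // nothing to do at this level
        }
        mkdirsRaceSafe(dir.getParent());     // ensure the parent exists first
        try {
            Files.createDirectory(dir);      // one atomic mkdir per level
        } catch (FileAlreadyExistsException e) {
            // Another process won the race; fine as long as what exists
            // is actually a directory and not a file.
            if (!Files.isDirectory(dir)) {
                throw e;
            }
        }
    }

    public static void main(String[] args) throws IOException {
        Path base = Files.createTempDirectory("demo");
        Path nested = base.resolve("a").resolve("b").resolve("c");
        mkdirsRaceSafe(nested);
        System.out.println(Files.isDirectory(nested)); // true
    }
}
```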
Using Rename for Atomic Writes
A common pattern for atomic file writes leverages the atomic rename:
```shell
# Write to a temporary file
hdfs dfs -put data.txt /tmp/data.txt.tmp
# Atomically move it to the final location
hdfs dfs -mv /tmp/data.txt.tmp /data/final/data.txt
```
This ensures readers never see partial or corrupted data—they see either the old version or the new version, never an intermediate state.
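The same pattern can be sketched in Java. This version uses `java.nio` against a local filesystem purely for illustration (on HDFS you would write via `FileSystem.create` on a temporary path, then call `rename`); the class and method names are invented for the example:

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

public class AtomicWrite {
    // Write the full contents to a temp file in the same directory,
    // then atomically move it over the final name. Readers of `target`
    // only ever see a complete old or new version, never a partial write.
    static void writeAtomically(Path target, byte[] data) throws IOException {
        Path tmp = target.resolveSibling(target.getFileName() + ".tmp");
        Files.write(tmp, data); // partial state is only ever visible at tmp
        // On POSIX filesystems an existing target is replaced atomically;
        // with ATOMIC_MOVE the exact replace behavior is platform-dependent.
        Files.move(tmp, target, StandardCopyOption.ATOMIC_MOVE);
    }

    public static void main(String[] args) throws IOException {
        Path target = Files.createTempDirectory("demo").resolve("data.txt");
        writeAtomically(target, "hello".getBytes());
        System.out.println(new String(Files.readAllBytes(target))); // hello
    }
}
```

Keeping the temporary file in the same directory (or at least the same filesystem) matters: a rename across filesystems degrades to copy-plus-delete and is no longer atomic.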
Coordination Beyond HDFS Atomicity
For applications requiring stronger guarantees (like distributed transactions or multi-file consistency), rely on external systems:
- ZooKeeper — Use for distributed locks, leader election, and coordination between processes.
- Consensus protocols — Build on Raft or Paxos implementations for fault-tolerant decision-making.
- Application-level versioning — Maintain explicit versions or timestamps to detect stale reads.
HDFS atomicity is a building block, not a complete solution for distributed consistency. Design your application accordingly.
