What’s the difference between Reliability, Durability, and Availability for data storage systems?

Reliability, durability, and availability are important concepts in distributed storage systems such as the Hadoop Distributed File System (HDFS) and the Google File System (GFS).

Answer from http://www.quora.com/Whats-the-difference-between-Reliability-Durability-and-Availability-for-data-storage-system

The difference between durability and availability is fairly simple. Durability is about what happens when all power goes out everywhere. Has all data been written to stable storage that doesn’t require power (e.g. disk/flash), in a form that allows it to be brought back and used? Availability is about what happens when there’s a partial failure – a disk, a node, a network. Does the system continue to operate and provide the same services it originally did?
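As a rough illustration of the durability question above ("has all data been written to stable storage?"), here is a minimal Python sketch of a write that asks the operating system to push data to the device before reporting success. The function name durable_write and the write-to-temp-file-then-rename pattern are assumptions chosen for this example and are not part of the original answer. Whether fsync truly reaches non-volatile media also depends on the file system and on the drive's write cache.

```python
import os
import tempfile


def durable_write(path, data):
    """Write bytes to `path` and flush them to stable storage (hypothetical helper)."""
    directory = os.path.dirname(os.path.abspath(path))
    # Write to a temporary file in the same directory so the final rename is atomic.
    fd, tmp_path = tempfile.mkstemp(dir=directory)
    try:
        with os.fdopen(fd, "wb") as f:
            f.write(data)
            f.flush()              # push Python's buffer to the OS
            os.fsync(f.fileno())   # ask the OS to push its cache to the device
        os.replace(tmp_path, path)  # atomically swap the new file into place
    except BaseException:
        if os.path.exists(tmp_path):
            os.remove(tmp_path)
        raise
    # On POSIX systems, also sync the directory so the rename itself is persisted.
    if hasattr(os, "O_DIRECTORY"):
        dir_fd = os.open(directory, os.O_RDONLY | os.O_DIRECTORY)
        try:
            os.fsync(dir_fd)
        finally:
            os.close(dir_fd)
```

A write that returns before the fsync calls complete may still be sitting in volatile memory, which is exactly the case the "all power goes out everywhere" test is meant to expose.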

Availability comes in multiple forms. To people who have worked on old-school HA – heartbeat, pairwise failover, address takeover, STONITH – it means system availability – the system as a whole continues to provide the original service. In a more recent CAP Theorem context it means node availability – the individual nodes (except those that have failed) continue to provide the original service. This precludes shutting down non-quorum nodes to preserve consistency, which is a common solution in the older HA world. This difference causes a lot of confusion, just like the C in CAP vs. the C in ACID, but it’s pretty well entrenched so you just have to keep the audience in mind when talking about availability.
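The two meanings of availability can be contrasted with a small simulation. This is only a sketch under stated assumptions: the five-node cluster, the function names has_quorum and nodes_serving, and the fence_minority flag are all hypothetical and serve only to model the "shut down non-quorum nodes" policy described above.

```python
def has_quorum(group_size, cluster_size):
    """A group has quorum when it holds a strict majority of the cluster."""
    return group_size > cluster_size // 2


def nodes_serving(groups, cluster_size, fence_minority):
    """Return the set of live nodes that keep serving after a network partition.

    groups: list of sets of node names; each set is one side of the partition.
    fence_minority: if True, nodes outside a quorum group are shut down
                    (the older HA approach); if False, every live node serves.
    """
    serving = set()
    for group in groups:
        if fence_minority and not has_quorum(len(group), cluster_size):
            continue  # minority side is shut down to preserve consistency
        serving |= set(group)
    return serving


# A 5-node cluster split 3/2 by a network partition; no node has crashed.
groups = [{"a", "b", "c"}, {"d", "e"}]
live_nodes = set().union(*groups)

ha_serving = nodes_serving(groups, 5, fence_minority=True)
cap_serving = nodes_serving(groups, 5, fence_minority=False)

# System availability: the service as a whole is still reachable somewhere.
system_available = len(ha_serving) > 0
# Node availability (CAP sense): every non-failed node still serves requests.
node_available_with_fencing = ha_serving == live_nodes
node_available_without_fencing = cap_serving == live_nodes
```

With fencing, the system stays available through the majority side, but nodes d and e stop serving, so it does not count as available in the CAP sense. Without fencing, every live node serves, at the cost of the two sides possibly diverging.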

Some people use “reliable” as a synonym for “available”. Some use it to distinguish system availability from node availability[1]. Some use it to mean ability to reach consensus despite faults[2] (basically C in CAP). Most people are just plain sloppy and don’t have any particular definition in mind (much like “fast” or “scalable”). Because there’s no consensus (heh) on its meaning, I’d suggest avoiding the term.

[1] An Introduction to Distributed Systems (chapter 14): http://webdam.inria.fr/Jorge/html/wdmch15.html
[2] Reliability of Distributed Systems: http://www.cse.scu.edu/~jholliday/REL-EAR.htm
