Do big data stream processing in the stream way

Posted on

Reading: Years in Big Data. Months with Apache Flink. 5 Early Observations With Stream Processing: https://data-artisans.com/blog/early-observations-apache-flink. The article suggest adopting the right solution, Flink, for big data processing. Flink is interesting and built for stream processing. The broader view and take away may be to solve problems using the right solution. We saw many painful
Read more

How to handle missing blocks and blocks with corrupt replicas in HDFS?

Posted on

One of HDFS cluster’s hdfs dfsadmin -report reports: Under replicated blocks: 139016 Blocks with corrupt replicas: 9 Missing blocks: 0 The “Under replicated blocks” can be re-replicated automatically after some time. How to handle the missing blocks and blocks with corrupt replicas in HDFS? Understanding these blocks A block is “with corrupt replicas” in HDFS
Read more

HDFS stays in safe mode because of reported blocks not reaching 0.9990 of total blocks

Posted on

After a node failure and restarting the HDFS, the NameNode reports: “The reported blocks 1968810 needs additional 5071 blocks to reach the threshold 0.9990 of total blocks 1975856. Safe mode will be turned off automatically.” in the log. Why this happens? And how to fix it? About why the NameNode stays in the safe mode:
Read more

How to understand some key system consistency algorithoms

Posted on

When we design a system, we may want our systems to be consistency, scalability and so on. Currently, there are some famous consistency algorithms. How to understand them easily. 1, Paxos and its extensions 2, Replicated State Machine mechanisms 3, Quorum Welcome to adding other famous consistency algorithms and its understanding ;-) Reading text books
Read more

What’s the difference between Reliability, Durability, and Availability for data storage system?

Posted on

Some important concepts in distributed system like Hadoop distributed file system, Google file system and so on. Answer from http://www.quora.com/Whats-the-difference-between-Reliability-Durability-and-Availability-for-data-storage-system The difference between durability and availability is fairly simple. Durability is about what happens when all power goes out everywhere. Has all data been written to stable storage that doesn’t require power (e.g. disk/flash), in
Read more

Consistency models for distributed systems

Posted on

Which are the consistency models used for distributed systems? Papers that survey the consistency models Robert C. Steinke and Gary J. Nutt. 2004. A unified theory of shared memory consistency. J. ACM 51, 5 (September 2004), 800-849. DOI=10.1145/1017460.1017464 http://doi.acm.org/10.1145/1017460.1017464 David Mosberger. 1993. Memory consistency models. SIGOPS Oper. Syst. Rev. 27, 1 (January 1993), 18-26. DOI=10.1145/160551.160553
Read more

Transactional memory learning materials

Posted on

I want to learn transactional memory technologies. Any suggestions on Transactional memory learning materials? Thanks! I highly suggest the Transactional Memory lecture by James R. Larus and Ravi Rajwar of Synthesis Lectures on Computer Architecture: The Transactional Memory lecture:http://www.morganclaypool.com/doi/abs/10.2200/S00070ED1V01Y200611CAC002 Link to the PDF:http://www.morganclaypool.com/doi/pdf/10.2200/S00070ED1V01Y200611CAC002

Notes for Beginners of Software Development on Linux

Posted on

Linux is a great platform for software development targeting servers or backends. In general, working on Linux is very productive. The problem that beginners on Linux face is the the learning curve is steep at the beginning. But believe me, after you get through the initial green steep learning step as in the figure below
Read more

Software Engineering Advice from Building Large-Scale Distributed Systems by Jeff Dean

Posted on

Software Engineering Advice from Building Large-Scale Distributed Systems by Jeff Dean. You can download the slides from Software Engineering Advice from Building Large-Scale Distributed Systems by Jeff Dean. These slides contain the "Numbers everyone should know" which everyone working on systems should be familiar with. Numbers Everyone Should Know L1 cache reference 0.5 ns Branch
Read more

Reading List for Distributed Systems and Cloud Computing

Posted on

Understanding the literature is usually the first step to do research, which is the same for systems research on cloud computing. A reading list may help a lot to those that just start in cloud computing research. Prof. Lin Gu, my PhD supervisor, compiled a reading list for system research on cloud computing. The reading
Read more

Conferences on Cloud Computing 2013

Posted on

This post lists important conferences related to Cloud Computing in year 2013. SOSP 2013 SOSP’13: The 24th ACM Symposium on Operating Systems Principles. November 3-6, 2013, Nemacolin Woodlands Resort, Pennsylvania. The biennial ACM Symposium on Operating Systems Principles is the world’s premier forum for researchers, developers, programmers, and teachers of computer systems technology. Academic and
Read more

Conferences on Cloud Computing 2012

Posted on

This post lists important conferences on Cloud Computing in year 2012. OSDI 2012 10th USENIX Symposium on Operating Systems Design and Implementation (OSDI ’12) October 8–10, 2012, Hollywood, CA “The tenth OSDI seeks to present innovative, exciting research in computer systems. OSDI brings together professionals from academic and industrial backgrounds in what has become a
Read more