Research

Comparing Paxos and Raft

Paxos and Raft are both consensus algorithms used to ensure consistency in distributed systems. While they solve similar problems, they differ in approach and design philosophy. Characteristics of Paxos. Roles: Proposers, Acceptors, Learners. Phases: two main phases (Prepare/Promise and Propose/Accept). Leader election: not explicitly defined; often implemented using Multi-Paxos to handle multiple proposals efficiently. Use Cases:…

Sybil Attack 101

Distributed systems, such as peer-to-peer networks and other decentralized platforms, have become increasingly popular due to their potential to offer more robust, scalable, and secure solutions. However, these systems face unique challenges and vulnerabilities, one of which is the Sybil attack. Named after the psychiatric case study “Sybil,” in which a person exhibits multiple…

4 Features of Python 3.9 That You Can’t Take Your Eyes Off

Python is one of the most popular programming languages in the world. It is widely used for a broad range of tasks thanks to its flexibility and ease of use. Python has also overtaken other programming languages such as Java, which was once the world’s favorite. In fact, the extent…

Why do I need to run latex/bibtex three times to make everything look good?

Why does LaTeX need to be run three times, as in the following? pdflatex main.tex; bibtex main; pdflatex main.tex; pdflatex main.tex. Copied from https://tex.stackexchange.com/questions/53235/why-does-latex-bibtex-need-three-passes-to-clear-up-all-warnings. The reason is as follows: 1. In the first latex run, all \cite{…} arguments are written to the file document.aux. 2. In the bibtex run, this information is taken by bibtex and the…

When should the authors anonymize themselves in a paper submitted to a conference for review?

When should the authors anonymize themselves in a paper submitted to a conference for review? Several general concepts: Peer review is "the evaluation of work by one or more people of similar competence to the producers of the work (peers)" (Wikipedia). Single-blind describes experiments where information that could introduce bias or otherwise skew the…

Consistency models for distributed systems

Which consistency models are used for distributed systems? Papers that survey the consistency models: Robert C. Steinke and Gary J. Nutt. 2004. A unified theory of shared memory consistency. J. ACM 51, 5 (September 2004), 800-849. DOI=10.1145/1017460.1017464 http://doi.acm.org/10.1145/1017460.1017464; David Mosberger. 1993. Memory consistency models. SIGOPS Oper. Syst. Rev. 27, 1 (January 1993), 18-26. DOI=10.1145/160551.160553…

SQL layers on NoSQL databases

What SQL layer solutions exist on top of NoSQL databases such as key/value stores? Phoenix: A SQL layer on HBase: https://github.com/forcedotcom/phoenix They also show some performance results: https://github.com/forcedotcom/phoenix/wiki/Performance F1 – The Fault-Tolerant Distributed RDBMS Supporting Google’s Ad Business: http://research.google.com/pubs/pub38125.html With F1, we have built a novel hybrid system that combines the scalability, fault tolerance, transparent sharding,…

How to convert Managed Solution into Unmanaged for On-Premise CRM organisation?

Solutions are a very important part of Dynamics CRM. To deploy your customizations, a solution is the only bridge that helps you achieve your goal. There are two types of solutions available in CRM: Managed and Unmanaged. Managed Solutions: These are solutions that you can only import and publish. You can neither export them nor…

Software Engineering Advice from Building Large-Scale Distributed Systems by Jeff Dean

You can download the slides of “Software Engineering Advice from Building Large-Scale Distributed Systems” by Jeff Dean. These slides contain the “Numbers Everyone Should Know”, which everyone working on systems should be familiar with. Numbers Everyone Should Know: L1 cache reference 0.5 ns; Branch…

An I/O Performance Comparison Between loopback Backed and blktap Backed Xen File-backed VBD

I ran some I/O performance benchmarks on Xen DomU. For easier management, some of our DomU VMs use file-backed VBDs. Previously, our VMs used loopback-mounted file-backed VBDs, but blktap-based support is recommended by the Xen community. Before considering changing from loopback-based VBDs to blktap-based VBDs, I have done this performance…

Large-scale Data Storage and Processing System in Datacenters

Research on cloud computing has made great progress, and many excellent large-scale systems have been designed in recent years. I compiled a list of some large-scale data storage and processing systems in datacenters as follows. Storage systems: Google File System (GFS): http://research.google.com/archive/gfs.html HDFS implementation: https://hadoop.apache.org/docs/r2.7.2/hadoop-project-dist/hadoop-hdfs/HdfsDesign.html Colossus (GFS2): Colossus: Successor to the Google File System (GFS)…

Conference Ranking by Average Number of Citations in the Last 5 Years, 2012

I tried to find, on the Internet, a ranking of the top conferences by average number of citations over the last 5 years, but failed to find one. However, there are many rankings by overall citations and number of publications. Hence, it is not hard to calculate the average number of citations…

Hadoop Installation Tutorial (Hadoop 1.x)

Update: If you are new to Hadoop and trying to install it, please check the newer version: Hadoop Installation Tutorial (Hadoop 2.x). Hadoop mainly consists of two parts: Hadoop MapReduce and HDFS. Hadoop MapReduce is a programming model and software framework for writing applications; it is an open-source variant of MapReduce, which was initially designed…