We enjoy reading.


Here are good books and ebooks that we hope you will enjoy reading: SysTutorials Books.

Posts on the Web

Here is a collection of articles and news on scalable systems. The links are updated via RSS sources. You can subscribe to this page via RSS feed or by email.

  • Posted on Friday November 17, 2017
    Hey, it's HighScalability time: The BOSS Great Wall. The largest structure yet found in the universe. Contains 830 galaxies. A billion light years across. 10,000 times the mass of the Milky Way.   If you like this sort of Stuff then please support me on Patreon. And there's my new book, Explain the Cloud ... Continue Reading »
  • Posted on Monday November 13, 2017
    We at Instaclustr recently published a blog post on the most common data modelling mistakes that we see with Cassandra. This post was very popular and led me to think about what advice we could provide on how to approach designing your Cassandra data model so as to come up ... Continue Reading »
  • Posted on Friday November 10, 2017
    Hey, it's HighScalability time: Ah, the good old days. This is how the FBI stored finger prints in 1944. (Alex Wellerstein). How much data? Estimates range from 30GB to 2TB.   If you like this sort of Stuff then please support me on Patreon. Also, there's my new book, Explain the Cloud Like I'm ... Continue Reading »
  • Posted on Tuesday November 07, 2017
    Who's Hiring? Symbiont is a New York-based financial technology company building new kinds of computer networks to connect independent financial institutions together and allow them to share business logic and data in real time. This involves developing a distributed system which is also decentralized, and which allows for the creation of ... Continue Reading »
  • Posted on Monday November 06, 2017
    Kuhiro 10X Faster than Amazon Lambda  This is a guest post by Russell Sullivan, founder and CTO of Kuhirō. Serverless is an emerging Infrastructure-as-a-Service solution poised to become an Internet-wide ubiquitous compute platform. In 2014 Amazon Lambda started the Serverless wave and a few years later Serverless has extended to the ... Continue Reading »
  • Posted on Friday November 03, 2017
    Hey, it's HighScalability time: Luscious visualization of a neural network as a large directed graph. It's a full layout of the ResNet-50 training graph, a neural network with ~3 million nodes, and ~10 million edges, using Gephi for the graph layout, to output a 25000x25000 pixel image. (mattfyles)   If you like this sort of ... Continue Reading »
  • Posted on Friday October 27, 2017
    Hey, it's HighScalability time: How a mutex works pic.twitter.com/TwFLAVs2yd — ☠️💀👻Cody👻💀☠️ (@valarauca1) October 21, 2017 Perfect! Now, imagine a little dog snuck under Big Dog's cone of shame and covered the food with its own cone of shame, and it won't leave. That's deadlock. Imagine a stream of little dogs sneaking under Big ... Continue Reading »
  • Posted on Tuesday October 24, 2017
    Who's Hiring? Need excellent people? Advertise your job here! Fun and Informative Events: On-demand Webinar. Fast & Frictionless - The Decision Engine for Seamless Digital Business. In this session, guest speakers Michele Goetz, Principal Analyst at Forrester Research and Matthias Baumhof, VP Worldwide Engineering at ThreatMetrix, discuss: How risk-based authentication leveraging digital identities ... Continue Reading »
  • Posted on Monday October 23, 2017
    This is a guest post by Michele Palmia of @EyeEm. We’ve now been running computer vision models in production at EyeEm for more than three years - on literally billions of images. As an engineer involved in building the infrastructure behind it from scratch, I both enjoyed and suffered the many technical challenges this ... Continue Reading »
  • Posted on Monday October 23, 2017
    What is the cloud? Why is it called a cloud? How does the cloud work? What does it mean when something is 'in the cloud'? I wrote a new book: Explain the Cloud Like I'm 10, answering those questions for the complete beginner. It makes the perfect gift for Halloween. And Thanksgiving. And ... Continue Reading »
  • Posted on Saturday September 09, 2017
    The encoding of x86 and x86-64 instructions is well documented in Intel's and AMD’s manuals. However, these manuals are not easy for beginners who want to learn how x86-64 instructions are encoded. In this post, I will give a list of useful manuals for understanding and studying the x86-64 ... Continue Reading »
  • Posted on Saturday September 09, 2017
    Metadata checkpointing in HDFS is done by the Secondary NameNode, which periodically merges the fsimage and the edits log files and keeps the edits log size within a limit. For various reasons, checkpointing by the Secondary NameNode may fail. For example, the HDFS Secondary NameNode log shows errors in its ... Continue Reading »
  • Posted on Sunday August 27, 2017
    Introduction In general, if we want to debug the Linux kernel, there are lots of tools such as Linux Perf, Kprobe, BCC, Ktap, etc., and we can also write kernel modules, proc subsystems or system calls for specific debugging aims. However, if we have to instrument the kernel to achieve our goals, ... Continue Reading »
  • Posted on Saturday August 26, 2017
    Introduction As we know, network subsystems are important in computer systems since they are I/O systems and need to be optimized with many algorithms and skills. This article will introduce how the network part of QEMU/KVM [2] works. To keep everything simple and easy to understand, we will begin with several ... Continue Reading »
  • Posted on Sunday August 20, 2017
    Abstract Most popular task monitoring systems (such as top, iotop, proc, etc.) can only get tasks’ disk I/O information, such as tasks’ I/O utilization percentage, once every second due to the kernel timer/tick frequency and the high time cost of system interfaces. This article presents I/O Microscopy, a new way to get tasks’ disk I/O ... Continue Reading »
  • Posted on Tuesday February 14, 2017
    Motivation Recently, I found it hard to determine the percentage of time that a process spends waiting for synchronous I/O (e.g., read). One way is to use the taskstats API provided by the Linux kernel [1]. However, precision may be a problem with this approach. With this problem, ... Continue Reading »
  • Posted on Sunday November 29, 2015
    Amazon S3 is a widely used public cloud storage system. S3 allows an object/file to be up to 5TB, which is enough for most applications. The AWS Management Console provides a Web-based interface for users to upload and manage files in S3 buckets. However, uploading a large file that is ... Continue Reading »
  • Posted on Tuesday March 10, 2015
    Retail is one of the most important business domains for data science and data mining applications because of its prolific data and numerous optimization problems such as optimal prices, discounts, recommendations, and stock levels that can be solved using data analysis methods. The rise of omni-channel retail that integrates marketing, ... Continue Reading »
  • Posted on Sunday September 14, 2014
    Hadoop 2, or YARN, is the new version of Hadoop. It adds the YARN resource manager in addition to the HDFS and MapReduce components. Hadoop MapReduce is a programming model and software framework for writing applications, which is an open-source variant of MapReduce designed and implemented by Google initially for ... Continue Reading »
  • Posted on Tuesday March 18, 2014
    Benchmarks are important for understanding performance and for comparing different systems quantitatively and qualitatively. Many analytic frameworks, such as Hive, Impala and Shark, have been designed and implemented in recent years and have become fundamental software for processing big data. How to benchmark these big data analytic systems is an interesting problem. The ... Continue Reading »

  1. Note for blog authors: if you do not want your articles to appear here (we post only an excerpt, not the full content), please drop me a message and I will delete them. If you have good suggestions for blogs/sites (with an RSS feed) to add to this list, please also let me know.

  2. Yeah, the poll() function is broken on macOS and therefore is not supported in Python for the Mac. The select library supports other polling mechanisms; it essentially exposes whatever the OS supports. Let me look into an update to the code that will use kevent on Macs.
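
The point in the reply above, that Python's select module exposes whatever the OS supports, can be illustrated with a minimal sketch. It uses only the standard select, selectors, and socket modules. The function names available_mechanisms and wait_readable are hypothetical and are not taken from the code under discussion. Which mechanisms show up depends on the platform and the Python build, so no particular result is claimed here.

```python
import select
import selectors
import socket


def available_mechanisms():
    """List the polling APIs that this platform's select module exposes."""
    # Each name exists as an attribute of the select module only when the
    # OS (and this Python build) supports it, so hasattr() is a safe probe.
    candidates = ("epoll", "kqueue", "devpoll", "poll", "select")
    return [name for name in candidates if hasattr(select, name)]


def wait_readable(sock, timeout=1.0):
    """Return True if sock becomes readable within timeout seconds."""
    # DefaultSelector is an alias for the most efficient selector class
    # available on the current platform, so no poll() call is hard-coded.
    with selectors.DefaultSelector() as sel:
        sel.register(sock, selectors.EVENT_READ)
        return bool(sel.select(timeout))


if __name__ == "__main__":
    print("select module exposes:", available_mechanisms())
    a, b = socket.socketpair()
    b.sendall(b"ping")
    print("readable:", wait_readable(a))
    a.close()
    b.close()
```

Probing for a mechanism, or letting selectors choose one, avoids depending on select.poll() being present on every platform.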
