Readings on Systems

We enjoy reading.

Books

Here are good books/ebooks that we hope you will enjoy reading: SysTutorials Books.

Posts on the Web

Here is a collection of articles and news on scalable systems. The links are updated via RSS sources. You can subscribe to this page via RSS feed or by email.

  • Posted on Friday February 24, 2017
    Hey, it's HighScalability time: Great example of Latency As A Pseudo-Permanent Network Partition. A slide effectively cleaved Santa Cruz from the North Bay by slowing traffic to a crawl. If you like this sort of Stuff then please support me on Patreon. 40 TFLOPS: on Lambda; 7: new habitable planets with good beer; dozens: balloons ... Continue Reading »
  • Posted on Monday February 20, 2017
    HelloFresh keeps growing every single day: our product is always improving, new ideas are popping up from everywhere, and our supply chain is being completely automated. All of this simply amazes us, but of course this constant growth brings many technical challenges. Today I’d like to take you on a ... Continue Reading »
  • Posted on Sunday February 19, 2017
    Hey, it's HighScalability time: Gorgeous satellite images of a thawing Greenland (NASA). If you like this sort of Stuff then please support me on Patreon. 1 cubic millimeter: computer with deep-Learning; 1,600: data on nearby stars; 40M: users for largest Parse app; 58x: Tensorflow 1.0 speedup on 64 GPUs; 46%: ecommerce controlled by Amazon; 60%: IT growth in public cloud; ... Continue Reading »
  • Posted on Tuesday February 14, 2017
    Motivation: Recently, I found it hard to determine the percentage of time that a process spends waiting for synchronous I/O (e.g., read). One way is to use the taskstats API provided by the Linux kernel [1]. However, with this approach, precision may be a problem. With this problem, ... Continue Reading »
  • Posted on Monday February 13, 2017
    This is a guest repost by Ken Fromm, a 3x tech co-founder (Vivid Studios, Loomia, and Iron.io). Here are Parts 1 and 2. This post is the third of a four-part series that will dive into developing applications in a serverless way. These insights are derived from several years working with ... Continue Reading »
  • Posted on Friday February 10, 2017
    Hey, it's HighScalability time: It was a game of drones. If you like this sort of Stuff then please support me on Patreon. Half a trillion: Apple’s cash machine; 4,000-5,000: collected data points per adult in US; 10 million: gallons of gas UPS saves turning right; 2.27: Tesla 0-60 time; 40: complex steps to phone security; ... Continue Reading »
  • Posted on Wednesday February 08, 2017
    This is a guest post by Sergei Sheinin, creator of the 2DX Web UI Database Cluster Framework, a low-latency big data cluster with an in-memory NoSQL DBMS Web Browser client. When I began working in the field of data management, the disconnect between the rigid structure of relational database tables and free form ... Continue Reading »
  • Posted on Monday February 06, 2017
    This is a guest repost by Ken Fromm, a 3x tech co-founder (Vivid Studios, Loomia, and Iron.io). Here's Part 1. Job processing at scale and at high concurrency across a distributed infrastructure is a complicated feat. There are many components involved: servers and controllers to process and monitor jobs, controllers to autoscale ... Continue Reading »
  • Posted on Friday February 03, 2017
    Hey, it's HighScalability time: We live in interesting times. F/A-18 Super Hornets Launch drone swarm. If you like this sort of Stuff then please support me on Patreon. 100 billion: words needed to train large networks; 73,653: hard drives at Backblaze; 300 GB hour: raw 4k footage; 1993: server running without rebooting; 64%: of money bet ... Continue Reading »
  • Posted on Thursday February 02, 2017
    This is a guest post by Tony Branson. Performance, scalability, and HA are often used interchangeably, and any confusion about them can result in unrealistic metrics and deployment delays. It is important to invest your time in understanding the differences among these three approaches before you invest your money in ... Continue Reading »
  • Posted on Sunday November 29, 2015
    Amazon S3 is a widely used public cloud storage system. S3 allows an object/file to be up to 5 TB, which is enough for most applications. The AWS Management Console provides a Web-based interface for users to upload and manage files in S3 buckets. However, uploading a large file that is ... Continue Reading »
  • Posted on Tuesday March 10, 2015
    Retail is one of the most important business domains for data science and data mining applications because of its prolific data and numerous optimization problems such as optimal prices, discounts, recommendations, and stock levels that can be solved using data analysis methods. The rise of omni-channel retail that integrates marketing, ... Continue Reading »
  • Posted on Sunday September 14, 2014
    Hadoop 2 or YARN is the new version of Hadoop. It adds the yarn resource manager in addition to the HDFS and MapReduce components. Hadoop MapReduce is a programming model and software framework for writing applications, which is an open-source variant of MapReduce designed and implemented by Google initially for ... Continue Reading »
  • Posted on Tuesday March 18, 2014
    Benchmarks are important for understanding performance and for comparing different systems quantitatively and qualitatively. Many analytic frameworks, such as Hive, Impala and Shark, have been designed and implemented in recent years and have become fundamental software for processing big data. How to benchmark these big data analytic systems is an interesting problem. The ... Continue Reading »
  • Posted on Tuesday February 04, 2014
    Public cloud storage services like Amazon S3, Google Cloud Storage and Windows Azure Storage replicate data to ensure high availability. On the other hand, with data being replicated, the storage services exhibit certain data consistency models. Different cloud service providers employ different data consistency models nowadays. In this ... Continue Reading »
  • Posted on Tuesday August 20, 2013
    The shortcomings and drawbacks of batch-oriented data processing were widely recognized by the Big Data community quite a long time ago. It became clear that real-time query processing and in-stream processing are an immediate need in many practical applications. In recent years, this idea got a lot of traction and ... Continue Reading »
  • Posted on Friday July 19, 2013
    Software Engineering Advice from Building Large-Scale Distributed Systems by Jeff Dean. You can download the slides from Software Engineering Advice from Building Large-Scale Distributed Systems by Jeff Dean. These slides contain the “Numbers everyone should know” which everyone working on systems should be familiar with. Numbers Everyone Should Know L1 cache reference ... Continue Reading »
  • Posted on Wednesday July 17, 2013
    Here is a list of tutorials for learning how to write MapReduce programs on Hadoop, the open-source MapReduce implementation with HDFS. MapReduce Tutorials: The official tutorial on the Hadoop MapReduce framework: http://hadoop.apache.org/docs/r1.0.4/mapred_tutorial.html. Yahoo! Hadoop Tutorial: A comprehensive tutorial on Hadoop from Yahoo! Developer Network: http://developer.yahoo.com/hadoop/tutorial/. More about MapReduce: To better understand the design behind MapReduce, it is ... Continue Reading »
  • Posted on Tuesday January 22, 2013
    Storage Architecture and Challenges in Faculty Summit, July 29, 2010, by Andrew Fikes, Principal Engineer. Download PDF (from archive.org). These slides introduce some of Google’s storage systems, with insights and discussion of problems. Continue Reading »
  • Posted on Tuesday January 22, 2013
    Designs, Lessons and Advice from Building Large Distributed Systems by Jeff Dean. Everyone who is interested in large distributed systems should read: PDF for Designs, Lessons and Advice from Building Large Distributed Systems by Jeff Dean. Continue Reading »

One Reply to “Readings on Systems”

  1. Note for blog authors: if you do not want your articles to appear here (we post only an excerpt, not the full content), please drop me a message and I will delete them. If you have good suggestions for blogs/sites (with an RSS feed) to add to this list, please also let me know.
