Colossus: Successor to the Google File System (GFS)

Colossus is the successor to the Google File System (GFS), as mentioned in the recent Spanner paper at OSDI 2012. Spanner also uses Colossus to store its tablets. Public information about Colossus is scarce compared with GFS, which was described in a paper at SOSP 2003. Still, some information about Colossus is available on the Web. Here, I list some of it.

Storage Architecture and Challenges

A talk given at the Faculty Summit on July 29, 2010, by Andrew Fikes, Principal Engineer.

[[storage-architecture-and-challenges|The slides]]. Some interesting points:

  • Storage Software: Colossus
    • Next-generation cluster-level file system
    • Automatically sharded metadata layer
    • Data typically written using Reed-Solomon (1.5x)
    • Client-driven replication, encoding and replication
    • Metadata space has enabled availability analyses
  • Why Reed-Solomon?
    • Cost. Especially w/ cross cluster replication.
    • Field data and simulations show improved MTTF
    • More flexible cost vs. availability choices
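The "1.5x" figure and the cost argument on the slides can be illustrated with a little arithmetic. The sketch below is only an illustration: the slides give the 1.5x overhead but not the code parameters, so the (6 data, 3 parity) layout, the 3-way replication baseline, and the two-cluster scenario are all assumptions chosen because they make the numbers easy to follow. The function names are hypothetical.

```python
from fractions import Fraction


def erasure_overhead(data_blocks: int, parity_blocks: int) -> Fraction:
    """Raw bytes stored per byte of user data for a (data, parity) erasure code."""
    return Fraction(data_blocks + parity_blocks, data_blocks)


def replication_overhead(copies: int) -> Fraction:
    """Raw bytes stored per byte of user data when keeping full copies."""
    return Fraction(copies, 1)


# Assumed parameters (not stated on the slides): 6 data + 3 parity blocks.
rs = erasure_overhead(6, 3)
# Assumed baseline: plain 3-way replication.
rep = replication_overhead(3)

# Assumed scenario: the same data is kept in 2 clusters.
clusters = 2
rs_cross = rs * clusters
rep_cross = rep * clusters

print(f"Reed-Solomon (6,3): {float(rs)}x per cluster, {float(rs_cross)}x across {clusters} clusters")
print(f"3-way replication:  {float(rep)}x per cluster, {float(rep_cross)}x across {clusters} clusters")
```

Under these assumptions, the erasure-coded layout stores half as many raw bytes as 3-way replication, and the absolute gap doubles once the data is also kept in a second cluster, which is one way to read the "especially w/ cross cluster replication" remark. A (6,3) Reed-Solomon stripe can also lose any three of its nine blocks and still be reconstructed, whereas three replicas survive the loss of at most two copies.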

GFS: Evolution on Fast-forward

An interview with Google’s Sean Quinlan by the Association for Computing Machinery (ACM).

View the interview.

Google File System II: Dawn of the Multiplying Master Nodes: Comments on GFS2 (Colossus)

by Cade Metz in San Francisco.

The article and some excerpts.

Eric Zhiqiang Ma

Eric is interested in building high-performance and scalable distributed systems and related technologies. The views or opinions expressed here are solely Eric's own and do not necessarily represent those of any third parties.
