Colossus is the successor to the Google File System (GFS) as mentioned in the recent paper on Spanner on OSDI 2012. Colossus is also used by spanner to store its tablets. The information about Colossus is slim compared with GFS which is published in the paper on SOSP 2003. There is still some information about Colossus on the Web. Here, I list some of them.
Storage Architecture and Challenges
On Faculty Summit, July 29, 2010, by Andrew Fikes, Principal Engineer.
[[storage-architecture-and-challenges|The slides]]. Some interesting points:
- Storage Software: Colossus
- Next-generation cluster-level file system
- Automatically sharded metadata layer
- Data typically written using Reed-Solomon (1.5x)
- Client-driven replication, encoding and replication
- Metadata space has enabled availability analyses
- Why Reed-Solomon?
- Cost. Especially w/ cross cluster replication.
- Field data and simulations show improved MTTF
- More flexible cost vs. availability choices
GFS: Evolution on Fast-forward
An interview with Google’s Sean Quinlan by the Association for Computer Machinery (ACM).
Google File System II: Dawn of the Multiplying Master Nodes Comments on GFS2 (colossus)
by Cade Metz in San Francisco.