What is Microsoft’s Cosmos service? by Yaron Y. Goland.
Cosmos: Big Data and Big Challenges by Pat Helland.
What Is COSMOS?
- Petabyte Store and Computation System
  - About 62 physical petabytes stored (~275 logical petabytes stored)
  - Tens of thousands of computers across many datacenters
- Massively parallel processing based on Dryad
  - Similar to MapReduce but can represent arbitrary DAGs of computation
  - Automatic computation placement with data
- SCOPE (Structured Computation Optimized for Parallel Execution)
  - SQL-like language with set-oriented record and column manipulation
  - Automatically compiled and optimized for execution over Dryad
- Management of hundreds of “Virtual Clusters” for computation allocation
  - Buy your machines and give them to COSMOS
  - Guaranteed that many compute resources
  - May use more when they are not in use
- Ubiquitous access to OSD’s data
  - Combining knowledge from different datasets is today’s secret sauce
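
The "Virtual Clusters" bullets describe an allocation rule: a team is guaranteed as many machines as it contributed, and may borrow more while other teams leave theirs idle. A minimal Python sketch of that rule follows. It is not the actual COSMOS scheduler: the `VirtualCluster` class, the `allocate` function, the cluster names, and the round-robin sharing of idle machines are all assumptions made for illustration, and reclaiming borrowed machines when an owner needs them back is not modelled.

```python
from dataclasses import dataclass


@dataclass
class VirtualCluster:
    """Hypothetical model of one team's share of the pool."""
    name: str
    guaranteed: int  # machines the team bought and handed over
    demand: int      # machines the team's jobs want right now


def allocate(clusters):
    """Return {name: machines granted} for one scheduling round."""
    # Step 1: every cluster gets what it asks for, up to its guarantee.
    grant = {c.name: min(c.demand, c.guaranteed) for c in clusters}

    # Step 2: guaranteed machines that owners are not using are idle.
    idle = sum(c.guaranteed for c in clusters) - sum(grant.values())

    # Step 3 (assumed policy): hand idle machines out one at a time,
    # round-robin, to clusters that still want more than they hold.
    hungry = [c for c in clusters if c.demand > grant[c.name]]
    while idle > 0 and hungry:
        for c in list(hungry):
            if idle == 0:
                break
            grant[c.name] += 1
            idle -= 1
            if grant[c.name] == c.demand:
                hungry.remove(c)
    return grant


if __name__ == "__main__":
    pool = [
        VirtualCluster("search", guaranteed=100, demand=40),
        VirtualCluster("ads", guaranteed=50, demand=90),
        VirtualCluster("maps", guaranteed=30, demand=60),
    ]
    print(allocate(pool))  # {'search': 40, 'ads': 80, 'maps': 60}
```

In this example "search" uses only 40 of its 100 guaranteed machines, so the 60 idle ones are split between "ads" and "maps", which both run above their guarantee. No cluster ever receives less than the smaller of its demand and its guarantee.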