How sched_setaffinity works inside of Linux Kernel

Abstract Sometimes, we may want to migrate one process/thread to one specific CPU for some specific purpose. In the Unix/Linux systems, you may choose sched_setaffinity to finish this job. This article will help you to understand how sched_setaffinity (or other APIs like pthread_setaffinity_np in user-space) works internal Linux kernel. Details -- sched_setaffinity(pid_t pid, const struct cpumask *in_mask) --- __set_cpus_allowed_ptr(struct task_struct *p, const struct cpumask *new_mask, bool check) ---- stop_one_cpu(unsigned int cpu, cpu_stop_fn_t fn, void *arg) ----- migration_cpu_stop(void *data) ------ __migrate_task(struct rq *rq, struct task_struct *p, int dest_cpu) ------- move_queued_task(struct rq *rq, struct task_struct *p, int new_cpu) -------- enqueue_task(struct rq *rq, struct task_struct *p, int flags) --------- returns the new run queue of destination CPU Above character steps give a workflow of how sched_setaffinity works (how it migrates one process/thread from the run queue of source CPU to the run queue of destination CPU). » Read more

Big Data Benchmark from AMPLab of UC Berkeley

Benchmarks are important to understand the performance and quantitative and qualitative comparison of different systems. Many analytic frameworks, such as Hive, Impala and Shark, are designed and implemented these years and become fundamental software for processing big data. How to benchmark these big data analytic systems is an interesting problem. The Big Data Benchmark ∞ The Big Data Benchmark from AMPLab, UC Berkeley provides quantitative and qualitative comparisons of five systems by the time this post is written: Redshift – a hosted MPP database offered by Amazon.com based on the ParAccel data warehouse, Hive – a Hadoop-based data warehousing system, Shark – a Hive-compatible SQL engine which runs on top of the Spark computing framework, Impala – a Hive-compatible* SQL engine with its own MPP-like execution engine and Stinger/Tez – Tez is a next generation Hadoop execution engine currently in development. » Read more

Data Consistency Models of Public Cloud Storage Services: Amazon S3, Google Cloud Storage and Windows Azure Storage

The public cloud storage services like Amazon S3, Google Cloud Storage and Windows Azure Storage replicate the data to ensure high availability. On the other hand, with data being replicated, the storage services exhibits certain data consistency models. Different cloud service providers employ different data consistency models nowadays. In this post, we survey the data consistency models provided by the solutions from the three big players: Amazon S3 and DynamoDB, Google Cloud Storage and Windows Azure Storage. » Read more

PUMA: A MapReduce Benchmark Suite

MapReduce is a well-known programming model designed for generating and processing large data. There are various MapReduce implementations. One widely known and used one may be Hadoop. Benchmarking MapReduce frameworks gets to be important. Faraz Ahmad et al. developed a benchmark suite: PUMA MapReduce Benchmark. During our work on MapReduce, we developed a benchmark suite which represents a broad range of MapReduce applications exhibiting application characteristics with high/low computation and high/low shuffle volumes. » Read more

Hadoop TeraSort Benchmark

TeraSort is one of Hadoop’s widely used benchmarks. Hadoop’s distribution contains both the input generator and sorting implementations: the TeraGen generates the input and TeraSort conducts the sorting. Here, we provide a short tutorial for using the Hadoop TeraSort benchmark. TeraGen generates random data that can be used as input data for a subsequent running of TeraSort. Generate input by TeraGen The syntax for TeraGen: $ hadoop jar hadoop-*examples*.jar teragen <number of 100-byte rows> <output dir> To make the TeraGen run on multiple nodes with multiple tasks, you may need to specify the number of map tasks (30 here as an example; for Hadoop 2): $ hadoop -D mapreduce.job.maps 30 jar hadoop-*examples*.jar teragen <number of 100-byte rows> <output dir> The number of mappers depends on the number of rows you will generate and the number of nodes you have. » Read more

Large-scale Data Storage and Processing System in Datacenters

Research on Cloud Computing has made big progresses and many excellent large-scale systems have been designed in recent years. I compiled a list of some large-scale data storage and processing systems in datacenters as follows. Storage systems ∞Google File System (GFS): http://research.google.com/archive/gfs.html HDFS implementation: https://hadoop.apache.org/docs/r2.7.2/hadoop-project-dist/hadoop-hdfs/HdfsDesign.html Colossus (GFS2): Colossus: Successor to the Google File System (GFS) BigTable: http://research.google.com/archive/bigtable.html Megastore: http://research.google.com/pubs/pub36971.html Spanner: http://research.google.com/archive/spanner.html Dynamo: http://dl.acm.org/citation.cfm?id=1294281RAMCloud: http://dl.acm.org/citation.cfm?id=1965751 and http://dl.acm.org/citation.cfm?id=2043560Compute systems ∞MapReduce: http://research.google.com/archive/mapreduce.html Hadoop implementation: Hadoop MapReduce Tutorials Sawzall: http://research.google.com/archive/sawzall.html FlumeJava: http://dl.acm.org/citation.cfm?id=1806638 Pig latin: http://dl.acm.org/citation.cfm?id=1376726 Dryad/DryadLINQ: http://research.microsoft.com/en-us/projects/dryad/ Pregel: http://dl.acm.org/citation.cfm?id=1807184 and http://googleresearch.blogspot.com/2009/06/large-scale-graph-computing-at-google.html Dremel: http://research.google.com/pubs/pub36632.html Storm: https://blog.twitter.com/2011/a-storm-is-coming-more-details-and-plans-for-release and https://github.com/nathanmarz/storm/wiki Spark: https://www.usenix.org/conference/nsdi12/resilient-distributed-datasets-fault-tolerant-abstraction-memory-cluster-computing and http://spark-project.org/DVM: IEEE Transactions on Computers paper and VEE paperResource management ∞Mesos: http://mesos.apache.org/documentation/latest/architecture/ » Read more

Microsofts Cosmos Service

Cosmos is “Microsoft’s internal data storage/query system for analyzing enormous amounts (as in petabytes) of data”. There is no paper/technical report about Cosmos published yet. I compiled a list of information about Cosmos on the Web as follows. What is Microsoft’s Cosmos service? by Yaron Y. Goland. Microsoft Cosmos: Petabytes perfectly processed perfunctorily by Seth Eliot. Cosmos Big Data and Big Challenges by Pat Helland. » Read more

Colossus: Successor to the Google File System (GFS)

Colossus is the successor to the Google File System (GFS) as mentioned in the recent paper on Spanner on OSDI 2012. Colossus is also used by spanner to store its tablets. The information about Colossus is slim compared with GFS which is published in the paper on SOSP 2003. There is still some information about Colossus on the Web. Here, I list some of them. » Read more

Reading List for Distributed Systems and Cloud Computing

Understanding the literature is usually the first step to do research, which is the same for systems research on cloud computing. A reading list may help a lot to those that just start in cloud computing research. Prof. Lin Gu, my PhD supervisor, compiled a reading list for system research on cloud computing. The reading list includes a list of papers related to Internet-scale systems and datacenters, techniques in distributed computing like Paxos, execution frameworks like MapReduce, distributed file systems like GFS, and storage systems like Dynamo. » Read more

Conferences on Cloud Computing 2013

This post lists important conferences related to Cloud Computing in year 2013. SOSP 2013 SOSP’13: The 24th ACM Symposium on Operating Systems Principles. November 3-6, 2013, Nemacolin Woodlands Resort, Pennsylvania.The biennial ACM Symposium on Operating Systems Principles is the world’s premier forum for researchers, developers, programmers, and teachers of computer systems technology. Academic and industrial participants present research and experience papers that cover the full range of theory and practice of computer systems software. » Read more

Conferences on Cloud Computing 2012

This post lists important conferences on Cloud Computing in year 2012. OSDI 2012 10th USENIX Symposium on Operating Systems Design and Implementation (OSDI ’12) October 8–10, 2012, Hollywood, CA “The tenth OSDI seeks to present innovative, exciting research in computer systems. OSDI brings together professionals from academic and industrial backgrounds in what has become a premier forum for discussing the design, implementation, and implications of systems software.” Important Dates Complete paper submissions: Thursday, May 3, 2012, 9:00 p.m. » Read more