How sched_setaffinity works inside the Linux Kernel

Abstract
Sometimes we want to migrate a process/thread to a specific CPU. On Linux, sched_setaffinity does this job. This article explains how sched_setaffinity (or other user-space APIs such as pthread_setaffinity_np) works inside the Linux kernel.
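For readers who have not used the call from user space, here is a minimal sketch of how a thread might pin itself to a single CPU through the glibc wrapper, which is where the kernel path discussed in this article begins. The helper name pin_self_to_first_allowed_cpu is hypothetical, and the sketch assumes Linux with glibc (the CPU_* macros need _GNU_SOURCE).

```c
#define _GNU_SOURCE
#include <sched.h>

/*
 * Pin the calling thread to the lowest-numbered CPU it is currently
 * allowed to run on. Returns that CPU number, or -1 on failure.
 * (Hypothetical helper, for illustration only.)
 */
int pin_self_to_first_allowed_cpu(void)
{
    cpu_set_t allowed;

    /* pid 0 means "the calling thread". */
    if (sched_getaffinity(0, sizeof(allowed), &allowed) != 0)
        return -1;

    for (int cpu = 0; cpu < CPU_SETSIZE; cpu++) {
        if (!CPU_ISSET(cpu, &allowed))
            continue;

        /* Build a mask that contains only this CPU. */
        cpu_set_t target;
        CPU_ZERO(&target);
        CPU_SET(cpu, &target);

        /* This call enters the kernel-side sched_setaffinity path. */
        if (sched_setaffinity(0, sizeof(target), &target) != 0)
            return -1;
        return cpu;
    }
    return -1; /* no allowed CPU was found */
}
```

Passing another thread's ID instead of 0 would target that thread, subject to the usual permission checks.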

Details

-- sched_setaffinity(pid_t pid, const struct cpumask *in_mask)
--- __set_cpus_allowed_ptr(struct task_struct *p, const struct cpumask *new_mask, bool check)
---- stop_one_cpu(unsigned int cpu, cpu_stop_fn_t fn, void *arg)
----- migration_cpu_stop(void *data)
------ __migrate_task(struct rq *rq, struct task_struct *p, int dest_cpu)
------- move_queued_task(struct rq *rq, struct task_struct *p, int new_cpu)
-------- enqueue_task(struct rq *rq, struct task_struct *p, int flags)
--------- returns the new run queue of destination CPU

The call chain above outlines how sched_setaffinity works, that is, how it migrates a process/thread from the run queue of the source CPU to the run queue of the destination CPU. Let’s analyze it in detail. (Note that this article describes Linux Kernel 4.7.4; other versions may differ slightly.)

sched_setaffinity is a system call in Linux. It takes the ID of the process/thread we want to migrate and a CPU mask describing the CPUs we want it to run on. It then calls __set_cpus_allowed_ptr, which does some checks before migration and changes the affinity of the process/thread. Next, it checks whether the process/thread is running (or is about to run, i.e. in the TASK_WAKING state). If so, it calls the most important function, “stop_one_cpu”, to do the migration. If not, it checks whether the process/thread is queued on the run queue of the source CPU; if so, the CPU executing sched_setaffinity moves the process/thread directly from the run queue of the source CPU to the run queue of the destination CPU.
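The decision described above can be sketched as a small user-space model. This is not kernel code: the model_* and ACTION_* names are invented for illustration, the CPU mask is reduced to a 64-bit integer, and locking is left out. The early return for a task whose current CPU is still in the new mask reflects a reading of the 4.7 source and should be checked against it.

```c
#include <stdint.h>

/* Simplified stand-ins for kernel task states (illustrative names). */
enum model_state { MODEL_RUNNING, MODEL_WAKING, MODEL_QUEUED, MODEL_SLEEPING };

enum model_action {
    ACTION_NONE,          /* no migration is triggered by this call */
    ACTION_STOP_ONE_CPU,  /* the source CPU's stopper migrates the task */
    ACTION_DIRECT_MOVE    /* the calling CPU moves the queued task itself */
};

struct model_task {
    int cpu;                /* CPU whose run queue owns the task (0..63) */
    uint64_t cpus_allowed;  /* bit i set: the task may run on CPU i */
    enum model_state state;
};

/* Model of the branch taken after the affinity mask has been validated. */
enum model_action model_set_cpus_allowed(struct model_task *p, uint64_t new_mask)
{
    if (new_mask == 0)
        return ACTION_NONE;  /* the real kernel rejects an unusable mask */

    p->cpus_allowed = new_mask;  /* the affinity is updated first */

    /* Assumption: if the current CPU is still allowed, nothing moves. */
    if (new_mask & (UINT64_C(1) << p->cpu))
        return ACTION_NONE;

    if (p->state == MODEL_RUNNING || p->state == MODEL_WAKING)
        return ACTION_STOP_ONE_CPU;
    if (p->state == MODEL_QUEUED)
        return ACTION_DIRECT_MOVE;

    return ACTION_NONE;  /* sleeping task: only the mask changed */
}
```

For example, a running task on CPU 0 that is given the mask 0x2 yields ACTION_STOP_ONE_CPU, while a sleeping task with the same request yields ACTION_NONE.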

stop_one_cpu runs migration_cpu_stop, with high priority, on the CPU of the process/thread we want to migrate. migration_cpu_stop calls __migrate_task, which tests whether the destination CPU is permitted by the affinity set earlier and then calls move_queued_task. move_queued_task removes the process/thread from the old run queue and, at last, calls enqueue_task to add it to the new CPU’s run queue. At this point stop_one_cpu returns and the migration is done.
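The queue hand-off at the end of this path can be illustrated with a toy model. It is only a sketch: the sim_* names are made up, run queues are plain arrays, and the run queue locking that the kernel performs around these steps is omitted.

```c
#include <stddef.h>

#define SIM_NR_CPUS 4
#define SIM_RQ_CAP  8

struct sim_task {
    int id;
    int cpu;                     /* CPU whose queue holds the task */
    unsigned long cpus_allowed;  /* bit i set: task may run on CPU i */
};

struct sim_rq {
    struct sim_task *tasks[SIM_RQ_CAP];
    size_t nr_running;
};

/* Append a task to a run queue; returns 0 on success, -1 if it is full. */
int sim_enqueue_task(struct sim_rq *rq, struct sim_task *p)
{
    if (rq->nr_running == SIM_RQ_CAP)
        return -1;
    rq->tasks[rq->nr_running++] = p;
    return 0;
}

/* Remove a task from a run queue; returns 0 on success, -1 if absent. */
int sim_dequeue_task(struct sim_rq *rq, struct sim_task *p)
{
    for (size_t i = 0; i < rq->nr_running; i++) {
        if (rq->tasks[i] == p) {
            rq->tasks[i] = rq->tasks[--rq->nr_running];
            return 0;
        }
    }
    return -1;
}

/* Dequeue from the old queue, retarget the task, enqueue on the new one.
 * Returns the destination queue, or NULL if the move could not be made. */
struct sim_rq *sim_move_queued_task(struct sim_rq rqs[], struct sim_task *p,
                                    int new_cpu)
{
    if (new_cpu < 0 || new_cpu >= SIM_NR_CPUS)
        return NULL;
    if (new_cpu == p->cpu)
        return &rqs[new_cpu];            /* already in place */

    struct sim_rq *dst = &rqs[new_cpu];
    if (dst->nr_running == SIM_RQ_CAP)
        return NULL;                     /* check room before dequeuing */
    if (sim_dequeue_task(&rqs[p->cpu], p) != 0)
        return NULL;                     /* task was not queued */

    p->cpu = new_cpu;
    sim_enqueue_task(dst, p);
    return dst;
}

/* Re-check the affinity first; leave the task alone if dest_cpu is not
 * allowed. Returns the queue that holds the task afterwards. */
struct sim_rq *sim_migrate_task(struct sim_rq rqs[], struct sim_task *p,
                                int dest_cpu)
{
    if (dest_cpu < 0 || dest_cpu >= SIM_NR_CPUS ||
        !(p->cpus_allowed & (1UL << dest_cpu)))
        return &rqs[p->cpu];
    return sim_move_queued_task(rqs, p, dest_cpu);
}
```

Like move_queued_task in the call chain above, the move function returns the run queue of the destination CPU, so the caller continues working with the queue that now owns the task.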

Conclusion
In short, sched_setaffinity does the following jobs inside the Linux Kernel.
1, Check the state of the process/thread to be migrated.
2, If it is running or in the TASK_WAKING state, let the source CPU of this process/thread do the migration.
3, If it is on the run queue of the source CPU, let the CPU that executes the sched_setaffinity system call do the migration.
4, If it is on a wait queue, only change the affinity of the process/thread.
5, The migration itself moves the process/thread from the old run queue to the run queue of the destination CPU.

References:
1, Linux Kernel source code – https://www.kernel.org/
2, http://lxr.free-electrons.com/source/kernel/sched/core.c?v=4.7#L4706

Weiwei Jia

Weiwei Jia has been a Ph.D. student in the Department of Computer Science at New Jersey Institute of Technology since 2016. His research interests include storage systems, operating systems and computer systems.