How does the Linux kernel collect taskstats data?

Motivation

Recently, I found it hard to determine what percentage of its time a process spends waiting for synchronous I/O (e.g., read). One way is to use the taskstats API provided by the Linux kernel [1]. However, the precision of the reported values may be a problem. To find out, I dug into the Linux kernel source code to see how “blkio_delay_total” (delay time waiting for synchronous block I/O to complete) is calculated.

Details

“blkio_delay_total” is calculated in the function “__delayacct_add_tsk” in “linux/kernel/delayacct.c”, as follows.

int __delayacct_add_tsk(struct taskstats *d, struct task_struct *tsk)
{
...
        tmp = d->blkio_delay_total + tsk->delays->blkio_delay;
        d->blkio_delay_total = (tmp < d->blkio_delay_total) ? 0 : tmp;
...

        return 0;
}

From the source code above, we can see that “blkio_delay_total” is incremented by “blkio_delay” on each call, and is reset to 0 if the addition overflows. “blkio_delay” is the time a process spends waiting for synchronous I/O. It is measured as follows.

In “linux/kernel/sched/core.c” [2]:

/*
 * This task is about to go to sleep on IO. Increment rq->nr_iowait so
 * that process accounting knows that this is a task in IO wait state.
 */
long __sched io_schedule_timeout(long timeout)
{
        int old_iowait = current->in_iowait;
        struct rq *rq;
        long ret;

        current->in_iowait = 1;
        blk_schedule_flush_plug(current);

        delayacct_blkio_start();
        rq = raw_rq();
        atomic_inc(&rq->nr_iowait);
        ret = schedule_timeout(timeout);
        current->in_iowait = old_iowait;
        atomic_dec(&rq->nr_iowait);
        delayacct_blkio_end();

        return ret;
}
EXPORT_SYMBOL(io_schedule_timeout);

When a process is about to wait for I/O, delayacct_blkio_start() records the start time. After the synchronous I/O finishes, delayacct_blkio_end() computes blkio_delay as the current time minus the start time. Finally, this delta (blkio_delay) is added to the process’s “blkio_delay_total”.
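To make the bookkeeping concrete, here is a minimal user-space sketch in C of the start/end pattern described above, together with the overflow-guarded addition quoted from __delayacct_add_tsk. It is not the kernel's code: the struct delay_info and the helpers blkio_wait_begin, blkio_wait_end, and add_or_reset are hypothetical names I made up for illustration, and timestamps are passed in as plain nanosecond values instead of being read from a kernel clock.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a task's delay counters (not the kernel struct). */
struct delay_info {
    uint64_t blkio_start;  /* timestamp recorded when the wait begins, in ns */
    uint64_t blkio_delay;  /* time spent waiting so far, in ns */
    uint32_t blkio_count;  /* number of completed waits */
};

/* Record the start time just before the task goes to sleep on I/O. */
void blkio_wait_begin(struct delay_info *d, uint64_t now_ns)
{
    assert(d != NULL);
    d->blkio_start = now_ns;
}

/* On wake-up, add (now - start) to the delay; ignore non-positive deltas. */
void blkio_wait_end(struct delay_info *d, uint64_t now_ns)
{
    assert(d != NULL);
    int64_t ns = (int64_t)(now_ns - d->blkio_start);
    if (ns > 0) {
        d->blkio_delay += (uint64_t)ns;
        d->blkio_count++;
    }
}

/* Same pattern as the two quoted lines: if the sum wraps around, report 0. */
uint64_t add_or_reset(uint64_t total, uint64_t delta)
{
    uint64_t tmp = total + delta;
    return (tmp < total) ? 0 : tmp;
}
```

A caller would wrap the blocking operation between blkio_wait_begin and blkio_wait_end, in the same way io_schedule_timeout wraps schedule_timeout, and then fold the resulting delay into a running total with add_or_reset.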

Conclusion

1. While a process waits for synchronous I/O, its wait time (blkio_delay) is measured and added to blkio_delay_total.
2. blkio_delay is updated when the process finishes its synchronous I/O.
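Coming back to the original question, the percentage of time spent waiting on synchronous I/O can be estimated from two samples of blkio_delay_total taken some interval apart. The sketch below is only arithmetic: the function name blkio_wait_percent is hypothetical, both samples and the interval are assumed to be in nanoseconds, and fetching the samples through the taskstats interface is left out.

```c
#include <stdint.h>

/*
 * Percentage of a sampling interval spent waiting on block I/O.
 * delay_before_ns / delay_after_ns: two blkio_delay_total samples (assumed ns).
 * interval_ns: wall-clock time between the two samples (ns).
 */
double blkio_wait_percent(uint64_t delay_before_ns,
                          uint64_t delay_after_ns,
                          uint64_t interval_ns)
{
    /* No interval, or the counter went backwards (e.g. it was reset). */
    if (interval_ns == 0 || delay_after_ns < delay_before_ns)
        return 0.0;

    return 100.0 * (double)(delay_after_ns - delay_before_ns)
                 / (double)interval_ns;
}
```

For example, a counter that grows by 250 ms over a 1 s interval corresponds to 25% of that interval spent waiting.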

References

[1] https://www.kernel.org/doc/Documentation/accounting/taskstats-struct.txt
[2] http://lxr.free-electrons.com/source/kernel/sched/core.c?v=4.7#L4988
