What is the difference between a work-conserving I/O scheduler and a non-work-conserving I/O scheduler?
In a work-conserving mode, the scheduler must choose one of the pending requests, if any, to dispatch, even if the pending requests are far away from the current disk head position. The rationale for non-work-conserving schedulers, such as the anticipatory scheduler (AS) and Completely Fair Queuing (CFQ), is that a request that is soon to arrive might be much closer to the disk head than the currently pending requests, in which case it may be worthwhile to wait for the future request.
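To make the distinction concrete, here is a minimal toy sketch of the dispatch decision. It is not how AS or CFQ are implemented in the kernel: the function name `choose_dispatch`, the "nearest request first" policy, and the `predicted` / `wait_cost` parameters are all assumptions made for illustration, with positions and costs in the same arbitrary units.

```python
from typing import Optional, Sequence


def choose_dispatch(
    head: int,
    pending: Sequence[int],
    *,
    anticipate: bool = False,
    predicted: Optional[int] = None,
    wait_cost: int = 0,
) -> Optional[int]:
    """Return the position to dispatch next, or None to stay idle.

    head       -- current disk head position (hypothetical units)
    pending    -- positions of the requests already queued
    anticipate -- False: work-conserving, True: non-work-conserving
    predicted  -- guessed position of a request expected to arrive soon
    wait_cost  -- assumed cost of idling until that request arrives
    """
    if not pending:
        return None  # nothing queued, so both modes have nothing to dispatch

    # Assumed policy: serve the pending request closest to the head.
    nearest = min(pending, key=lambda pos: abs(pos - head))

    if anticipate and predicted is not None:
        # Non-work-conserving: idle when waiting for the predicted request
        # looks cheaper than seeking to the nearest pending one.
        if abs(predicted - head) + wait_cost < abs(nearest - head):
            return None

    # Work-conserving (or anticipation not worthwhile): dispatch now.
    return nearest
```

With the head at 100 and pending requests at 900 and 5000, the work-conserving call dispatches 900 right away, while the anticipating call with `predicted=108` and `wait_cost=50` chooses to idle. If the predicted request never arrives, that idle time is lost.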
In my view, a non-work-conserving scheduler preserves the locality of synchronous I/O requests, which improves system performance.
If there is only one process issuing intensive I/O requests, we once observed that the deadline scheduler can almost double the I/O throughput compared with CFQ. With CFQ, the process's threads are likely all waiting for I/O while the I/O scheduler idles, waiting for a request closer to the disk head. No such request will arrive, because the waiting threads are the only ones issuing I/O. The waiting here simply wastes time.
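For anyone who wants to repeat this kind of comparison, it helps to confirm which scheduler is actually active first. On Linux, `/sys/block/<device>/queue/scheduler` lists the available schedulers and marks the active one with square brackets. The sketch below parses that format; the helper names and the default device name `sda` are assumptions for illustration.

```python
from pathlib import Path


def active_scheduler(text: str) -> str:
    """Return the scheduler name that is wrapped in [brackets]."""
    for token in text.split():
        if token.startswith("[") and token.endswith("]"):
            return token[1:-1]
    raise ValueError("no bracketed (active) scheduler found in: %r" % text)


def read_active_scheduler(device: str = "sda") -> str:
    """Read the active scheduler of a block device (device name is a guess)."""
    path = Path("/sys/block") / device / "queue" / "scheduler"
    return active_scheduler(path.read_text())
```

For example, `active_scheduler("noop deadline [cfq]")` returns `"cfq"`.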
Thank you for your comments.
I think it depends on the particular workload but, in general, consecutive synchronous I/O requests from the same task tend to exhibit locality. In your case, you would need to describe your workload clearly so that I can understand why. Anyway, every solution/mechanism has its limitations.
Yes, it depends on the workload. A scheduler may work well in one case and badly in another.
For the case I experienced, it is roughly as follows. A directory contains many (100K+) files of 1MB~5MB each. The process first A) reads somewhere between tens and 100 files into memory in parallel, using around 4-10 threads, and then B) processes them. The process repeats (A->B) until all files are processed.
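The shape of that workload can be sketched roughly as below. This is only an approximation for discussion, not the original program: the function names, the default batch size and thread count, and the placeholder `process` step are all assumptions.

```python
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path
from typing import Callable, Iterable, List


def read_file(path: Path) -> bytes:
    """Phase A unit of work: one blocking, synchronous whole-file read."""
    return path.read_bytes()


def run_batches(
    paths: Iterable[Path],
    batch_size: int = 100,
    threads: int = 8,
    process: Callable[[bytes], object] = len,
) -> List[object]:
    """Repeat (A -> B): read one batch in parallel, then process it."""
    all_paths = list(paths)
    results: List[object] = []
    with ThreadPoolExecutor(max_workers=threads) as pool:
        for start in range(0, len(all_paths), batch_size):
            batch = all_paths[start:start + batch_size]
            # Phase A: the worker threads block on reads; map() keeps order.
            blobs = list(pool.map(read_file, batch))
            # Phase B: no new reads are issued while the batch is processed.
            results.extend(process(blob) for blob in blobs)
    return results
```

Each worker thread blocks on its read and issues the next one only after the previous one returns, so the device sees a small number of synchronous request streams followed by a quiet period during phase B.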
The filesystem is ext4, on top of a RAID5 disk array. The filesystem type may matter here too, since the block layout matters. But there was no chance to test other filesystems, such as XFS.
For your specific workload, I guess the performance of AS might be affected by the following factors.
1, For each process, it might read files which are not stored consecutively on the HDD, so the anticipation in AS will be ineffective and have a bad impact.
2, It also depends on the prefetching mechanisms of the OS and hardware.
3, Another factor is how these threads work: a thread may be blocked after each read, so the anticipation in AS will be ineffective and have a bad impact.
If we want to learn the detailed reasons, we have to think about this problem from the angle of the whole system stack (application <-> OS <-> hardware). For virtualized systems, things change due to double scheduling, VM consolidation, and device virtualization problems.
3 is possibly the most significant factor for this workload, as the system has only one process doing intensive I/O and it is natural for programs to be implemented to wait on I/O.
Even though you have one process that creates multiple threads in user space, the Linux kernel regards these threads as different threads/tasks (they share the same thread group ID, tgid). That's why 1 is also a possible reason.
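This is easy to observe from user space. In the small sketch below (the function name `collect_ids` is made up for illustration), every thread reports the same process ID from `os.getpid()`, which on Linux corresponds to the thread group ID, while `threading.get_native_id()` (Python 3.8+) returns a different kernel-assigned thread ID for each thread.

```python
import os
import threading
from typing import List, Tuple


def collect_ids(n_threads: int = 4) -> List[Tuple[int, int]]:
    """Return one (process id, native thread id) pair per worker thread."""
    ids: List[Tuple[int, int]] = []
    lock = threading.Lock()
    # The barrier keeps all workers alive at the same moment, so the kernel
    # cannot hand a finished thread's ID to a later one.
    barrier = threading.Barrier(n_threads)

    def worker() -> None:
        barrier.wait()
        with lock:
            ids.append((os.getpid(), threading.get_native_id()))

    workers = [threading.Thread(target=worker) for _ in range(n_threads)]
    for t in workers:
        t.start()
    for t in workers:
        t.join()
    return ids
```

All pairs share the first element and differ in the second, which matches the point that the scheduler sees each thread as its own task.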
Agreed. The file writes went through several levels of system components (Windows programs write the files through Samba on Linux, through ext4, through a RAID5 card, to a disk array with 10+ disks), so the file blocks are very likely non-consecutive.
Yes, exactly. From my experience, a RAID controller improves I/O performance considerably, since I/O requests can be handled in parallel via multiple RAID controller channels. Besides, the system stack is much more complex in virtualized systems such as KVM, Xen, etc.