5 Comments

  1. This article describes a very interesting problem. Could you give more details on two parts that I did not quite understand from the current info:

    1. how was the conclusion drawn from the `dmesg` logs?

    2. how does the `x-data-plane` feature solve this problem? Some links to a good introduction to x-data-plane would be helpful.

  2. Q: how was the conclusion drawn from the `dmesg` logs?

    A: The normal way for a context switch to be triggered in the Linux kernel looks like the following trace, where the switch happens because the thread's ideal timeslice has been exhausted:


    [] dump_stack+0x64/0x84
    [] __schedule+0x561/0x900
    [] _cond_resched+0x2a/0x40
    [] kvm_arch_vcpu_ioctl_run+0xf54/0x12d0 [kvm]
    [] ? futex_wake+0x80/0x160
    [] ? kvm_arch_vcpu_load+0x4e/0x1b0 [kvm]
    [] kvm_vcpu_ioctl+0x3f7/0x560 [kvm]
    [] ? __dequeue_entity+0x30/0x50
    [] ? __switch_to+0x596/0x690
    [] do_vfs_ioctl+0x93/0x520
    [] ? SyS_futex+0x7d/0x170
    [] SyS_ioctl+0xa1/0xb0
    [] system_call_fastpath+0x1a/0x1f

    The abnormal path looks like the following (as shown in the article above), where the switch is triggered by lock contention:

    [] dump_stack+0x64/0x84
    [] __schedule+0x561/0x900
    [] schedule+0x29/0x70
    [] futex_wait_queue_me+0xd8/0x150
    [] futex_wait+0x1ab/0x2b0
    [] ? futex_wake+0x80/0x160
    [] ? __vmx_load_host_state+0x125/0x170 [kvm_intel]
    [] do_futex+0xf5/0xd20
    [] ? kvm_vcpu_ioctl+0x100/0x560 [kvm]
    [] ? __dequeue_entity+0x30/0x50
    [] ? do_vfs_ioctl+0x93/0x520
    [] ? native_write_msr_safe+0xa/0x10
    [] SyS_futex+0x7d/0x170
    [] ? fire_user_return_notifiers+0x42/0x50
    [] ? do_notify_resume+0xc5/0x100
    [] system_call_fastpath+0x1a/0x1f
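    One way to check which of the two paths is descheduling a vCPU thread is to record scheduler tracepoints with `perf` and inspect the call chains. This is a sketch, assuming `perf` is installed and you have root; the thread ID 12345 is a placeholder for the actual QEMU vCPU thread ID:

    ```shell
    # Record sched_switch events for the vCPU thread for 10 seconds,
    # with call graphs (-g) so the kernel path leading to the switch
    # is visible (replace 12345 with the real vCPU thread ID).
    perf record -e sched:sched_switch -g -t 12345 -- sleep 10

    # Dump the recorded events. Switches caused by lock contention
    # show futex_wait/futex_wait_queue_me in the call chain, while
    # timeslice-driven switches go through _cond_resched.
    perf script
    ```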

    Q: how does the `x-data-plane` feature solve this problem? Some links to a good introduction to x-data-plane would be helpful.

    A: Without the `x-data-plane` feature, QEMU's I/O worker threads and its vCPU threads share one global QEMU lock for synchronization in the QEMU main loop, which causes lock contention. With `x-data-plane`, QEMU creates a dedicated I/O thread outside the main loop to handle I/O requests, so it does not need to hold the QEMU global lock while servicing them.
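    For reference, `x-data-plane` is enabled per virtio-blk device on the QEMU command line. A minimal sketch, assuming a QEMU version from the 1.4 era (when the feature was experimental and required `cache=none`/`aio=native`); the image path, memory/CPU sizing, and device IDs are placeholders:

    ```shell
    # x-data-plane=on moves virtio-blk request handling out of the
    # global-mutex-protected main loop into a dedicated I/O thread.
    qemu-system-x86_64 -enable-kvm -m 4096 -smp 4 \
        -drive if=none,id=drive0,file=/path/to/disk.img,format=raw,cache=none,aio=native \
        -device virtio-blk-pci,drive=drive0,scsi=off,config-wce=off,x-data-plane=on
    ```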

    Good introduction links:
    [0] http://blog.vmsplice.net/2013/03/new-in-qemu-14-high-performance-virtio.html
    [1] http://events.linuxfoundation.org/sites/events/files/slides/CloudOpen2013_Khoa_Huynh_v3.pdf
    [2] https://access.redhat.com/documentation/en-US/Red_Hat_Enterprise_Linux/7/html/7.0_Release_Notes/chap-Red_Hat_Enterprise_Linux-7.0_Release_Notes-Virtualization.html
    [3] https://blueprints.launchpad.net/nova/+spec/add-virtio-data-plane-support-for-qemu

  3. Thanks for the replies!

    [] futex_wait_queue_me+0xd8/0x150
    [] futex_wait+0x1ab/0x2b0

    seem to be the key signature of the problem.

    The data-plane performance from the IBM report is quite amazing: 1.58 million IOPS for a single VM in 2013.

    1. Right, those are the key signs of this problem. The performance of I/O-intensive workloads really does improve. In my experience, with x-data-plane enabled in QEMU, the timeslice of the vCPU thread stays almost stable.
