Comment 33 for bug 1743637

Rafael David Tinoco (rafaeldtinoco) wrote :

Okay,

This was a little tricky. I could not get openvswitch + dpdk to work in Xenial, not sure why :\. I installed pike openvswitch + dpdk (from Cloud Archive) and tested it with Xenial's qemu (from -proposed, from this case) so I could test this logic (apart from final user confirmation):

https://pastebin.ubuntu.com/p/tnKMSWSQRD/

To make sure I was triggering the vhost-user logic I used systemtap, which didn't work either with the latest Xenial kernel I had (4.4.0-125-generic). So I backported systemtap from artful and got it to run a small stap script against the qemu-system-x86_64 binary:

https://pastebin.ubuntu.com/p/cYSD9mc4qM/

$ sudo stap -d /lib/x86_64-linux-gnu/libc-2.23.so -d kernel -g ./kvm-arch.stap

# during qemu startup (when vhost_user_start logic is triggered)

 0x5603689d7230 : net_vhost_user_event+0xc0/0x220 [/usr/bin/qemu-system-x86_64]
 0x5603689d7519 : net_init_vhost_user+0x159/0x270 [/usr/bin/qemu-system-x86_64]
 0x5603689d051e : net_client_init+0x14e/0x340 [/usr/bin/qemu-system-x86_64]
 0x5603689d07ac : net_init_netdev+0x2c/0x70 [/usr/bin/qemu-system-x86_64]
 0x560368a9681a : qemu_opts_foreach+0x6a/0xc0 [/usr/bin/qemu-system-x86_64]
 0x5603689d1197 : net_init_clients+0x67/0xe0 [/usr/bin/qemu-system-x86_64]
 0x560368792f46 : main+0xfe6/0x57a0 [/usr/bin/qemu-system-x86_64]
 0x7f57b3077830 : __libc_start_main+0xf0/0x1d0 [/lib/x86_64-linux-gnu/libc-2.23.so]
 0x560368798da9 : _start+0x29/0x30 [/usr/bin/qemu-system-x86_64]

# full shutdown, with the tap_cleanup and vhost_user_cleanup logic:

----
 0x559d0c93f4a0 : tap_cleanup+0x0/0xe0 [/usr/bin/qemu-system-x86_64]
 0x559d0c93a1c5 : qemu_del_net_client+0x1a5/0x210 [/usr/bin/qemu-system-x86_64]
 0x559d0c93afed : net_cleanup+0x1d/0x60 [/usr/bin/qemu-system-x86_64]
 0x7f9b84d80ff8 : __run_exit_handlers+0xe8/0x120 [/lib/x86_64-linux-gnu/libc-2.23.so]
 0x7f9b84d81045 : exit+0x15/0x20 [/lib/x86_64-linux-gnu/libc-2.23.so]
 0x7f9b84d67837 : __libc_start_main+0xf7/0x1d0 [/lib/x86_64-linux-gnu/libc-2.23.so]
 0x559d0c702da9 : _start+0x29/0x30 [/usr/bin/qemu-system-x86_64]
----
 0x559d0c940de0 : vhost_user_cleanup+0x0/0x40 [/usr/bin/qemu-system-x86_64]
 0x559d0c93a1c5 : qemu_del_net_client+0x1a5/0x210 [/usr/bin/qemu-system-x86_64]
 0x559d0c93afed : net_cleanup+0x1d/0x60 [/usr/bin/qemu-system-x86_64]
 0x7f9b84d80ff8 : __run_exit_handlers+0xe8/0x120 [/lib/x86_64-linux-gnu/libc-2.23.so]
 0x7f9b84d81045 : exit+0x15/0x20 [/lib/x86_64-linux-gnu/libc-2.23.so]
 0x7f9b84d67837 : __libc_start_main+0xf7/0x1d0 [/lib/x86_64-linux-gnu/libc-2.23.so]
 0x559d0c702da9 : _start+0x29/0x30 [/usr/bin/qemu-system-x86_64]
----
 0x559d0c782780 : vhost_user_cleanup+0x0/0x40 [/usr/bin/qemu-system-x86_64]
 0x559d0c78161c : vhost_dev_cleanup+0x8c/0xc0 [/usr/bin/qemu-system-x86_64]
 0x559d0c940df5 : vhost_user_cleanup+0x15/0x40 [/usr/bin/qemu-system-x86_64]
 0x559d0c93a1c5 : qemu_del_net_client+0x1a5/0x210 [/usr/bin/qemu-system-x86_64]
 0x559d0c93afed : net_cleanup+0x1d/0x60 [/usr/bin/qemu-system-x86_64]
 0x7f9b84d80ff8 : __run_exit_handlers+0xe8/0x120 [/lib/x86_64-linux-gnu/libc-2.23.so]
 0x7f9b84d81045 : exit+0x15/0x20 [/lib/x86_64-linux-gnu/libc-2.23.so]
 0x7f9b84d67837 : __libc_start_main+0xf7/0x1d0 [/lib/x86_64-linux-gnu/libc-2.23.so]
 0x559d0c702da9 : _start+0x29/0x30 [/usr/bin/qemu-system-x86_64]
----
 0x559d0c940de0 : vhost_user_cleanup+0x0/0x40 [/usr/bin/qemu-system-x86_64]
 0x559d0c93a1c5 : qemu_del_net_client+0x1a5/0x210 [/usr/bin/qemu-system-x86_64]
 0x559d0c93afed : net_cleanup+0x1d/0x60 [/usr/bin/qemu-system-x86_64]
 0x7f9b84d80ff8 : __run_exit_handlers+0xe8/0x120 [/lib/x86_64-linux-gnu/libc-2.23.so]
 0x7f9b84d81045 : exit+0x15/0x20 [/lib/x86_64-linux-gnu/libc-2.23.so]
 0x7f9b84d67837 : __libc_start_main+0xf7/0x1d0 [/lib/x86_64-linux-gnu/libc-2.23.so]
 0x559d0c702da9 : _start+0x29/0x30 [/usr/bin/qemu-system-x86_64]
----
 0x559d0c782780 : vhost_user_cleanup+0x0/0x40 [/usr/bin/qemu-system-x86_64]
 0x559d0c78161c : vhost_dev_cleanup+0x8c/0xc0 [/usr/bin/qemu-system-x86_64]
 0x559d0c940df5 : vhost_user_cleanup+0x15/0x40 [/usr/bin/qemu-system-x86_64]
 0x559d0c93a1c5 : qemu_del_net_client+0x1a5/0x210 [/usr/bin/qemu-system-x86_64]
 0x559d0c93afed : net_cleanup+0x1d/0x60 [/usr/bin/qemu-system-x86_64]
 0x7f9b84d80ff8 : __run_exit_handlers+0xe8/0x120 [/lib/x86_64-linux-gnu/libc-2.23.so]
 0x7f9b84d81045 : exit+0x15/0x20 [/lib/x86_64-linux-gnu/libc-2.23.so]
 0x7f9b84d67837 : __libc_start_main+0xf7/0x1d0 [/lib/x86_64-linux-gnu/libc-2.23.so]
 0x559d0c702da9 : _start+0x29/0x30 [/usr/bin/qemu-system-x86_64]

I have tested this start/stop logic tons of times in a loop and could not find any issue so far.
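
For anyone repeating this over many iterations, tallying the stap backtraces by their innermost frame is a quick way to check that each cleanup path actually fired. The following is only a minimal sketch, assuming the stap output has the same layout as the traces above (blocks separated by "----" lines). The helper name tally_top_frames and the embedded sample are hypothetical and were not part of the test I ran.

```python
import re
from collections import Counter

# Matches one stap backtrace frame as printed above, e.g.
#  0x559d0c93f4a0 : tap_cleanup+0x0/0xe0 [/usr/bin/qemu-system-x86_64]
FRAME_RE = re.compile(
    r"^\s*0x[0-9a-f]+ : ([\w.]+)\+0x[0-9a-f]+/0x[0-9a-f]+ \[(.+)\]$"
)


def tally_top_frames(text):
    """Count backtraces by the function named in their first (innermost) frame.

    Assumption: backtraces are separated by '----' lines or blank lines.
    """
    counts = Counter()
    at_block_start = True
    for line in text.splitlines():
        stripped = line.strip()
        if stripped == "----" or not stripped:
            # Separator: the next frame line starts a new backtrace.
            at_block_start = True
            continue
        match = FRAME_RE.match(line)
        if match is None:
            continue  # not a frame line (e.g. a '#' note), skip it
        if at_block_start:
            counts[match.group(1)] += 1
            at_block_start = False
    return counts


# Hypothetical sample, shortened from the shutdown traces above.
SAMPLE = """\
----
 0x559d0c93f4a0 : tap_cleanup+0x0/0xe0 [/usr/bin/qemu-system-x86_64]
 0x559d0c93a1c5 : qemu_del_net_client+0x1a5/0x210 [/usr/bin/qemu-system-x86_64]
----
 0x559d0c940de0 : vhost_user_cleanup+0x0/0x40 [/usr/bin/qemu-system-x86_64]
 0x559d0c93a1c5 : qemu_del_net_client+0x1a5/0x210 [/usr/bin/qemu-system-x86_64]
"""

if __name__ == "__main__":
    for name, hits in tally_top_frames(SAMPLE).items():
        print(name, hits)
```

Feeding it the captured stap output from a start/stop loop should give one count per probe hit, which makes a missing tap_cleanup or vhost_user_cleanup easy to spot.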

# vhost_user_stop logic is not triggered during shutdown. It's triggered only in case of a full vhost-user reset, which would mean that openvswitch has reset. In that case, libvirt would have to reconnect the vhostuser interface, and that functionality does not seem to be present in Xenial. The other possibility would be that the guests were configured using the vhostuser interface in "server" mode, but I didn't stress that option.

With all that, the cleanup logic seems good and does not generate any apparent segfaults for vhost-user interfaces. I can't fail to note, for any reader of this case, that I would strongly recommend using more recent versions of libvirt/qemu (than Xenial's) if using ovs with dpdk =).

My very best Regards
Rafael