qemu-system-x86_64/kvm-spice failed to boot a vm with apparmor enabled

Bug #1513367 reported by Xiang Hui
This bug affects 2 people

Affects: libvirt (Ubuntu)
Status: Fix Released
Importance: High
Assigned to: Chuck Short

Bug Description

[ENV]
Ubuntu 15.04
Linux 3.19.0-30-generic
libvirtd (libvirt) 1.2.12
QEMU emulator version 2.2.0 (Debian 1:2.2+dfsg-5expubuntu9.7)
qemu-kvm 1:2.2+dfsg-5expubuntu9.7
qemu-system-x86 1:2.2+dfsg-5expubuntu9.7
libvirt-bin 1.2.12-0ubuntu14.3
nova installed from git source stable/liberty
The cloud archives in use are the official ones for vivid.

[Note]
I am not the only one hitting this on 15.04: anyone trying to deploy OVS-DPDK enabled VMs on Ubuntu is blocked here, and Ubuntu trusty is known to be affected as well.
It looks like an apparmor-related issue.

[OVS-DISCUSS]
http://openvswitch.org/pipermail/discuss/2015-August/018560.html

This bug is separate from bug https://bugs.launchpad.net/ubuntu/+source/libvirt/+bug/1384532 to focus on the error.

I have redeployed the environment. This time the error is not about kvm-spice; qemu-system-x86_64 hits the same problem. Because /usr/bin/qemu-system-x86_64 is a binary, I cannot add strace to it. I also don't know why /usr/bin/kvm-spice was suddenly replaced by /usr/bin/qemu-system-x86_64.

2015-11-05 15:36:15.491 DEBUG nova.compute.utils [req-b292f304-014b-479f-af5d-38b96309f78f admin admin] [instance: 3dceb341-643d-492a-8a47-8154da341c02] internal error: Process exited prior to exec: libvirt: error : unable to set AppArmor profile 'libvirt-3dceb341-643d-492a-8a47-8154da341c02' for '/usr/bin/qemu-system-x86_64': No such file or directory
 from (pid=12236) notify_about_instance_usage /opt/stack/nova/nova/compute/utils.py:284
2015-11-05 15:36:15.492 DEBUG nova.compute.manager [req-b292f304-014b-479f-af5d-38b96309f78f admin admin] [instance: 3dceb341-643d-492a-8a47-8154da341c02] Build of instance 3dceb341-643d-492a-8a47-8154da341c02 was re-scheduled: internal error: Process exited prior to exec: libvirt: error : unable to set AppArmor profile 'libvirt-3dceb341-643d-492a-8a47-8154da341c02' for '/usr/bin/qemu-system-x86_64': No such file or directory

Let me know what the next step is for further analysis.

Tags: dpdk
Xiang Hui (xianghui) wrote :
Serge Hallyn (serge-hallyn) wrote :

I believe the 'no such file or directory' is what qemu is reporting about some device which openstack is trying to hand it.

Can you confirm that

/dev/hugepages/libvirt/qemu

exists (ls -l /dev/hugepages/libvirt)?

Try the following on your compute node to get strace output:

mv /usr/bin/qemu-system-x86_64 /usr/bin/qemu-system-x86_64.real
cat > /usr/bin/qemu-system-x86_64 << 'EOF'
#!/bin/sh
# wrapper: trace the real binary and pass every argument through unchanged
exec strace -f /usr/bin/qemu-system-x86_64.real "$@"
EOF
chmod ugo+x /usr/bin/qemu-system-x86_64

Hopefully the strace output will show up in the instance-0000001e.log file and tell us which file did not exist.

(After the experiment, please do

mv /usr/bin/qemu-system-x86_64 /usr/bin/qemu-system-x86_64.wrap
mv /usr/bin/qemu-system-x86_64.real /usr/bin/qemu-system-x86_64

to re-set the system to its original state)

Serge Hallyn (serge-hallyn) wrote :

Actually it seems reasonably likely that your problem is with:

-smbios type=1,manufacturer=OpenStack Foundation,product=OpenStack Nova,version=12.0.0,serial=e87d7510-5766-e35e-8016-ebeb55d7deff,uuid=3dceb341-643d-492a-8a47-8154da341c02,family=Virtual Machine

because the smbios has spaces in the field values.
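
Whether spaces in the -smbios values can matter depends on how the arguments reach qemu. The following is a minimal sketch, assuming the caller hands over each option value as a single argv element and that a shell wrapper may sit in between; `count_args`, `forward_unquoted`, and `forward_quoted` are hypothetical helper names used only for illustration.

```shell
#!/bin/sh
# Hypothetical helpers: show how a shell wrapper can change the argument count.

# Print how many arguments were received.
count_args() { echo "$#"; }

# Forward with unquoted $*: values are re-split on whitespace.
forward_unquoted() { count_args $*; }

# Forward with "$@": every argument is passed through unchanged.
forward_quoted() { count_args "$@"; }

forward_unquoted -smbios 'type=1,manufacturer=OpenStack Foundation'   # prints 3
forward_quoted   -smbios 'type=1,manufacturer=OpenStack Foundation'   # prints 2
```

With `"$@"` the value containing a space stays one argument, so spaces in the field values would only become a problem if something on the way re-splits the command line.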

Xiang Hui (xianghui) wrote :

@Serge

After using qemu-system-x86_64 with strace, the error is as below:
2015-11-06 15:55:12.681 ERROR nova.compute.manager [req-b2e4d8e4-70d2-40b7-814c-409ae1720729 None None] Error updating resources for node panghua-CS24-TY: internal error: Child process (LC_ALL=C PATH=/usr/local/sbin:/usr/local/bin:/sbin:/bin:/usr/sbin:/usr/bin /usr/bin/qemu-system-x86_64 -help) unexpected exit status 1: execve("/usr/bin/qemu-system-x86_64", ["/usr/bin/qemu-system-x86_64", "-help"], [/* 3 vars */]) = 0
brk(0) = 0x7f6b6d250000
access("/etc/ld.so.nohwcap", F_OK) = -1 ENOENT (No such file or directory)
mmap(NULL, 8192, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7f6b6bf56000
access("/etc/ld.so.preload", R_OK) = -1 ENOENT (No such file or directory)
open("/etc/ld.so.cache", O_RDONLY|O_CLOEXEC) = 4
fstat(4, {st_mode=S_IFREG|0644, st_size=95906, ...}) = 0
mmap(NULL, 95906, PROT_READ, MAP_PRIVATE, 4, 0) = 0x7f6b6bf3e000
close(4) = 0
access("/etc/ld.so.nohwcap", F_OK) = -1 ENOENT (No such file or directory)
open("/lib/x86_64-linux-gnu/libc.so.6", O_RDONLY|O_CLOEXEC) = 4
read(4, "\177ELF\2\1\1\3\0\0\0\0\0\0\0\0\3\0>\0\1\0\0\0`\v\2\0\0\0\0\0"..., 832) = 832
fstat(4, {st_mode=

I also spawned a VM with security_driver=None set in the qemu configuration. The log below, instance-00000027.log, is from that successful boot; compared with instance-0000001e.log, essentially only the UUIDs and names differ, and the smbios configuration is the same.

2015-11-06 09:29:41.882+0000: starting up
LC_ALL=C PATH=/usr/local/sbin:/usr/local/bin:/sbin:/bin:/usr/sbin:/usr/bin QEMU_AUDIO_DRV=none /usr/bin/qemu-system-x86_64 -name instance-00000027 -S -machine pc-i440fx-utopic,accel=kvm,usb=off -m 2048 -realtime mlock=off -smp 1,sockets=1,cores=1,threads=1 -object memory-backend-file,prealloc=yes,mem-path=/dev/hugepages/libvirt/qemu,share=on,size=2048M,id=ram-node0,host-nodes=0,policy=bind -numa node,nodeid=0,cpus=0,memdev=ram-node0 -uuid 6c80c9ec-4445-4101-99c9-6339cb2f56a9 -smbios type=1,manufacturer=OpenStack Foundation,product=OpenStack Nova,version=12.0.0,serial=e87d7510-5766-e35e-8016-ebeb55d7deff,uuid=6c80c9ec-4445-4101-99c9-6339cb2f56a9,family=Virtual Machine -no-user-config -nodefaults -chardev socket,id=charmonitor,path=/var/lib/libvirt/qemu/instance-00000027.monitor,server,nowait -mon chardev=charmonitor,id=monitor,mode=control -rtc base=utc,driftfix=slew -global kvm-pit.lost_tick_policy=discard -no-hpet -no-shutdown -boot strict=on -device piix3-usb-uhci,id=usb,bus=pci.0,addr=0x1.0x2 -drive file=/opt/stack/data/nova/instances/6c80c9ec-4445-4101-99c9-6339cb2f56a9/disk,if=none,id=drive-virtio-disk0,format=qcow2,cache=none -device virtio-blk-pci,scsi=off,bus=pci.0,addr=0x4,drive=drive-virtio-disk0,id=virtio-disk0,bootindex=1 -drive file=/opt/stack/data/nova/instances/6c80c9ec-4445-4101-99c9-6339cb2f56a9/disk.config,if=none,id=drive-ide0-1-1,readonly=on,format=raw,cache=none -device ide-cd,bus=ide.1,unit=1,drive=drive-ide0-1-1,id=ide0-1-1 -chardev socket,id=charnet0,path=/var/run/openvswitch/vhu5392206b-dc -netdev type=vhost-user,id=hostnet0,chardev=charnet0 -device virtio-net-pci,netdev=hostnet0,id=net0,mac=fa:16:3e:e5:41:f1,bus=pci.0...

Serge Hallyn (serge-hallyn) wrote :

Is that really the only strace output you saw?

Serge Hallyn (serge-hallyn) wrote :

Can you show which libvirt version you are using?

Can you show the results of:

ls -l /etc/apparmor.d/libvirt
ls -l /proc /proc/self /proc/self/attr

And then the following manual test:

cd /tmp
cat > testprofile << EOF
        profile i_cant_be_trusted_anymore {
            /etc/ld.so.cache mr,
            /lib/ld-*.so* mrix,
            /lib/libc*.so* mr,
            /usr/bin/head ix,
        }
EOF
cat > aa_change_profile.c << 'EOF'
#include <stdio.h>
#include <stdlib.h>
#include <errno.h>
#include <sys/apparmor.h>

int main()
{
 errno = 0;
 int ret = aa_change_profile("i_cant_be_trusted_anymore");
 printf("aa_change_profile returned %d %d\n", ret, errno);
 ret = system("/bin/bash");
 printf("bash returned %d %d\n", ret, errno);
}
EOF

apparmor_parser /tmp/testprofile
sudo apt-get -y install libapparmor-dev
gcc -o aa_change_profile aa_change_profile.c -lapparmor
sudo ./aa_change_profile
sudo strace -f ./aa_change_profile
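
Since aa_change_profile is documented to fail with ENOENT when the named profile does not exist, it may help to confirm that the test profile was actually loaded before running the binary. This is a sketch only, assuming the kernel lists loaded profiles as `name (mode)` lines in /sys/kernel/security/apparmor/profiles; `profile_loaded` is a hypothetical helper, not part of any AppArmor tool.

```shell
#!/bin/sh
# Hypothetical helper: succeed if profile "$1" appears in a profiles listing
# read from stdin (assumed format: one "name (mode)" entry per line).
profile_loaded() {
    awk -v want="$1" '
        { sub(/ \([a-z]+\)$/, "") }      # drop the trailing " (mode)"
        $0 == want { found = 1 }
        END { exit !found }
    '
}

# Example (reading the securityfs file normally needs root):
#   sudo cat /sys/kernel/security/apparmor/profiles \
#       | profile_loaded i_cant_be_trusted_anymore && echo loaded || echo "not loaded"
```

The same check could be pointed at a `libvirt-$uuid` profile name while a failing instance is being started.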

Xiang Hui (xianghui) wrote :

All the outputs are uploaded in 1513367-20151107.tar.gz

Xiang Hui (xianghui) wrote :

@Serge

Hello, are the uploaded logs enough for you? Let me know if you need more, thanks.

Serge Hallyn (serge-hallyn) wrote :

Hi Chuck,

could you request whatever openstack config info we'd need to reproduce this?

Changed in libvirt (Ubuntu):
importance: Undecided → High
assignee: nobody → Chuck Short (zulcss)
Xiang Hui (xianghui)
tags: added: dpdk
Chuck Short (zulcss) wrote :

Can you attach your libvirt xml file for the domain and your nova-compute logs please?

Thanks
chuck

Xiang Hui (xianghui) wrote :

@Chuck

There is no libvirt xml file because the vm ultimately failed to spawn, and the content in the bug description is the exact error reported by the nova-compute node, thanks.

sean mooney (sean-k-mooney) wrote :

xianghui, if you look in the nova compute log, it will contain a full copy of the libvirt xml that it tried to boot the vm with.

Xiang Hui (xianghui) wrote :

@sean You are correct, I am uploading the fresh libvirt xml file and part of the nova-compute related logs.

Serge Hallyn (serge-hallyn) wrote :

Hi,

I just tried to reproduce this locally using your xml file, but all seemed fine. So it doesn't seem like there are any problematic filenames.

Ideally, we could stop nova from cleaning up at the failure point, and then check at that point for the existence of

/usr/bin/kvm-spice

and the contents of

/etc/apparmor.d/libvirt/libvirt-$uuid.files

and the existence of every disk file listed in the xml file.
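
The checks described above could be scripted roughly as follows. This is a sketch only, assuming the .files profile holds one rule per line in the form `"/path" perms,` or `/path perms,` (optionally prefixed with `owner`); `profile_paths` and `missing_paths` are hypothetical helpers.

```shell
#!/bin/sh
# Hypothetical helpers for inspecting a libvirt-$uuid.files profile.

# Read profile text on stdin and print the path of every rule.
# Assumed rule shapes:  "/some/path" rw,   or   /some/path rw,
profile_paths() {
    sed -n \
        -e 's/^[[:space:]]*//' \
        -e 's/^owner[[:space:]]\{1,\}//' \
        -e '/^#/d' \
        -e 's|^"\([^"]*\)"[[:space:]].*,$|\1|p' \
        -e 's|^\(/[^[:space:]]*\)[[:space:]].*,$|\1|p'
}

# Read paths on stdin and print those that do not exist.
# Paths containing glob characters are skipped, since they name patterns.
missing_paths() {
    while IFS= read -r p; do
        case $p in
            *\**|*\?*|*\[*|*\{*) continue ;;
        esac
        [ -e "$p" ] || printf '%s\n' "$p"
    done
}

# Example, with $uuid set to the instance UUID:
#   profile_paths < "/etc/apparmor.d/libvirt/libvirt-$uuid.files" | missing_paths
#   [ -e /usr/bin/kvm-spice ] || echo "/usr/bin/kvm-spice is missing"
```

Any path printed by `missing_paths` is a candidate for the file behind the "No such file or directory" message, though the script cannot tell which one libvirt actually tripped over.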

To recap - this happens to you always, or only occasionally? Always on the same host?

Can you edit the bug description at top to point out your Ubuntu release, ppas/cloud-archives in use, and libvirt, nova and qemu package versions?

Chuck, do you have any other ideas?

Xiang Hui (xianghui) wrote :

@Serge

"this happens to you always, or only occasionally? Always on the same host?"
 - Yes, it always happens, and not just to me: it happens to anyone who uses Ubuntu to deploy OVS-DPDK enabled vms, so it is fairly critical.

Discuss email
http://openvswitch.org/pipermail/discuss/2015-August/018560.html

The difference when you try to spawn vms from the xml file might be that, for OVS-DPDK, we use the qemu vhost-user feature and hugepages.

Xiang Hui (xianghui)
description: updated
James Page (james-page) wrote :

Two observations after discussing with Hui on IRC:

1) Hugepage filesystem

Right now, the apparmor profile only allows access to:

   # for access to hugepages
  owner "/run/hugepages/kvm/libvirt/qemu/**" rw,

If the hugepage FS is mounted elsewhere, any hugepage access will be blocked by apparmor.

The fact that the rule also specifies a subdirectory may also create problems, but I'm not 100% sure on that (it depends on how dpdk shares hugepage memory with the guest device, I think).

2) vhost-user device access

The configuration for the vhost-user device created in OVS will also be blocked by apparmor:

  -chardev socket,id=charnet0,path=/var/run/openvswitch/vhu5392206b-dc -netdev type=vhost-user,id=hostnet0,chardev=charnet0 -device virtio-net-pci,netdev=hostnet0,id=net0,mac=fa:16:3e:e5:41:f1,bus=pci.0,addr=0x3

I'm assuming these will always be located in /var/run/openvswitch - but that's probably a little too generic for an apparmor rule - do they always follow a particular naming convention?
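
To make the first observation concrete, the following sketch compares the memory path from the qemu command line in this report with the directory the quoted rule allows. It assumes a plain prefix test is a fair stand-in for the AppArmor glob `<dir>/**`; `hugetlbfs_mounts` and `path_under` are hypothetical helpers, and `guest-file` is a made-up file name.

```shell
#!/bin/sh
# Hypothetical helpers for comparing the hugetlbfs location with the
# directory allowed by the profile rule.

# Print the mount point of every hugetlbfs entry in a /proc/mounts-style
# listing read from stdin (fields: device, mount point, fstype, ...).
hugetlbfs_mounts() { awk '$3 == "hugetlbfs" { print $2 }'; }

# Succeed if path "$1" lies below directory "$2". This prefix test only
# approximates the AppArmor glob "<dir>/**".
path_under() {
    case $1 in
        "$2"/*) return 0 ;;
        *)      return 1 ;;
    esac
}

allowed=/run/hugepages/kvm/libvirt/qemu      # directory named in the profile rule
mem_path=/dev/hugepages/libvirt/qemu         # mem-path from the qemu command line
if path_under "$mem_path/guest-file" "$allowed"; then
    echo "covered by the rule"
else
    echo "not covered by the rule"
fi

# To see where hugetlbfs is really mounted on a host:
#   hugetlbfs_mounts < /proc/mounts
```

With the two paths from this report the script prints "not covered by the rule", which matches the suspicion that a hugepage mount outside /run/hugepages/kvm is blocked.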

James Page (james-page) wrote :

Took a look at /var/run/openvswitch:

-rw-r--r-- 1 root root 6 Dec 9 12:09 ovsdb-server.pid
srwx------ 1 root root 0 Dec 9 12:09 db.sock
srwx------ 1 root root 0 Dec 9 12:09 ovsdb-server.20518.ctl
-rw-r--r-- 1 root root 6 Dec 9 12:09 ovs-vswitchd.pid
srwx------ 1 root root 0 Dec 9 12:09 ovs-vswitchd.20528.ctl
srwx------ 1 root root 0 Dec 9 12:09 br-int.snoop
srwx------ 1 root root 0 Dec 9 12:09 br-int.mgmt
srwx------ 1 root root 0 Dec 9 12:12 br-ex.snoop
srwx------ 1 root root 0 Dec 9 12:12 br-ex.mgmt
srwx------ 1 root root 0 Dec 9 12:12 br-data.snoop
srwx------ 1 root root 0 Dec 9 12:12 br-data.mgmt
srwx------ 1 root root 0 Dec 9 12:12 br-tun.snoop
srwx------ 1 root root 0 Dec 9 12:12 br-tun.mgmt

Definitely don't want to open up access to all of those!
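
As a rough illustration of the naming question, the sketch below separates a vhost-user socket name from the control sockets in the listing. It assumes every vhost-user socket name starts with `vhu`, as `vhu5392206b-dc` does, which this thread does not confirm; `is_vhostuser_name` is a hypothetical helper.

```shell
#!/bin/sh
# Hypothetical helper: succeed only for names that look like vhost-user
# sockets, ASSUMING they all start with "vhu" as in vhu5392206b-dc.
is_vhostuser_name() {
    case $1 in
        vhu*) return 0 ;;
        *)    return 1 ;;
    esac
}

for name in db.sock br-int.mgmt ovs-vswitchd.pid vhu5392206b-dc; do
    if is_vhostuser_name "$name"; then
        echo "$name: vhost-user socket candidate"
    fi
done
```

Only the `vhu5392206b-dc` entry is reported; the ovsdb and bridge management sockets are left out.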

James Page (james-page) wrote :

Linking to bug 1524737 - systemd behaves a bit differently with regards to hugepages so we might want to update libvirt's apparmor rules to deal with that

James Page (james-page) wrote :

For anyone experiencing this problem - any DENIED or COMPLAIN messages from syslog/kern.log would be highly useful to help generate an update to the apparmor profile for libvirt-qemu.
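
One possible way to collect those messages is sketched below. It assumes the kernel audit lines carry `apparmor="DENIED"` followed by `profile="..."` and `name="..."` fields in that order; `denied_paths` is a hypothetical helper.

```shell
#!/bin/sh
# Hypothetical helper: read kernel log text on stdin and print
# "<profile> <path>" for every AppArmor denial that names a path.
# Assumes audit lines contain apparmor="DENIED", then profile="..." and
# name="..." fields in that order.
denied_paths() {
    grep 'apparmor="DENIED"' \
        | sed -n 's/.*profile="\([^"]*\)".* name="\([^"]*\)".*/\1 \2/p'
}

# Example:
#   denied_paths < /var/log/kern.log | sort | uniq -c
```

Denials without a `name=` field (for example capability denials) are dropped by this filter, so the raw lines are still worth attaching.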

Serge Hallyn (serge-hallyn) wrote : Re: [Bug 1513367] Re: qemu-system-x86_64/kvm-spice failed to boot a vm with apparmor enabled

Quoting James Page (<email address hidden>):
> 2) vhost-user device access
>
> The configuration for the vhost-user device created in OVS will also be
> blocked by apparmor:
>
> -chardev socket,id=charnet0,path=/var/run/openvswitch/vhu5392206b-dc
> -netdev type=vhost-user,id=hostnet0,chardev=charnet0 -device virtio-net-
> pci,netdev=hostnet0,id=net0,mac=fa:16:3e:e5:41:f1,bus=pci.0,addr=0x3
>
> I'm assuming these will always be located in /var/run/openvswitch - but
> that's probably a little too generic for an apparmor rule - do they
> always follow a particular naming convention?

virt-aa-helper should be providing access for this one, not a blanket
allow rule.

Stefan Bader (smb)
Changed in libvirt (Ubuntu):
status: New → Triaged
Serge Hallyn (serge-hallyn) wrote :

Hi,

Could someone who can reproduce this problem try adding:

/var/run/** r,

to the file /etc/apparmor.d/usr.lib.libvirt.virt-aa-helper

and see whether that solves the problem?

James Page (james-page) wrote :

FWIW I'm testing on Xenial with the latest libvirt packages for Ubuntu; the generated apparmor profile .files file for my instances correctly grants access to /var/run/openvswitch/<vhostusersocket>:

  "/run/openvswitch/vhu8b11d723-35" rw,
  /dev/vhost-net rw,

Remaining problem is that with the default libvirt user/group for qemu processes, the qemu instance can't actually read/write the vhostuser socket - switching to root/root fixes this problem but does result in all qemu processes running as the root user which is less than ideal.
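
The ownership problem can be inspected with a one-line helper. This is a sketch, assuming GNU coreutils `stat` (the `-c` format option); `path_perms` is a hypothetical name.

```shell
#!/bin/sh
# Hypothetical helper: print "owner:group mode" for a path.
# Assumes GNU coreutils stat, which provides the -c format option.
path_perms() { stat -c '%U:%G %a' -- "$1"; }

# Example, using the socket path from the generated rule:
#   path_perms /run/openvswitch/vhu8b11d723-35
```

Connecting to a Unix socket requires write permission on it, so a socket that is writable only by its owner cannot be used by a qemu process running as a different user, even when the apparmor rule grants `rw`.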

Serge Hallyn (serge-hallyn) wrote :

@james-page - that is the problem cpaelzer is working on, right? What should be done with this bug? Is there another bug which this can be dup'ed to, or should we turn this into a bug to track his dpdk/qemu/libvirt work?

Launchpad Janitor (janitor) wrote :

This bug was fixed in the package libvirt - 1.3.1-1ubuntu9

---------------
libvirt (1.3.1-1ubuntu9) xenial; urgency=medium

  * Remove the tasks limit on libvirt-bin service (LP: #1567381)
    This should be un-done when it is properly fixed in the code so
    that virtual machines are started in their own pids cgroup.

 -- Serge Hallyn <email address hidden> Thu, 07 Apr 2016 10:05:01 -0500

Changed in libvirt (Ubuntu):
status: Triaged → Fix Released