[Ubuntu 18.04] [libvirt] virsh restore fails from state file saved in /var/tmp folder using virsh save

Bug #1719579 reported by bugproxy
This bug affects 1 person
Affects | Status | Importance | Assigned to | Milestone
The Ubuntu-power-systems project | Fix Released | High | Unassigned |
apparmor (Ubuntu) | Invalid | High | Unassigned |
libvirt (Ubuntu) | Fix Released | High | Unassigned |

Bug Description

== Comment: #1 - SEETEENA THOUFEEK <email address hidden> - 2017-01-17 00:09:16 ==
Bala, Please mail me the machine information.

== Comment: #3 - SEETEENA THOUFEEK <email address hidden> - 2017-01-17 02:14:06 ==
2017-01-16 12:09:37.707+0000: 7024: info : virSecurityDACRestoreFileLabelInternal:388 : Restoring DAC user and group on '/var/tmp/bala'
2017-01-16 12:09:37.707+0000: 7024: info : virSecurityDACSetOwnershipInternal:290 : Setting DAC user and group on '/var/tmp/bala' to '0:0'
2017-01-16 12:09:37.707+0000: 7024: warning : qemuDomainSaveImageStartVM:6750 : failed to restore save state label on /var/tmp/bala
2017-01-16 12:09:37.707+0000: 7024: info : virObjectUnref:259 : OBJECT_UNREF: obj=0x3fff4ca62b00
2017-01-16 12:09:37.707+0000: 7024: debug : qemuDomainObjEndAsyncJob:1848 : Stopping async job: start (vm=0x3fff4ca535c0 name=virt-tests-vm1-bala)
2017-01-16 12:09:37.707+0000: 7024: info : virObjectRef:296 : OBJECT_REF: obj=0x3fff4ca62b00
2017-01-16 12:09:37.707+0000: 7024: info : virObjectUnref:259 : OBJECT_UNREF: obj=0x3fff4ca62b00
2017-01-16 12:09:37.707+0000: 7024: info : virObjectUnref:259 : OBJECT_UNREF: obj=0x3fff4ca535c0
2017-01-16 12:09:37.707+0000: 7024: debug : virThreadJobClear:121 : Thread 7024 (virNetServerHandleJob) finished job remoteDispatchDomainRestore with ret=-1
2017-01-16 12:09:37.707+0000: 7024: info : virObjectUnref:259 : OBJECT_UNREF: obj=0x3fff7c002c10
2017-01-16 12:09:37.707+0000: 7024: debug : virNetServerProgramSendError:153 : prog=536903814 ver=1 proc=54 type=1 serial=4 msg=0x100133d2590 rerr=0x3fffa59be3c0
2017-01-16 12:09:37.707+0000: 7024: debug : virNetMessageEncodePayload:376 : Encode length as 172
2017-01-16 12:09:37.707+0000: 7024: debug : virNetServerClientSendMessageLocked:1399 : msg=0x100133d2590 proc=54 len=172 offset=0
2017-01-16 12:09:37.707+0000: 7024: info : virNetServerClientSendMessageLocked:1407 : RPC_SERVER_CLIENT_MSG_TX_QUEUE: client=0x100133d23c0 len=172 prog=536903814 vers=1 proc=54 type=1 status=1 serial=4
2017-01-16 12:09:37.707+0000: 7024: debug : virNetServerClientCalculateHandleMode:157 : tls=(nil) hs=-1, rx=0x100133d0670 tx=0x100133d2590
2017-01-16 12:09:37.707+0000: 7024: debug : virNetServerClientCalculateHandleMode:192 : mode=3
2017-01-16 12:09:37.707+0000: 7024: info : virEventPollUpdateHandle:152 : EVENT_POLL_UPDATE_HANDLE: watch=417 events=3
2017-01-16 12:09:37.707+0000: 7024: debug : virEventPollInterruptLocked:727 : Interrupting
2017-01-16 12:09:37.707+0000: 7024: info : virObjectUnref:259 : OBJECT_UNREF: obj=0x3fff7c002c10
2017-01-16 12:09:37.707+0000: 7024: info : virObjectUnref:259 : OBJECT_UNREF: obj=0x100133caea0
2017-01-16 12:09:37.707+0000: 7024: info : virObjectUnref:259 : OBJECT_UNREF: obj=0x100133d23c0
.
2017-01-16 12:14:28.445+0000: 7019: info : qemuMonitorJSONIOProcessLine:201 : QEMU_MONITOR_RECV_EVENT: mon=0x3fff94004d90 event={"timestamp": {"seconds": 1484568868, "microseconds": 444620}, "event": "MIGRATION", "data": {"status": "failed"}}
2017-01-16 12:14:28.445+0000: 7019: debug : qemuMonitorJSONIOProcessEvent:147 : mon=0x3fff94004d90 obj=0x100133b5670
2017-01-16 12:14:28.445+0000: 7019: debug : virJSONValueToString:1762 : object=0x100133a8000
2017-01-16 12:14:28.445+0000: 7019: debug : virJSONValueToStringOne:1691 : object=0x100133a8000 type=0 gen=0x100133d1160
2017-01-16 12:14:28.445+0000: 7019: debug : virJSONValueToStringOne:1691 : object=0x100133d2a80 type=2 gen=0x100133d1160
2017-01-16 12:14:28.445+0000: 7019: debug : virJSONValueToString:1795 : result={"status":"failed"}
2017-01-16 12:14:28.445+0000: 7019: debug : qemuMonitorEmitEvent:1218 : mon=0x3fff94004d90 event=MIGRATION
2017-01-16 12:14:28.445+0000: 7019: info : virObjectRef:296 : OBJECT_REF: obj=0x3fff94004d90
2017-01-16 12:14:28.445+0000: 7019: debug : qemuProcessHandleEvent:629 : vm=0x3fff4ca535c0
2017-01-16 12:14:28.445+0000: 7019: info : virObjectNew:202 : OBJECT_NEW: obj=0x100133d2870 classname=virDomainQemuMonitorEvent
2017-01-16 12:14:28.445+0000: 7019: debug : virObjectEventNew:645 : obj=0x100133d2870
2017-01-16 12:14:28.445+0000: 7019: info : virObjectUnref:259 : OBJECT_UNREF: obj=0x100133d2870
2017-01-16 12:14:28.445+0000: 7019: info : virObjectUnref:261 : OBJECT_DISPOSE: obj=0x100133d2870
2017-01-16 12:14:28.445+0000: 7019: debug : virDomainQemuMonitorEventDispose:477 : obj=0x100133d2870
2017-01-16 12:14:28.445+0000: 7019: debug : virObjectEventDispose:121 : obj=0x100133d2870
2017-01-16 12:14:28.445+0000: 7019: info : virObjectUnref:259 : OBJECT_UNREF: obj=0x3fff94004d90
2017-01-16 12:14:28.445+0000: 7019: debug : qemuMonitorJSONIOProcessEvent:172 : handle MIGRATION handler=0x3fff9d7247e0 data=0x100133a8000
2017-01-16 12:14:28.445+0000: 7019: debug : qemuMonitorEmitMigrationStatus:1488 : mon=0x3fff94004d90, status=failed
2017-01-16 12:14:28.445+0000: 7019: info : virObjectRef:296 : OBJECT_REF: obj=0x3fff94004d90
2017-01-16 12:14:28.445+0000: 7019: debug : qemuProcessHandleMigrationStatus:1502 : Migration of domain 0x3fff4ca535c0 virt-tests-vm1-bala changed state to failed
2017-01-16 12:14:28.445+0000: 7019: info : virObjectUnref:259 : OBJECT_UNREF: obj=0x3fff94004d90
2017-01-16 12:14:28.445+0000: 7019: debug : qemuMonitorJSONIOProcess:255 : Total used 232 bytes out of 232 available in buffer
2017-01-16 12:14:28.445+0000: 7019: info : virEventPollUpdateHandle:152 : EVENT_POLL_UPDATE_HANDLE: watch=430 events=13
2017-01-16 12:14:28.445+0000: 7023: error : qemuMigrationCheckJobStatus:2641 : operation failed: job: unexpectedly failed

this is an apparmor issue and there is no libvirt bug here.

bugproxy (bugproxy)
tags: added: architecture-ppc64le bugnameltc-150626 severity-high targetmilestone-inin16043
Changed in ubuntu:
assignee: nobody → Ubuntu on IBM Power Systems Bug Triage (ubuntu-power-triage)
affects: ubuntu → apparmor (Ubuntu)
Frank Heimes (fheimes) wrote :

The title says 16.04.2 but the ticket is tagged as 16.04.3.
Did you see this happen on .2 or .3?
In case you are on .2 please do an upgrade to .3 and try again on the latest level.

no longer affects: apparmor
Changed in ubuntu-power-systems:
importance: Undecided → High
assignee: nobody → Canonical Server Team (canonical-server)
Christian Ehrhardt  (paelzer) wrote :

I like "this is an apparmor issue and there is no libvirt bug here" :-)
But if so - or even if not - please provide the dmesg with the related apparmor denies so that we can sort out what happens.

Assuming it is apparmor as reported, I am assigning John Johansen instead of me, but I am subscribing myself in case it comes back to libvirt to fix it.

Christian Ehrhardt  (paelzer) wrote :

@Frank - I'm not allowed to change this, could you please re-assign?

Frank Heimes (fheimes)
Changed in ubuntu-power-systems:
assignee: Canonical Server Team (canonical-server) → John Johansen (jjohansen)
Seth Arnold (seth-arnold) wrote :

Someone may have to actually describe what's going on...

bugproxy (bugproxy) wrote : Sosreport

------- Comment (attachment only) From <email address hidden> 2017-09-27 05:34 EDT-------

bugproxy (bugproxy) wrote : Comment bridged from LTC Bugzilla

------- Comment From <email address hidden> 2017-09-28 04:07 EDT-------
(In reply to comment #38)
> Created attachment 121204 [details]
> Sosreport

Portion of audit log:

[4159727.044528] audit: type=1400 audit(1506504293.943:82): apparmor="STATUS" operation="profile_replace" profile="unconfined" name="libvirt-ee1b97ba-b85c-46ae-97a1-1b45afe8266c" pid=26937 comm="apparmor_parser"
[4159727.059543] audit: type=1400 audit(1506504293.959:83): apparmor="STATUS" operation="profile_replace" profile="unconfined" name="libvirt-ee1b97ba-b85c-46ae-97a1-1b45afe8266c//qemu_bridge_helper" pid=26937 comm="apparmor_parser"
[4159727.322330] KVM guest htab at c000003cbc000000 (order 28), LPID 2
[4159727.344133] audit: type=1400 audit(1506504294.243:84): apparmor="DENIED" operation="open" profile="libvirt-ee1b97ba-b85c-46ae-97a1-1b45afe8266c" name="/etc/gss/mech.d/" pid=26939 comm="qemu-system-ppc" requested_mask="r" denied_mask="r" fsuid=64055 ouid=0

Attaching sosreport with this Bugzilla.

Christian Boltz (cboltz) wrote : Re: [Ubuntu 16.04.2] [libvirt] virsh restore fails from state file saved in /var/tmp folder using virsh save

You'll need to allow
    /etc/gss/mech.d/ r,

and after that, I wouldn't be surprised if you get denials for files inside this directory ;-)
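
For anyone triaging similar denials, here is a minimal Python sketch of how the DENIED line from the audit log above could be split into fields and turned into a candidate rule. The function names are hypothetical and not part of any AppArmor tooling, and the generated rule is only a starting point that still needs a human review before it goes into a profile.

```python
import re

# Matches key="value" and key=value pairs as they appear in kernel audit lines.
_FIELD = re.compile(r'(\w+)=("([^"]*)"|\S+)')

# The DENIED line quoted from the audit log in this bug.
SAMPLE = (
    '[4159727.344133] audit: type=1400 audit(1506504294.243:84): '
    'apparmor="DENIED" operation="open" '
    'profile="libvirt-ee1b97ba-b85c-46ae-97a1-1b45afe8266c" '
    'name="/etc/gss/mech.d/" pid=26939 comm="qemu-system-ppc" '
    'requested_mask="r" denied_mask="r" fsuid=64055 ouid=0'
)


def parse_apparmor_audit(line):
    """Return a dict of the key=value fields found in one audit log line."""
    fields = {}
    for key, raw, quoted in _FIELD.findall(line):
        # Strip the surrounding quotes when the value was quoted.
        fields[key] = quoted if raw.startswith('"') else raw
    return fields


def candidate_rule(fields):
    """Build a candidate profile rule for a denied file access, else None."""
    if fields.get("apparmor") != "DENIED" or "denied_mask" not in fields:
        return None  # e.g. STATUS lines or change_profile denials
    return "{} {},".format(fields["name"], fields["denied_mask"])


if __name__ == "__main__":
    print(candidate_rule(parse_apparmor_audit(SAMPLE)))  # /etc/gss/mech.d/ r,
```

Lines without a denied_mask (such as profile_replace status messages) yield no rule, so they are skipped rather than guessed at.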

Manoj Iyer (manjo)
tags: added: triage-g
Manoj Iyer (manjo) wrote :

Please retest with 16.04.3 and also consider the setup proposed in comment #7.

Changed in apparmor (Ubuntu):
status: New → Incomplete
importance: Undecided → High
Changed in ubuntu-power-systems:
status: New → Incomplete
Manoj Iyer (manjo)
Changed in apparmor (Ubuntu):
assignee: Ubuntu on IBM Power Systems Bug Triage (ubuntu-power-triage) → John Johansen (jjohansen)
Changed in ubuntu-power-systems:
assignee: John Johansen (jjohansen) → Canonical Kernel Team (canonical-kernel-team)
bugproxy (bugproxy) wrote : Comment bridged from LTC Bugzilla

------- Comment From <email address hidden> 2017-11-24 06:23 EDT-------
(In reply to comment #43)
> Please restest with 16.04.3 and also consider the setup proposed in the
> comment #7

Tried giving permission to /etc/gss/mech.d and the issue is still observed:
# ls -l /etc/gss/
total 4
drwxrwxrwx 2 root root 4096 Jul 19 17:07 mech.d

Tried giving permission to the save file /var/tmp/save.file and the issue is still observed:
# ls -l /var/tmp/save.file
-rwxrwxrwx 1 root root 1595918108 Nov 23 10:46 /var/tmp/save.file

Tried disabling apparmor and could still see the issue:
# service apparmor stop
# service apparmor status
● apparmor.service - LSB: AppArmor initialization
Loaded: loaded (/etc/init.d/apparmor; bad; vendor preset: enabled)
Active: inactive (dead) since Fri 2017-11-24 16:50:14 IST; 4s ago
Docs: man:systemd-sysv-generator(8)
Process: 13213 ExecStop=/etc/init.d/apparmor stop (code=exited, status=0/SUCCESS)
Process: 12081 ExecStart=/etc/init.d/apparmor start (code=exited, status=0/SUCCESS)

Nov 24 16:50:14 pkvmhab006 systemd[1]: Stopping LSB: AppArmor initialization...
Nov 24 16:50:14 pkvmhab006 apparmor[13213]: * Clearing AppArmor profiles cache
Nov 24 16:50:14 pkvmhab006 apparmor[13213]: ...done.
Nov 24 16:50:14 pkvmhab006 apparmor[13213]: All profile caches have been cleared, but no profiles have been unloaded.
Nov 24 16:50:14 pkvmhab006 apparmor[13213]: Unloading profiles will leave already running processes permanently
Nov 24 16:50:14 pkvmhab006 apparmor[13213]: unconfined, which can lead to unexpected situations.
Nov 24 16:50:14 pkvmhab006 apparmor[13213]: To set a process to complain mode, use the command line tool
Nov 24 16:50:14 pkvmhab006 apparmor[13213]: 'aa-complain'. To really tear down all profiles, run the init script
Nov 24 16:50:14 pkvmhab006 apparmor[13213]: with the 'teardown' option."
Nov 24 16:50:14 pkvmhab006 systemd[1]: Stopped LSB: AppArmor initialization.

# virsh restore /var/tmp/save.file
error: Failed to restore domain from /var/tmp/save.file
error: operation failed: job: unexpectedly failed

Attached: Sosreport

bugproxy (bugproxy) wrote : sosreport

------- Comment (attachment only) From <email address hidden> 2017-11-24 06:24 EDT-------

bugproxy (bugproxy) wrote : Comment bridged from LTC Bugzilla

------- Comment From <email address hidden> 2017-11-29 01:48 EDT-------
Not sure how I missed this earlier.

There is an apparmor rule in /etc/apparmor.d/abstractions/libvirt-qemu denying access to /tmp and /var/tmp. More details here, https://bugs.launchpad.net/ubuntu/+source/libvirt/+bug/1403648

On save, libvirt writes the save file through iohelper, which is allowed because qemu does not access /var/tmp directly; on restore, however, qemu reads from the file FD directly.

So, restore with --bypass-cache should work here which uses the iohelper.

I noticed that on 16.04, 17.04 & 17.10 --bypass-cache also fails with
error: internal error: Child process (LIBVIRT_LOG_OUTPUTS=1:stderr /usr/lib/libvirt/libvirt_iohelper /var/tmp/save.file 0 0) unexpected exit status 1: /usr/lib/libvirt/libvirt_iohelper: Unable to read /var/tmp/save.file: Invalid argument

I see the above is fixed upstream with the commits
776b9ac594b6a1e4afc924826c6e9cb5474e8e27
f830e371ef298e7fa949165d10dcf0cf3518abd5
3b8a0f6ac23e4d4620218870030a08f75419b5d7
05021e727d80527c4b53debed98b87b565780a16
633b699bfda06d9fcdb7f9466e2d2c9b4bc3e63c

Thanks,
Shivaprasad

bugproxy (bugproxy) wrote :

------- Comment From <email address hidden> 2017-12-05 07:21 EDT-------
(In reply to comment #48)
> Not sure how I missed this earlier.
>
> There is an apparmor rule in /etc/apparmor.d/abstractions/libvirt-qemu
> denying access to /tmp and /var/tmp. More details here,
> https://bugs.launchpad.net/ubuntu/+source/libvirt/+bug/1403648
>
> With Save, libvirt writes to the save file using iohelper which is allowed
> as qemu is not directly accessing the /var/tmp but restore is from the file
> FD directly.
>
> So, restore with --bypass-cache should work here which uses the iohelper.
>
> I noticed on 16.04, 17.04 & 17.10 the -bypass-cache also fails with
> error: internal error: Child process (LIBVIRT_LOG_OUTPUTS=1:stderr
> /usr/lib/libvirt/libvirt_iohelper /var/tmp/save.file 0 0) unexpected exit
> status 1: /usr/lib/libvirt/libvirt_iohelper: Unable to read
> /var/tmp/save.file: Invalid argument
>
> I see the above is fixed upstream with the commits
> 776b9ac594b6a1e4afc924826c6e9cb5474e8e27
> f830e371ef298e7fa949165d10dcf0cf3518abd5
> 3b8a0f6ac23e4d4620218870030a08f75419b5d7
> 05021e727d80527c4b53debed98b87b565780a16
> 633b699bfda06d9fcdb7f9466e2d2c9b4bc3e63c
>
> Thanks,
> Shivaprasad

Canonical, Could you please cherry pick the above commits ...

Dimitri John Ledkov (xnox) wrote : Re: [Ubuntu 16.04.2] [libvirt] virsh restore fails from state file saved in /var/tmp folder using virsh save

Above commits are in libvirt v3.10.

Changed in ubuntu-power-systems:
status: Incomplete → Confirmed
assignee: Canonical Kernel Team (canonical-kernel-team) → Ubuntu Server Team (ubuntu-server)
Changed in apparmor (Ubuntu):
status: Incomplete → Invalid
Changed in libvirt (Ubuntu):
status: New → Triaged
importance: Undecided → Low
Changed in apparmor (Ubuntu):
assignee: John Johansen (jjohansen) → nobody
tags: added: libvirt-18.04
Manoj Iyer (manjo)
Changed in libvirt (Ubuntu):
assignee: nobody → David Britton (davidpbritton)
importance: Low → High
Changed in ubuntu-power-systems:
assignee: Ubuntu Server Team (ubuntu-server) → David Britton (davidpbritton)
status: Confirmed → Triaged
Changed in libvirt (Ubuntu):
assignee: David Britton (davidpbritton) → ChristianEhrhardt (paelzer)
Manoj Iyer (manjo) wrote :

ChristianEhrhardt, any plans to backport these fixes to 16.04? Or is this going to be an 18.04 feature?

Christian Ehrhardt  (paelzer) wrote :

Hi Manoj,
my intention was (unless discussed and outlined to be important to do it otherwise):

18.04 - integrate the upstream changes so that --bypass-cache works

16.04 users using cloud-archive queens will have the same feature

16.04-17.10 and anyone not using bypass-cache will stay as-is; I had not planned to backport the code so far.
    I haven't checked yet, but given the time delta the backport could cause more
    issues than it solves.
    For the older releases I'd consider it a configuration issue.
    If an admin chooses to use an uncommon path (like /var/tmp) then they have to
    allow access to it in the apparmor profiles.

To be clear, I use virsh save/restore on a regular basis and it works in the paths that are open by default, which are the places the images usually live, like /var/lib/libvirt/images/.

If that does not work that might be a modified apparmor rule, but for that I'd need to know way more about the case and see if it is actually a bug or really just using an uncommon dir.

If you want to look into potential config issues, remove the silent denies for /tmp and /var/tmp at the end of "/etc/apparmor.d/abstractions/libvirt-qemu".
Then run your case again, report back with
a) commands to trigger the issue
b) dmesg while that occurred
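
A minimal Python sketch of the "remove the silent denies" step, for debugging only. It works on the profile text in memory and does not touch /etc; the function name is hypothetical, and the rule spelling is taken from the `deny /tmp/{,**} r,` and `deny /var/tmp/{,**} r,` lines quoted in this bug, so other releases may word them differently.

```python
import re

# Matches the two silencing rules quoted in this bug, keeping their indentation.
_SILENT_DENY = re.compile(r'^(\s*)(deny\s+/(?:var/)?tmp/\{,\*\*\}\s+r,)\s*$')


def comment_out_silent_denies(profile_text):
    """Return profile_text with the /tmp and /var/tmp deny rules commented out."""
    out = []
    for line in profile_text.splitlines():
        match = _SILENT_DENY.match(line)
        if match:
            # Keep the rule visible as a comment so it is easy to restore.
            out.append("{}# {}".format(match.group(1), match.group(2)))
        else:
            out.append(line)
    return "\n".join(out) + "\n"


if __name__ == "__main__":
    example = (
        "  # silence spurious denials (see lp#1403648)\n"
        "  deny /tmp/{,**} r,\n"
        "  deny /var/tmp/{,**} r,\n"
    )
    print(comment_out_silent_denies(example), end="")
```

With the rules commented out, the denials should show up in dmesg instead of being silenced, which is what makes step b) useful. This is meant for diagnosis, not as a recommended permanent configuration.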

None of the latter is needed to get it fixed in 18.04 where I'll make bypass-cache work as suggested by IBM by picking the code changes.

Manoj Iyer (manjo)
summary: - [Ubuntu 16.04.2] [libvirt] virsh restore fails from state file saved in
+ [Ubuntu 18.04] [libvirt] virsh restore fails from state file saved in
/var/tmp folder using virsh save
Launchpad Janitor (janitor) wrote :

This bug was fixed in the package libvirt - 4.0.0-1ubuntu1

---------------
libvirt (4.0.0-1ubuntu1) bionic; urgency=medium

  * Merged with Debian unstable (4.0)
    This closes several bugs:
    - Error generating apparmor profile when hostname contains spaces
      (LP: #799997)
    - qemu 2.10 locks files, libvirt shared now sets share-rw=on (LP: #1716028)
    - libvirt usb passthrough throws apparmor denials related to
      /run/udev/data/+usb (LP: #1727311)
    - AppArmor denies access to /sys/block/*/queue/max_segments (LP: #1729626)
    - iohelper improvements to let bypass-cache work without opening up the
      apparmor isolation (LP: #1719579)
    - nodeinfo on s390x to contain more CPU info (LP: #1733688)
    - Upgrade libvirt >= 4.0 (LP: #1745934)
  * Remaining changes:
    - Disable libssh2 support (universe dependency)
    - Disable firewalld support (universe dependency)
    - Disable selinux
    - Set qemu-group to kvm (for compat with older ubuntu)
    - Additional apport package-hook
    - Modifications to adapt for our delayed switch away from libvirt-bin (can
      be dropped >18.04).
      + d/p/ubuntu/libvirtd-service-add-bin-alias.patch: systemd: define alias
        to old service name so that old references work
      + d/p/ubuntu/libvirtd-init-add-bin-alias.patch: sysv init: define alias
        to old service name so that old references work
      + d/control: transitional package with the old name and maintainer
        scripts to handle the transition
    - Backwards compatible handling of group rename (can be dropped >18.04).
    - config details and autostart of default bridged network. Creating that is
      now the default in general, yet our solution provides the following on
      top as of today:
      + autostart the default network by default
      + do not autostart if subnet is already taken (e.g. in guests).
    - d/p/ubuntu/Allow-libvirt-group-to-access-the-socket.patch: This is
      the group based access to libvirt functions as it was used in Ubuntu
      for quite long.
      + d/p/ubuntu/daemon-augeas-fix-expected.patch fix some related tests
        due to the group access change.
    - ubuntu/parallel-shutdown.patch: set parallel shutdown by default.
    - d/p/ubuntu/enable-kvm-spice.patch: compat with older Ubuntu qemu/kvm
      which provided a separate kvm-spice.
    - d/p/ubuntu/ubuntu-libxl-qemu-path.patch: this change was split. The
      section that adapts the path of the emulator to the Debian/Ubuntu
      packaging is kept.
    - d/p/ubuntu/ubuntu-libxl-Fix-up-VRAM-to-minimum-requirements.patch: auto
      set VRAM to minimum requirements
    - d/p/ubuntu/xen-default-uri.patch: set default URI on xen hosts
    - Add libxl log directory
    - libvirt-uri.sh: Automatically switch default libvirt URI for users on
      Xen dom0 via user profile (was missing on changelogs before)
    - d/p/ubuntu/apibuild-skip-libvirt-common.h: drop libvirt-common.h from
      included_files to avoid build failures due to duplicate definitions.
    - Update README.Debian with Ubuntu changes
    - Convert libvirt0, libnss_libvirt and libvirt-dev to multi-arch.
    - Enable some additional features on ppc...

Changed in libvirt (Ubuntu):
status: Triaged → Fix Released
Manoj Iyer (manjo)
Changed in ubuntu-power-systems:
status: Triaged → Fix Released
bugproxy (bugproxy) wrote : Comment bridged from LTC Bugzilla

------- Comment From <email address hidden> 2018-02-19 03:35 EDT-------
> Get me right I use virsh save/restore on a regular base and it works in the paths that are open by default,
> which are the places the images usually are from like /var/lib/libvirt/images/.

> If that does not work that might be a modified apparmor rule, but for that I'd need to know way more
> about the case and see if it is actually a bug or really just using an uncommon dir.

Even with an uncommon dir, the denial should be consistent: if the path used by the user is not permitted, then apparmor should block/deny when virsh save is performed, not only during virsh restore.

Observation in Ubuntu 16.04.3,

# virsh save virt-tests-vm1 /var/tmp/virt-tests-vm1.save

Domain virt-tests-vm1 saved to /var/tmp/virt-tests-vm1.save

By default virsh restore fails with same error,
# virsh restore /var/tmp/virt-tests-vm1.save
error: Failed to restore domain from /var/tmp/virt-tests-vm1.save
error: operation failed: job: unexpectedly failed

But as suggested by paelzer,
> If you want to look into potential config issues, remove the silent denies to /var and /var temp
> at the end of "/etc/apparmor.d/abstractions/libvirt-qemu".
> Then run your case again, report back with

commenting denials,

# silence spurious denials (see lp#1403648)
deny /tmp/{,**} r,
# deny /var/tmp/{,**} r,

restart libvirtd

# virsh restore /var/tmp/virt-tests-vm1.save
error: Failed to restore domain from /var/tmp/virt-tests-vm1.save
error: internal error: Process exited prior to exec: libvirt: error : unable to set AppArmor profile 'libvirt-81b387d9-1dfc-4f55-8b98-0318f1f94442' for '/usr/bin/kvm': No such file or directory

But file exists,
# file /var/tmp/virt-tests-vm1.save
/var/tmp/virt-tests-vm1.save: Libvirt QEMU Suspend Image, version 2, XML length 1970, running

dmesg:
[Mon Feb 19 03:19:16 2018] virbr0: port 2(vnet0) entered blocking state
[Mon Feb 19 03:19:16 2018] virbr0: port 2(vnet0) entered disabled state
[Mon Feb 19 03:19:16 2018] device vnet0 entered promiscuous mode
[Mon Feb 19 03:19:16 2018] virbr0: port 2(vnet0) entered blocking state
[Mon Feb 19 03:19:16 2018] virbr0: port 2(vnet0) entered listening state
[Mon Feb 19 03:19:16 2018] audit: type=1400 audit(1519028363.683:12417): apparmor="DENIED" operation="change_profile" info="label not found" error=-2 profile="/usr/sbin/libvirtd" name="libvirt-81b387d9-1dfc-4f55-8b98-0318f1f94442" pid=12949 comm="libvirtd"
[Mon Feb 19 03:19:16 2018] virbr0: port 2(vnet0) entered disabled state
[Mon Feb 19 03:19:16 2018] device vnet0 left promiscuous mode
[Mon Feb 19 03:19:16 2018] virbr0: port 2(vnet0) entered disabled state

Attaching full dmesg with this bugzilla

Environment:

Kernel
# uname -a
Linux ltc-test-ci1 4.13.0-35-generic #39~16.04.1-Ubuntu SMP Mon Feb 12 15:01:58 UTC 2018 ppc64le ppc64le ppc64le GNU/Linux

Libvirt
# dpkg -l | grep libvirt
ii libvirt-bin 1.3.1-1ubuntu10.18 ppc64el programs for the libvirt library
ii libvirt-dev:ppc64el 1.3.1-1ubuntu10.18 ppc64el development files for the libvirt library
ii libvirt0:ppc64el 1.3....


bugproxy (bugproxy) wrote : Ubuntu160404_dmesg

------- Comment (attachment only) From <email address hidden> 2018-02-19 03:38 EDT-------

Christian Ehrhardt  (paelzer) wrote :

Thanks for the full dmesg.
It seems to me that:
"unable to set AppArmor profile 'libvirt-81b387d9-1dfc-4f55-8b98-0318f1f94442'"
means there is an issue in loading the profile after your change.

That matches:
 audit: type=1400 audit(1519028363.683:12417): apparmor="DENIED" operation="change_profile" info="label not found" error=-2 profile="/usr/sbin/libvirtd" name="libvirt-81b387d9-1dfc-4f55-8b98-0318f1f94442" pid=12949 comm="libvirtd"

It is not getting to the actual restore; it is failing when spawning the guest due to the changes in the apparmor profile.

I tried to check what you hit:
$ virsh save bionic-test --file /var/tmp/bionic-test.save --verbose
Guest is shut-off and I have
-rw------- 1 root root 527808329 Feb 19 12:34 /var/tmp/bionic-test.save
The restore hits the (silent) denial we discussed.
   #deny /tmp/{,**} r,
   #deny /var/tmp/{,**} r,
Changed the two lines above to a comment.
Then restored again, just worked:
$ virsh restore /var/tmp/bionic-test.save
Domain restored from /var/tmp/bionic-test.save

To quote jdstrand from bug 1403648:
"We should not allow access to /tmp and /var/tmp as that breaks application isolation."

That said we are in the following situation:
1. /tmp and /var/tmp are not allowed to be read (apparmor default for app isolation)
2. read denies there are silenced via explicit denies in /etc/apparmor.d/abstractions/libvirt-qemu
3. I see your point:
3.1 on save libvirt writes to that place (libvirt is allowed to do so, while qemu is not)
3.2 on restore qemu wants to read it and is denied.
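
To make the situation above concrete, here is a small Python sketch that checks whether a save-file path falls under the two directories whose reads are silently denied for qemu. The helper name is hypothetical, the directory list comes only from the two deny rules discussed in this bug, and the check is purely lexical (it does not resolve symlinks or ".." components), so it is an approximation of what the profile actually matches.

```python
from pathlib import PurePosixPath

# Directories covered by the silencing deny rules discussed in this bug.
DENIED_ROOTS = (PurePosixPath("/tmp"), PurePosixPath("/var/tmp"))


def qemu_read_silently_denied(path):
    """Return True if path is /tmp, /var/tmp, or anything below them."""
    candidate = PurePosixPath(path)
    return any(
        candidate == root or root in candidate.parents
        for root in DENIED_ROOTS
    )


if __name__ == "__main__":
    for example in ("/var/tmp/bionic-test.save",
                    "/var/lib/libvirt/images/save.file"):
        print(example, qemu_read_silently_denied(example))
```

A check like this only describes case 3.2 (qemu reading on restore); the save in case 3.1 still succeeds because it is done by libvirt, not qemu.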

And you wonder about the asymmetric behavior of 3.1 and 3.2.
I agree that it is somewhat unexpected, but wonder what would be better
1. We could also deny /tmp and /var/tmp for the libvirt daemon (which intentionally has a rather lenient apparmor profile). Then people would already be denied on the save, maybe for a new release - but not as an SRU, so as not to break people relying on that access working.
2. And on the new release we already have the --bypass-cache fixes you referred to, which get the restore working there as a workaround - so the benefit of preventing libvirt from accessing that location isn't too big either. So forbidding the access on "save" for libvirt there would make that useless.

I'm unsure how to continue. To better brain-storm with you on how to proceed do you have a clear preferred solution (other than the already included bypass-cache fixes) or is it just "not nice in general" that the denial should be consistent for save/restore?

Separate to the discussion above:
To find how your modified apparmor profile breaks your guest start you could share it - as I mentioned it worked for me right away (no need to restart libvirt after changing btw, the one we change is loaded on guest load).

bugproxy (bugproxy) wrote : Comment bridged from LTC Bugzilla

------- Comment From <email address hidden> 2018-02-27 08:04 EDT-------
With Ubuntu 18.04

1. without --bypass-cache qemu errors out,

# virsh save virt-tests-vm1 /var/tmp/save.file

Domain virt-tests-vm1 saved to /var/tmp/save.file

# virsh restore /var/tmp/save.file
error: Failed to restore domain from /var/tmp/save.file
error: internal error: qemu unexpectedly closed the monitor: 2018-02-27T12:54:55.160391Z qemu-system-ppc64: Not a migration stream
2018-02-27T12:54:55.160488Z qemu-system-ppc64: load of migration failed: Invalid argument

Is this error appropriate/expected? Earlier, libvirt handled the error for the apparmor permission denial, but in bionic qemu errors out. I still see the inconsistency, and it would be better to document this well so that users/customers understand how it works and how to use it.

2. with --bypass-cache it works with /var/tmp

# virsh save virt-tests-vm1 /var/tmp/save.file --bypass-cache

Domain virt-tests-vm1 saved to /var/tmp/save.file

# virsh restore /var/tmp/save.file --bypass-cache
Domain restored from /var/tmp/save.file

3. in /var/lib/libvirt/images it works fine without issues,

# virsh save virt-tests-vm1 /var/lib/libvirt/images/save.file

Domain virt-tests-vm1 saved to /var/lib/libvirt/images/save.file

# virsh restore /var/lib/libvirt/images/save.file
Domain restored from /var/lib/libvirt/images/save.file
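
The three cases above can be scripted. Below is a minimal Python sketch that only builds the virsh command lines used in this comment, with --bypass-cache as an option; the helper names are hypothetical, and nothing is executed unless the caller passes the list to something like subprocess.run on a host where virsh is installed.

```python
import shlex


def virsh_save_argv(domain, path, bypass_cache=False):
    """Build the argument list for 'virsh save <domain> <path>'."""
    argv = ["virsh", "save", domain, path]
    if bypass_cache:
        argv.append("--bypass-cache")
    return argv


def virsh_restore_argv(path, bypass_cache=False):
    """Build the argument list for 'virsh restore <path>'."""
    argv = ["virsh", "restore", path]
    if bypass_cache:
        argv.append("--bypass-cache")
    return argv


def as_shell(argv):
    """Render an argument list as a copy-pasteable shell command."""
    return " ".join(shlex.quote(arg) for arg in argv)


if __name__ == "__main__":
    # Case 2 from this comment: /var/tmp with --bypass-cache on both steps.
    print(as_shell(virsh_save_argv("virt-tests-vm1", "/var/tmp/save.file", True)))
    print(as_shell(virsh_restore_argv("/var/tmp/save.file", True)))
```

Keeping the flag as an explicit parameter makes it easy to rerun the same domain and path with and without --bypass-cache and compare the results.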

System Environment:

Libvirt
# dpkg -l | grep libvirt
ii gir1.2-libvirt-glib-1.0:ppc64el 1.0.0-1 ppc64el GObject introspection files for the libvirt-glib library
ii gir1.2-libvirt-sandbox-1.0 0.5.1+git20160404-1 ppc64el GObject introspection files for the libvirt-sandbox library
ii libvirt-bin 4.0.0-1ubuntu4 ppc64el programs for the libvirt library
ii libvirt-clients 4.0.0-1ubuntu4 ppc64el Programs for the libvirt library
ii libvirt-clients-dbgsym 4.0.0-1ubuntu4 ppc64el debug symbols for libvirt-clients
ii libvirt-daemon 4.0.0-1ubuntu4 ppc64el Virtualization daemon
ii libvirt-daemon-dbgsym 4.0.0-1ubuntu4 ppc64el debug symbols for libvirt-daemon
ii libvirt-daemon-driver-storage-gluster 4.0.0-1ubuntu4 ppc64el Virtualization daemon glusterfs storage driver
ii libvirt-daemon-driver-storage-gluster-dbgsym 4.0.0-1ubuntu4 ppc64el debug symbols for libvirt-daemon-driver-storage-gluster
ii libvirt-daemon-driver-storage-rbd 4.0.0-1ubuntu4 ppc64el Virtualization daemon RBD storage driver
ii libvirt-daemon-driver-storage-rbd-dbgsym 4.0.0-1ubuntu4 ppc64el debug symbols for libvirt-daemon-driver-storage-rbd
ii libvirt-daemon-driver-storage-sheepdog 4.0.0-1ubuntu4 ppc64el Virtualization daemon Sheedog storage driver
ii libvirt-daemon-driver-storage-sheepdog-dbgsym 4.0.0-1ub...


bugproxy (bugproxy) wrote :

------- Comment From <email address hidden> 2018-02-27 08:24 EDT-------
(In reply to comment #61)
> Thanks for the full dmesg.
> It seems to me that:
> "unable to set AppArmor profile
> 'libvirt-81b387d9-1dfc-4f55-8b98-0318f1f94442'"
> means there is an issue in loading the profile after your change.
>
> That matches:
> audit: type=1400 audit(1519028363.683:12417): apparmor="DENIED"
> operation="change_profile" info="label not found" error=-2
> profile="/usr/sbin/libvirtd"
> name="libvirt-81b387d9-1dfc-4f55-8b98-0318f1f94442" pid=12949 comm="libvirtd"
>
> It is not getting to the actual restore, it is failing when spawning the
> guest to to the changes in the apparmor profile.
>
> I tried to check what you hit:
> $ virsh save bionic-test --file /var/tmp/bionic-test.save --verbose
> Guest is shut-off and I have
> -rw------- 1 root root 527808329 Feb 19 12:34 /var/tmp/bionic-test.save
> The restore hits the (silent) denial we discussed.
> #deny /tmp/{,**} r,
> #deny /var/tmp/{,**} r,
> Changed the two lines above to a comment.
> Then restored again, just worked:
> $ virsh restore /var/tmp/bionic-test.save
> Domain restored from /var/tmp/bionic-test.save
>
> To quote jdstrand from bug 1403648:
> "We should not allow access to /tmp and /var/tmp as that breaks application
> isolation."
>
> That said we are in the following situation:
> 1. /tmp and /var/tmp are not allowed to be read (apparmor default for app
> isolation)
> 2. read denies there are silenced via explicit denies in
> /etc/apparmor.d/abstractions/libvirt-qemu
> 3. I see your point:
> 3.1 on save libvirt writes to that place (libvirt is allowed to do so, while
> qemu is not)
> 3.2 on restore qemu wants to read it and is denied.
>
> And you wonder about the asymetric behavior of 3.1 and 3.2.
> I agree that it is somewhat unexpected, but wonder what would be better
> 1. We could also deny /var /tmp for the lbivirt daemon (which intentionally
> has a rather lenient apparmor profile). Then already on the save people
> would be denied, maybe for a new release - but not as an SRU to not break
> people relying on that access working.

Okay, Agreed.

> 2. And on the new release we already have the --bypass-cache fixes you
> referred to, which get the restore working there as a workaround - so the
> benefit of preventing libvirt from accessing that place isn't too big
> either. So forbidding the access on "save" for libvirt there would make
> that useless.

Anyway, my point here is that when restore is denied, save in turn becomes useless.
Let's document in the man page that --bypass-cache would help.

>
> I'm unsure how to continue. To better brain-storm with you on how to proceed
> do you have a clear preferred solution (other than the already included
> bypass-cache fixes) or is it just "not nice in general" that the denial
> should be consistent for save/restore?

If libvirt could error out or warn, when saving to a denied path, that the
restore would later be blocked by apparmor, that would be the proper way.
>
> Separate to the discussion above:
> To find how your modified apparmor profile breaks your guest start you could
> share it - as I mentioned it worked for me right away (no need ...


bugproxy (bugproxy) wrote :

------- Comment From <email address hidden> 2018-04-09 09:03 EDT-------
(In reply to comment #64)
> (In reply to comment #61)
> > Thanks for the full dmesg.
> > It seems to me that:
> > "unable to set AppArmor profile
> > 'libvirt-81b387d9-1dfc-4f55-8b98-0318f1f94442'"
> > means there is an issue in loading the profile after your change.
> >
> > That matches:
> > audit: type=1400 audit(1519028363.683:12417): apparmor="DENIED"
> > operation="change_profile" info="label not found" error=-2
> > profile="/usr/sbin/libvirtd"
> > name="libvirt-81b387d9-1dfc-4f55-8b98-0318f1f94442" pid=12949 comm="libvirtd"
> >
> > It is not getting to the actual restore; it is failing when spawning the
> > guest due to the changes in the apparmor profile.
> >
> > I tried to check what you hit:
> > $ virsh save bionic-test --file /var/tmp/bionic-test.save --verbose
> > Guest is shut-off and I have
> > -rw------- 1 root root 527808329 Feb 19 12:34 /var/tmp/bionic-test.save
> > The restore hits the (silent) denial we discussed.
> > #deny /tmp/{,**} r,
> > #deny /var/tmp/{,**} r,
> > Changed the two lines above to a comment.
> > Then restored again, just worked:
> > $ virsh restore /var/tmp/bionic-test.save
> > Domain restored from /var/tmp/bionic-test.save
> >
> > To quote jdstrand from bug 1403648:
> > "We should not allow access to /tmp and /var/tmp as that breaks application
> > isolation."
> >
> > That said we are in the following situation:
> > 1. /tmp and /var/tmp are not allowed to be read (apparmor default for app
> > isolation)
> > 2. read denies there are silenced via explicit denies in
> > /etc/apparmor.d/abstractions/libvirt-qemu
> > 3. I see your point:
> > 3.1 on save libvirt writes to that place (libvirt is allowed to do so, while
> > qemu is not)
> > 3.2 on restore qemu wants to read it and is denied.
> >
> > And you wonder about the asymmetric behavior of 3.1 and 3.2.
> > I agree that it is somewhat unexpected, but wonder what would be better:
> > 1. We could also deny /var/tmp for the libvirt daemon (which intentionally
> > has a rather lenient apparmor profile). Then people would already be denied
> > on the save, maybe for a new release - but not as an SRU, to not break
> > people relying on that access working.
>
> Okay, Agreed.
>
> > 2. And on the new release we already have the --bypass-cache fixes you
> > referred to, which get the restore working there as a workaround - so the
> > benefit of preventing libvirt from accessing that place isn't too big
> > either. So forbidding the access on "save" for libvirt there would make
> > that useless.
>
> Anyway, my point here is that when restore is denied, save in turn becomes
> useless.
> Let's document in the man page that --bypass-cache would help.
>
> >
> > I'm unsure how to continue. To better brain-storm with you on how to proceed
> > do you have a clear preferred solution (other than the already included
> > bypass-cache fixes) or is it just "not nice in general" that the denial
> > should be consistent for save/restore?
>
> If libvirt could error out or warn, when saving to a denied path, that the
> restore would later be blocked by apparmor, that would be the proper way.
> >
> > Separate to the dis...


Changed in libvirt (Ubuntu):
assignee: ChristianEhrhardt (paelzer) → nobody
Andrew Cloke (andrew-cloke) wrote :

From comment #22: "Distro team, Is there any update based on the previous update we posted. ?"

I'm afraid this bug report is now a little complex and involved, and it is now marked as "Fix Released" in Launchpad.

Could you please be specific about the comment or question that you require more information about? Alternatively, it may be best to raise a new bug report.

Christian Ehrhardt  (paelzer) wrote :

TBH - I haven't taken the former comment as a call for further action.
It was more of a summary of how docs and output could be better.

Let me answer:

1. document that --bypass-cache would help

Yeah it might be nice, but then it is just such a general thing.
It only affects apparmor users (not all libvirt users).
It only affects /tmp wi
I wonder what such a hint might look like.
Checking the doc, there is a Note on disk corruption for virsh restore - maybe it could go there as another Note entry.
But I'm still not all in for this.

2. on older releases "error out or warn in Libvirt when performing save in denial paths"

It is not really possible to predetect and differentiate if such a denial was the reason.
Looking into the future I think we might use per-guest overrides.

I was thinking about that more: the fact that all paths other than /tmp (because of the explicit deny) just work, like:
 $ virsh save xenial-testshutdown-0 /var/anythingbuttmp.state
 $ virsh restore /var/anythingbuttmp.state
That annoys me a lot.

I'd suggest otherwise: we keep the past as it is, without modifying man pages or anything like it (after all, it is not a regression that I can SRU, and choosing /tmp is a very special case).
But I want to make it better going forward.
I thought about it again and again, and revisited the old bug that added those deny rules.
I think it is time to take them out in the next release.

That would mean it would generally work, and even if there is a deny it would at least be in the log.
See also:
- https://bugs.launchpad.net/ubuntu/+source/libvirt/+bug/1403648/comments/6
- https://bugs.launchpad.net/ubuntu/+source/libvirt/+bug/1403648/comments/12
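
Once denials are logged rather than silenced, they can be picked out of the kernel log. A minimal sketch, assuming the log lines carry the apparmor="DENIED" marker seen in the audit message quoted earlier in this report; the function name apparmor_denials and its optional filter argument are hypothetical and only for illustration:

```shell
# Hypothetical helper: read kernel log text on stdin and print only the
# AppArmor denial lines, optionally narrowed by a fixed string (e.g. a path).
apparmor_denials() {
    filter="${1:-}"
    # -F treats both patterns as literal strings; an empty filter keeps
    # every denial line.
    grep -F 'apparmor="DENIED"' | grep -F -- "$filter"
}
```

It could be used as "dmesg | apparmor_denials /var/tmp" to look only at denials that mention that directory.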

I think the old assumptions don't hold true.
So for the current and stable releases we keep it as is, to not regress anyone (with too many logs).
But going forward I'd drop the deny rules, and then all of this (and similar things where users WANT e.g. images in /tmp) would work.

Part of it would be to check that (the way more modern and recent) openstack no longer has those issues and, if it does, to look for something better as part of the fix, e.g. adapt how openstack sets the ceph config so that it no longer triggers /tmp and /var/tmp access.

In the meantime there are also rules like "owner /tmp/pulse-*/ rw," which get trumped by the deny.
TL;DR - taking out the deny, so that the save/restore case of this bug is no longer a special case, would be much better IMHO.

If you are ok with that I'd create a new bug to:
1. take out the deny rules to /tmp early in Ubuntu 18.10
2. do an analysis with recent openstack+ceph if they still trigger access there

So are you ok with that approach?

P.S. If you really really (...really*) want/need a man page entry for this special case we could work something out, but I think that would not qualify as an SRU [1] so thinking forward is much better anyway.

[1]: https://wiki.ubuntu.com/StableReleaseUpdates
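
For anyone checking whether a given host still carries the deny rules discussed here, a rough sketch follows. It assumes the rules live in /etc/apparmor.d/abstractions/libvirt-qemu as stated earlier in this report and are written like "deny /tmp/{,**} r,"; the function name active_tmp_denies is hypothetical:

```shell
# Hypothetical helper: list active (uncommented) deny rules that target
# /tmp or /var/tmp in an AppArmor abstraction file.
active_tmp_denies() {
    # Default to the abstraction file named in this report.
    file="${1:-/etc/apparmor.d/abstractions/libvirt-qemu}"
    # Match lines that begin, after optional whitespace, with "deny" followed
    # by a /tmp/ or /var/tmp/ path. Commented-out rules start with '#' and
    # therefore do not match.
    grep -E '^[[:space:]]*deny[[:space:]]+/(var/)?tmp/' "$file"
}
```

If this prints nothing, no such deny rule is active in that file; a restore from /var/tmp may of course still be refused for other reasons.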

Anish Gupta (anish70) wrote :

Christian,

We are seeing the following errors on our end as well, FYI.

Context: create/delete of our XXL guest VM.

Sep 1 08:54:54 my-system libvirtd[4126]: 2018-09-01 15:54:54.426+0000: 4126: error : virNetSocketReadWire:1811 : End of file while reading data: Input/output error
Sep 1 08:54:55 my-system libvirtd[4126]: 2018-09-01 15:54:55.009+0000: 4126: error : virNetSocketReadWire:1811 : End of file while reading data: Input/output error
Sep 1 08:54:55 my-system libvirtd[4126]: 2018-09-01 15:54:55.177+0000: 4126: error : virNetSocketReadWire:1811 : End of file while reading data: Input/output error
Sep 1 08:54:55 my-system libvirtd[4126]: 2018-09-01 15:54:55.325+0000: 4126: error : virNetSocketReadWire:1811 : End of file while reading data: Input/output error
Sep 1 08:54:56 my-system libvirtd[4126]: 2018-09-01 15:54:56.936+0000: 4126: error : qemuMonitorIORead:610 : Unable to read from monitor: Connection reset by peer
Sep 1 08:55:09 my-system libvirtd[4126]: 2018-09-01 15:55:09.503+0000: 4202: error : virProcessKillPainfully:401 : Failed to terminate process 48500 with SIGKILL: Device or resource busy
Sep 1 08:55:16 my-system libvirtd[4126]: 2018-09-01 15:55:16.959+0000: 4201: error : virSecurityDACSetOwnershipInternal:619 : unable to set user and group to '0:0' on '/var/lib/libvirt/images/dgx2vm-labSat0847-16g0-15_dgx-kvm-image-4.0.0~180823-22df0e.1.qcow2': No such file or directory
Sep 1 08:55:16 my-system libvirtd[4126]: 2018-09-01 15:55:16.959+0000: 4201: error : virSecurityDACSetOwnershipInternal:619 : unable to set user and group to '0:0' on '/raid/dgx-kvm/vol-dgx2vm-labSat0847-16g0-15': No such file or directory
Sep 1 09:22:11 my-system libvirtd[4126]: 2018-09-01 16:22:11.098+0000: 4203: error : virProcessKillPainfully:401 : Failed to terminate process 64980 with SIGKILL: Device or resource busy
Sep 1 09:22:11 my-system libvirtd[4126]: 2018-09-01 16:22:11.099+0000: 4126: error : qemuMonitorIO:719 : internal error: End of file from qemu monitor
Sep 1 09:22:26 my-system libvirtd[4126]: 2018-09-01 16:22:26.110+0000: 51440: error : virProcessKillPainfully:401 : Failed to terminate process 64980 with SIGKILL: Device or resource busy

thanks,
Anish

Christian Ehrhardt  (paelzer) wrote :

Hi Anish, IMHO none of the messages are related to this bug.
I'll reply by mail with some info.

bugproxy (bugproxy) wrote :

------- Comment From <email address hidden> 2019-01-23 23:48 EDT-------
(In reply to comment #68)
> TBH - I haven't taken the former comment as a call for further action.
> It was more of a summary of how docs and output could be better.
>
> Let me answer:
>
> 1. document that --bypass-cache would help
>
> Yeah it might be nice, but then it is just such a general thing.
> It only affects apparmor users (not all libvirt users).
> It only affects /tmp wi
> I wonder what such a hint might look like.
> Checking the doc, there is a Note on disk corruption for virsh restore -
> maybe it could go there as another Note entry.
> But I'm still not all in for this.

Okay, fine.

>
> 2. on older releases "error out or warn in Libvirt when performing save in
> denial paths"
>
> It is not really possible to predetect and differentiate if such a denial
> was the reason.
> Looking into the future I think we might use per-guest overrides.
>
> I was thinking about that more: the fact that all paths other than /tmp
> (because of the explicit deny) just work, like:
> $ virsh save xenial-testshutdown-0 /var/anythingbuttmp.state
> $ virsh restore /var/anythingbuttmp.state
> That annoys me a lot.
>
> I'd suggest otherwise: we keep the past as it is, without modifying man
> pages or anything like it (after all, it is not a regression that I can SRU,
> and choosing /tmp is a very special case).
> But I want to make it better going forward.
> I thought about it again and again, and revisited the old bug that added
> those deny rules.
> I think it is time to take them out in the next release.

Agree with you on this.

>
> That would mean it would generally work, and even if there is a deny it
> would at least be in the log.
> See also:
> - https://bugs.launchpad.net/ubuntu/+source/libvirt/+bug/1403648/comments/6
> - https://bugs.launchpad.net/ubuntu/+source/libvirt/+bug/1403648/comments/12
>
> I think the old assumptions don't hold true.
> So for the current and stable releases we keep it as is, to not regress
> anyone (with too many logs).
> But going forward I'd drop the deny rules, and then all of this (and similar
> things where users WANT e.g. images in /tmp) would work.

ACK

>
> Part of it would be to check that (the way more modern and recent)
> openstack no longer has those issues and, if it does, to look for something
> better as part of the fix, e.g. adapt how openstack sets the ceph config so
> that it no longer triggers /tmp and /var/tmp access.
>
> In the meantime there are also rules like "owner /tmp/pulse-*/ rw," which
> get trumped by the deny.
> TL;DR - taking out the deny, so that the save/restore case of this bug is no
> longer a special case, would be much better IMHO.
>
> If you are ok with that I'd create a new bug to:
> 1. take out the deny rules to /tmp early in Ubuntu 18.10

Yes, I am okay with it.

> 2. do an analysis with recent openstack+ceph if they still trigger access
> there

Well, I am not working on openstack or ceph, so I cannot comment on it.

>
> So are you ok with that approach?

+1 for 1

>
> P.S. If you really really (...really*) want/need a man page entry for this
> special case we could work something out, but I think that would not qualify
> as an SRU [1] so thinking forward is much better anyway.

...


Christian Ehrhardt  (paelzer) wrote :

Thanks for agreeing to my suggested approach to resolve all of this!

The deny rules have been taken out since Cosmic, and we have had no bad feedback about ceph (the original reason for adding them) triggering log storms.

Therefore this bug is completely done.
The status already reflects this; I'll remove David as assignee to clean up completely.

Changed in ubuntu-power-systems:
assignee: David Britton (davidpbritton) → nobody