qcow format could expose host filesystem information

Bug #853330 reported by Scott Moser
274
This bug affects 2 people
Affects Status Importance Assigned to Milestone
OpenStack Compute (nova)
Fix Released
High
Scott Moser

Bug Description

I've only just thought of this issue, and have not investigated further. There may be a security issue in enabling qcow images that could be exploited like this:

 * user creates an image with 'qemu-img create -f qcow2 -b <FILE> mytest.img' . Here, FILE could be anything thing (/etc/shadow, /var/lib/libvirt/qemu/isntance-0000030/disk.local).
 * user uploads images: cloud-publish-image x86_64 mytest.img mybucket
 * user runs instance

Things that may make this less severe:
 * i believe in ubuntu that the user libvirt-qemu user is used to run kvm, so that the file access might be limited to what that user can access.
 * in ubuntu apparmor profiles for libvirt *may* improve this, although I don't know that they were written to protect against this.
 * the instance wont' actually boot. (However, I think it may be possible using block-device-mapping to attach a bootable snapshot or volume to this instance that *would* boot, and then access the root disk that way).
 * this only affects installs where 'use_cow_images' is True. If it is false, then libvirt wills specify to kvm that the disk is 'raw' and kvm will not invoke the qemu code path which would read the backing store.

I believe this could be fixed by simply refusing to run a qcow2 formated image (or any disk image) whose original image contained a backing store reference . Ie, if 'qemu-img info <disk>' showed a 'backing file:' entry.

Related branches

CVE References

Scott Moser (smoser)
description: updated
Revision history for this message
Scott Moser (smoser) wrote :

OK. So I'm almost 100% certain that this is a valid attack at this point, but still have not tested. See if you can tell me why the following *wouldn't* work.
Below:
  * FILE is a path to something that you should not be able to read. '/dev/sda' or '/path/to/another/instances/i-0000002/disk' or possibly /etc/shadow.
  * $KERNEL is probably a normal kernel possibly a distro built one
  * $RAMDISK is a fully functional OS in a ramdisk (with an ssh server to reach it), but you probably don't even need this if you have vnc console access (to interactively 'fix' possibly failed initramfs boot).
  * FILE=/dev/sda (or /path/to/likely/other/users/instance/disk)

So, to attack:
 * qemu-img create -f qcow2 -b $FILE my-attack.img
 * cloud-publish-image --kernel-file $KERNEL --ramdisk-file $RAMDISK x86_64 $FILE my-attack-bucket
 * euca-run-instances ami-abcdefg
 * ssh to new (initramfs based rootfs), and start reading from $FILE

Revision history for this message
Jamie Strandboge (jdstrand) wrote :

This is precisely the issue that http://www.ubuntu.com/usn/usn-1008-1/ addressed in libvirt. qemu can use anything as a backing store, so if a guest used a raw image, it could modify the data structures of the disk to look like a qemu qcow2 file and point somewhere else. On the next guest boot after a shutdown (not reboot), libvirt would examine all the backing stores and update the apparmor dynamic profile. Running as the libvirt-qemu user helps here, cause DAC would prevent access (but when running as root, like in Lucid, this was more of a problem). libvirt was updated in the aforementioned security update to require that the domain xml match the disk. Eg, in order to use a qemu image with backing store, the domain xml must have "<driver name='qemu' type='qcow2'>". This makes it was that if a machine with "<driver name='qemu' type='raw'>" changes its on-disk format, it will fail to start because the disk (which is now a qcow2) does not match the domain xml. Guests that are already using qcow2 images are not able to change their on disk format (and if they are, qemu-kvm would get a CVE allocated for this). Finally, if using the AppArmor driver, there are paths that the guest is not allowed to access on the host (see valid_path() in virt-aa-helper.c), eg files in /etc, /proc and /sys. The blacklist is incomplete (it must be, since users need to be allowed to put their disks, etc almost anywhere on the system), but it does prevent a lot of access.

As for qemu-kvm itself, normal DAC is in effect. If the user invoking qemu-kvm has privilege, then that user can use qemu-kvm to have privileged access to files via qemu's backing store mechanism. Keep in mind, that user must already have the privileged access in the first place, so there is no privilege escalation or crossing of privilege boundaries.

One thing to keep in mind is that all members of the libvirtd group should be considered privileged (ie 'root', because of access to disks), as discussing in /usr/share/doc/libvirt-bin/README.Debian. Also keep in mind that libvirt and its AppArmor driver are not protecting the administrator (ie, a person in the libvirtd group) from herself or guests from the host, but instead protecting guests from each other and the host from the guests.

Keeping all of the above in mind, within the context of Ubuntu, libvirt and qemu, there is no privilege escalation or crossing of privilege boundaries.

Within the context of nova, the cloud-publish-image should either enforce raw disks only, or do rigorous input validation on the disk file and reject files with a backing store at all or enforce that backing stores be within a particular path. The qemu-img command can aid in this:
$ qemu-img info ./disk0.qcow2
image: ./disk0.qcow2
file format: qcow2
virtual size: 8.0G (8589934592 bytes)
disk size: 14M
cluster_size: 65536
backing file: <path>/disk0.pristine.qcow2 (actual path: <path>/disk0.pristine.qcow2)

Revision history for this message
Scott Moser (smoser) wrote :

"Keeping all of the above in mind, within the context of Ubuntu, libvirt and qemu, there is no privilege escalation or crossing of privilege boundaries."

I disagree. At very least the launcher of one instance can potentially access data from another user's instance. I see nothing above stopping them from getting raw read access to the hosts root disk.

Nova has a setting 'use_cow_images' which defaults to true. If that is set, then on instance creation, nova does: qemu-img create -f qcow2 -b /path/to/pristine-disk-image /path/to/instance/cow/disk. It passes appropriate xml to libvirt to say "this is a qcow disk" and thus kvm gets that as well. Essentially the user does not need to change the disk format. It is, as written by nova, 'qcow2'.

If the user uploads a qcow image that contains in it a backing store reference to '/dev/sda' or to /var/lib/nova/instances/i-000003/disk (a suspected path to another user's disk), then nova will launch a libvirt/kvm instance with a qcow disk image that has 2 levels of backing store. The first merely references the second (which is what the user uploaded) and the second references a secret file on disk.

Unless apparmor/ubuntu/kvm explicitly disallow 2 level qcow images, then this is valid.

Revision history for this message
Scott Moser (smoser) wrote :

Originally I thought it would be enough to simply:
  * check if image had a backing store, if not, convert it to raw
  * use the raw image

However, this shows that that is insufficient, as the attacker could put inside the first level qcow image a second level qcow image that. That one would survive.

So, my revised solution would be to:
 * check if image has backing store, if so, convert it to raw
 * check file format of 'raw' image, if it is not raw, raise error
 * use raw image

Revision history for this message
Jamie Strandboge (jdstrand) wrote :

So that comment was not as clear as it should have been. At first I was thinking this was a problem with ec2, and then realized it was a bug with nova but didn't update the comment accordingly. As such, I was trying to say 'in Ubuntu, there is no problem with libvirt and qemu-kvm on their own'. In other words, libvirt and qemu-kvm cannot be modified to enforce behavior around poor/non-existent input validation by tools which use it and down below ('within the context of nova'), you see I actually do agree with you regarding nova.

If nova is using 'use_cow_images' as true by default, it *must* do input validation on the qcow2 images, otherwise the attack you describe will work. An easy approach is simply to say it is not allowed to have a backing store file. If it does, reject it. A harder approach would be to say it is ok to have a backing store file, but it must be in this directory with a unique filename. You have to be careful here because you don't want to just say "this directory" otherwise the qcow2 could point to another guest's file within that directory, thus breaking privilege boundaries. nova must be the one doing this input validation (apart from it being best practice) because if it is in the libvirtd group, it is privileged and must handle allocation of VM disks and files properly (see previous comment).

Scott, should the security contact for nova be subscribed to this bug?

Revision history for this message
Scott Moser (smoser) wrote :

i believe that 'nova-core' is the security contact. I opened this as a security bug against 'nova', so i would have thought that would be the default behavior of launchpad.

Scott Moser (smoser)
Changed in nova:
assignee: nobody → Scott Moser (smoser)
status: New → In Progress
Soren Hansen (soren)
Changed in nova:
importance: Undecided → High
Revision history for this message
Marc Deslauriers (mdeslaur) wrote :

I have assigned CVE-2011-3147 to this issue in nova. Please specify this CVE number when discussing this issue.

Revision history for this message
Scott Moser (smoser) wrote :

I'm attaching a proposed patch that seems to work for me, and passes tests.

Revision history for this message
Brian Waldon (bcwaldon) wrote :

Targeted to diablo to get this on everybody's radar.

Changed in nova:
milestone: none → 2011.3
Scott Moser (smoser)
description: updated
Revision history for this message
Vish Ishaya (vishvananda) wrote :

scott, can you branch off of rdp and propose into trunk and milestone_proposed? We need to get this in.

Revision history for this message
Thierry Carrez (ttx) wrote :

This should be proposed as a bugfix branch, like "do not use backing store in qcow files, LP: #853330" -- that way the fix looks innocent enough for the 5 minutes it will be exposed to the public before it gets reviewed and applied. Having pre-approval of the patch here would help, though, in reducing the window.

Revision history for this message
Vish Ishaya (vishvananda) wrote :

in the interest of pre-approval, please don't import the exception directly and use:

from nova import exception
raise exception.ImageUncacceptable(...)

thx

Revision history for this message
Scott Moser (smoser) wrote :

ok. so i've not been able to actually exploit this on my oneiric development system which is running trunk with 'novascript'.

What I was trying was:
 * upload a kernel $AKI
 * create ramdisk $ARI that functions as a root device with ssh (i was able to re-use the root filesystem from ttylinux.tar.gz image, I just had to create a symlink to /sbin/init from /init and bundle it as a initramfs friendly cpio archive).

After that, I can boot an instance with that kernel and intiramfs and have access to the disk that I upload.

Then, i had a running instance that has access /dev/vda.

However, when i create a image like 'qemu-img create -f qcow2 -b /home/ubuntu/src/nova/instances/instance-00000054/disk backed-by-i054.img' and upload that with the given AKI and ARI from above, I get permission denied errors in the log. This is obviously good, but i'm not quite sure why i'm getting them.

So, to be fair, at this point i have not actually been able to exploit this.

Revision history for this message
Scott Moser (smoser) wrote :

locally i fixed the import format suggestion from comment 12. I'm ready to push this to a branch both trunk and milestone whenver.

Revision history for this message
Scott Moser (smoser) wrote :

revised patch addresses 'import' of exception, and uses '_(' for exception reasons.

Revision history for this message
Vish Ishaya (vishvananda) wrote :

+ reason=_("Dangerous! fmt=%s with backing file: %s") %
+ (fmt, data['backing file']))

needs to use dictionary format when there are multiple replacements

Revision history for this message
Scott Moser (smoser) wrote :

OK. i have that fixed in local branch, and tests run against both milestone and trunk. and i've tested the changes do what i want with trunk. I'll attach both patches.

Revision history for this message
Scott Moser (smoser) wrote :
Revision history for this message
Scott Moser (smoser) wrote :
Changed in nova:
status: In Progress → Fix Committed
Revision history for this message
Scott Moser (smoser) wrote :

The fix for this bug is committed to
 * trunk at revision 1604 [1]
 * diablo milestone branch at revision 1192 [2]

[1] http://bazaar.launchpad.net/~hudson-openstack/nova/trunk/revision/1604
[2] http://bazaar.launchpad.net/~hudson-openstack/nova/milestone-proposed/revision/1192

Thierry Carrez (ttx)
visibility: private → public
Changed in nova:
status: Fix Committed → Fix Released
To post a comment you must log in.
This report contains Public Security information  
Everyone can see this security related information.

Other bug subscribers

Remote bug watches

Bug watches keep track of this bug in other bug trackers.