lxc container can control other container's cpu share, memory limit, or access of block and character devices

Bug #1088295 reported by Lawrance
This bug affects 1 person
Affects             Status         Importance   Assigned to   Milestone
openstack-manuals   Fix Released   High         Anne Gentle
libvirt (Ubuntu)    Fix Released   Medium       Unassigned

Bug Description

I installed OpenStack with nova-compute-lxc. As we know, OpenStack uses cgroups to limit the resources of an LXC instance, but after I installed cgroup-bin inside an LXC instance, I could control other containers' CPU shares, memory limits, access to block and character devices, and so on.
I suspect that the host and the LXC instances share the cgroup hierarchy, since it is kernel based?

In the LXC instance:
# shows four instances; we are in the instance named "instance-0000011e"
root@openstack:/sys/fs/cgroup/devices/libvirt/lxc# ll
total 0
drwxr-xr-x 6 root root 0 Dec 7 16:19 ./
drwxr-xr-x 4 root root 0 Dec 7 12:45 ../
-rw-r--r-- 1 root root 0 Dec 7 12:45 cgroup.clone_children
--w--w--w- 1 root root 0 Dec 7 12:45 cgroup.event_control
-rw-r--r-- 1 root root 0 Dec 7 12:45 cgroup.procs
--w------- 1 root root 0 Dec 7 12:45 devices.allow
--w------- 1 root root 0 Dec 7 12:45 devices.deny
-r--r--r-- 1 root root 0 Dec 7 12:45 devices.list
drwxr-xr-x 3 root root 0 Dec 7 16:19 instance-0000011e/
drwxr-xr-x 2 root root 0 Dec 7 16:22 instance-0000011f/
drwxr-xr-x 3 root root 0 Dec 7 17:28 instance-00000120/
drwxr-xr-x 2 root root 0 Dec 10 08:35 instance-00000121/
-rw-r--r-- 1 root root 0 Dec 7 12:45 notify_on_release
-rw-r--r-- 1 root root 0 Dec 7 12:45 tasks
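
A minimal sketch of how one might check, from inside a guest, which sibling cgroups are exposed. The function name exposed_siblings and its two arguments are hypothetical and not part of cgroup-bin or libvirt; the sketch only tests whether each sibling's devices.deny file is writable by the caller.

```shell
#!/bin/sh
# exposed_siblings CGROUP_DIR SELF_NAME   (hypothetical helper)
# Prints the name of every cgroup directory under CGROUP_DIR, other than
# SELF_NAME, whose devices.deny file is writable by the caller.
exposed_siblings() {
    cgroup_dir=$1
    self_name=$2
    for dir in "$cgroup_dir"/*/; do
        # An unmatched glob stays literal, so skip anything that is not a directory.
        [ -d "$dir" ] || continue
        name=$(basename "$dir")
        [ "$name" = "$self_name" ] && continue
        [ -w "$dir/devices.deny" ] && printf '%s\n' "$name"
    done
    return 0
}
```

With the listing above, it could be called as: exposed_siblings /sys/fs/cgroup/devices/libvirt/lxc instance-0000011e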

# we can see instance-00000121's devices.list; at this point mknod /dev/kvm works in instance-00000121
root@openstack:/sys/fs/cgroup/devices/libvirt/lxc# cat instance-00000121/devices.list
c 10:* rwm
c 1:3 rwm
c 1:5 rwm
c 1:7 rwm
c 1:8 rwm
c 1:9 rwm
c 5:0 rwm
c 5:2 rwm
c 136:* rwm

# change the device list of instance-00000121 and look at it again; now we CANNOT mknod /dev/kvm
root@openstack:/sys/fs/cgroup/devices/libvirt/lxc# echo "c 10:* rwm" >> instance-00000121/devices.deny
root@openstack:/sys/fs/cgroup/devices/libvirt/lxc# cat instance-00000121/devices.list
c 1:3 rwm
c 1:5 rwm
c 1:7 rwm
c 1:8 rwm
c 1:9 rwm
c 5:0 rwm
c 5:2 rwm
c 136:* rwm

The same can be done with the memory limit and CPU shares, and the same problem appears if I use native LXC.
Have I reported this bug in the right place?
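
The effect of the devices.deny write can be reasoned about with a small matcher over devices.list entries. This is only a sketch: device_allowed is a hypothetical helper, it ignores the access flags (rwm), and it assumes /dev/kvm is character device 10:232, its usual number on Linux. Each entry has the form "type major:minor access", where * matches any number.

```shell
#!/bin/sh
# device_allowed LIST_FILE TYPE MAJOR MINOR   (hypothetical helper)
# Returns 0 if some entry in LIST_FILE (same format as devices.list)
# covers the given device, 1 otherwise. Access flags are not checked.
device_allowed() {
    list_file=$1
    want_type=$2
    want_major=$3
    want_minor=$4
    # "|| [ -n "$type" ]" keeps a final line that lacks a trailing newline.
    while read -r type numbers access || [ -n "$type" ]; do
        major=${numbers%%:*}
        minor=${numbers##*:}
        # Type "a" covers all device types; "*" covers any major or minor.
        [ "$type" = "a" ] || [ "$type" = "$want_type" ] || continue
        [ "$major" = "*" ] || [ "$major" = "$want_major" ] || continue
        [ "$minor" = "*" ] || [ "$minor" = "$want_minor" ] || continue
        return 0
    done < "$list_file"
    return 1
}
```

For example, device_allowed instance-00000121/devices.list c 10 232 would succeed while the "c 10:* rwm" entry is present and fail once it has been denied, which matches the mknod behaviour described above.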

Tags: nova lxc
Lawrance (jing)
Changed in nova:
status: New → Confirmed
tags: added: lxc
information type: Private Security → Public
Serge Hallyn (serge-hallyn) wrote :

Thanks, this is because per-container apparmor policies are not yet enabled in libvirt-lxc, as they are in lxc.

This can be solved either with apparmor, or (sometime before 14.04) with user namespaces.

Changed in libvirt (Ubuntu):
status: New → Triaged
importance: Undecided → Medium
Lawrance (jing) wrote :

Thanks for your rapid reply.
Sorry, I'm a newbie to apparmor.

1. Should I create an apparmor policy for /usr/lib/libvirt/libvirt_lxc, or for something else?
2. How can I set up per-container apparmor policies?
3. Could I use the apparmor policy for lxc below as a reference?
root@superstack:~# cat /etc/apparmor.d/lxc/lxc-default
# Do not load this file. Rather, load /etc/apparmor.d/lxc-containers, which
# will source all profiles under /etc/apparmor.d/lxc

profile lxc-container-default flags=(attach_disconnected,mediate_deleted) {
  network,
  capability,
  file,
  umount,

  # ignore DENIED message on / remount
  deny mount options=(ro, remount) -> /,

  # allow tmpfs mounts everywhere
  mount fstype=tmpfs,

  # allow mqueue mounts everywhere
  mount fstype=mqueue,

  # allow fuse mounts everywhere
  mount fstype=fuse.*,

  # the container may never be allowed to mount devpts. If it does, it
  # will remount the host's devpts. We could allow it to do it with
  # the newinstance option (but, right now, we don't).
  deny mount fstype=devpts,

  # allow bind mount of /lib/init/fstab for lxcguest
  mount options=(rw, bind) /lib/init/fstab.lxc/ -> /lib/init/fstab/,

  # deny writes in /proc/sys/fs but allow fusectl to be mounted
  mount fstype=binfmt_misc -> /proc/sys/fs/binfmt_misc/,
  deny @{PROC}/sys/fs/** wklx,

  # block some other dangerous paths
  deny @{PROC}/sysrq-trigger rwklx,
  deny @{PROC}/mem rwklx,
  deny @{PROC}/kmem rwklx,
  deny @{PROC}/sys/kernel/[^s][^h][^m]* wklx,
  deny @{PROC}/sys/kernel/*/** wklx,

  # deny writes in /sys except for /sys/fs/cgroup, also allow
  # fusectl, securityfs and debugfs to be mounted there (read-only)
  mount fstype=fusectl -> /sys/fs/fuse/connections/,
  mount fstype=securityfs -> /sys/kernel/security/,
  mount fstype=debugfs -> /sys/kernel/debug/,
  deny mount fstype=debugfs -> /var/lib/ureadahead/debugfs/,
  mount fstype=proc -> /proc/,
  mount fstype=sysfs -> /sys/,
  deny /sys/[^f]*/** wklx,
  deny /sys/f[^s]*/** wklx,
  deny /sys/fs/[^c]*/** wklx,
  deny /sys/fs/c[^g]*/** wklx,
  deny /sys/fs/cg[^r]*/** wklx,
}

Serge Hallyn (serge-hallyn) wrote : Re: [Bug 1088295] Re: lxc container can control other container's cpu share, memory limit, or access of block and character devices

Quoting Lawrance (<email address hidden>):
> Thanks for your rapid reply.
> Sorry, I'm a newbie to apparmor.
>
> 1. Should I create an apparmor policy for /usr/lib/libvirt/libvirt_lxc, or for something else?

libvirt_lxc sets up the container, which requires much more privilege than
the container itself should have. In the lxc package, the program which
starts the container (the equivalent of /usr/lib/libvirt/libvirt_lxc) enters
a temporary domain automatically when it starts; then, right before it
executes /sbin/init in the container, it manually enters the container's
domain (the code was changed to do this).

> 2. How can I set up per-container apparmor policies?
> 3. Could I use the apparmor policy for lxc below as a reference?
> root@superstack:~# cat /etc/apparmor.d/lxc/lxc-default

The policy itself should be a good start for the restrictions you'll
want on containers. However, libvirt already has a sophisticated
security module infrastructure which should probably be extended for
libvirt-lxc.

For a temporary custom solution, it may be possible to create a
domain based upon /etc/apparmor.d/usr.bin.lxc-start, modified
to automatically switch to /etc/apparmor.d/lxc/lxc-default on
executing /sbin/init.

Lawrance (jing) wrote :

Thanks Serge, I'll try.

Thierry Carrez (ttx) wrote :

Serge: is there anything we can do on the Nova side of things? Looks like this has security implications?

Changed in nova:
status: Confirmed → Incomplete
Serge Hallyn (serge-hallyn) wrote :

It definitely has security implications. The apparmor profile is the primary way we protect the host from a guest with the lxc package (which openstack does not use), preventing things like writing to /proc/sysrq-trigger.

Nova could move containers into a container apparmor profile itself after starting them... Note that some things will end up not being possible by default - for instance an lxc guest won't be able to install libvirt or lxc because they need to mount cgroups, which is not safe.

But the "right" solution is to implement the libvirt-lxc security operations for apparmor, or to implement the libvirt driver to use lxc from the lxc package.
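
One way to confirm which apparmor profile, if any, a container's init process ended up in is to read its label from /proc/<pid>/attr/current on the host. A rough sketch, assuming apparmor is the active security module; apparmor_label and is_confined are hypothetical helper names, and the optional second argument (an alternative proc root) exists only to make the functions easy to try out.

```shell
#!/bin/sh
# apparmor_label PID [PROC_ROOT]   (hypothetical helper)
# Prints the security label of the task as exposed under attr/current.
apparmor_label() {
    pid=$1
    proc_root=${2:-/proc}
    cat "$proc_root/$pid/attr/current"
}

# is_confined PID [PROC_ROOT]   (hypothetical helper)
# Returns 0 if the task has a label other than "unconfined",
# 1 if it is unconfined, 2 if the label could not be read.
is_confined() {
    label=$(apparmor_label "$@") || return 2
    [ -n "$label" ] && [ "$label" != "unconfined" ]
}
```

An unconfined task reports "unconfined"; a confined one reports the profile name followed by its mode, for example "lxc-container-default (enforce)".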

Daniel Berrange (berrange) wrote :

> Serge: is there anything we can do on the Nova side of things? Looks like this has security implications?

Providing sVirt support in libvirt mitigates the lack of security for containers in the kernel, but this is at best a band-aid. Ultimately, we need the user namespace work completed to allow LXC to be considered remotely secure & production ready.

We should make sure our release notes explicitly tell people that LXC is not a secure virtualization technology and discourage its use in production environments. I try to get this message across as widely as possible, but it still gets lost.

Serge Hallyn (serge-hallyn) wrote :

Quoting Daniel Berrange (<email address hidden>):
> > Serge: is there anything we can do on the Nova side of things? Looks
> like this has security implications?
>
> Providing sVirt support in libvirt mitigates the lack of
> security for containers in the kernel, but this is at best a band-aid.
> Ultimately, we need the user namespace work completed to allow LXC to be

For the record, most of it actually has landed upstream (last week).

Thierry Carrez (ttx) wrote :

Yes, that needs to be pretty apparent from our documentation. I'm creating a doc task for that...

affects: nova → openstack-manuals
Changed in openstack-manuals:
importance: Undecided → High
status: Incomplete → Confirmed
Tom Fifield (fifieldt)
tags: added: nova
Anne Gentle (annegentle)
Changed in openstack-manuals:
assignee: nobody → Anne Gentle (annegentle)
Tom Fifield (fifieldt) wrote :
Changed in openstack-manuals:
status: Confirmed → In Progress
Thierry Carrez (ttx) wrote :

Note that the OpenStack Security Group (OSSG) might also issue a security notice about that.

OpenStack Infra (hudson-openstack) wrote : Fix merged to openstack-manuals (master)

Reviewed: https://review.openstack.org/18788
Committed: http://github.com/openstack/openstack-manuals/commit/6b188da11ca022a98463cdcd1652b919c5db74dc
Submitter: Jenkins
Branch: master

commit 6b188da11ca022a98463cdcd1652b919c5db74dc
Author: annegentle <email address hidden>
Date: Mon Dec 31 14:38:36 2012 -0600

    Adds fair warnings about LXC not recommended for use in production.

    Fix bug 1088295

    Patch set adds Thierry's suggested edits.

    Change-Id: If9a215b90649110aaee8a5095c3874ad22a9f8f8

Changed in openstack-manuals:
status: In Progress → Fix Released
Launchpad Janitor (janitor) wrote :

This bug was fixed in the package libvirt - 1.2.8-0ubuntu1

---------------
libvirt (1.2.8-0ubuntu1) utopic; urgency=medium

  [ Chuck Short ]
  * New upstream release: (LP: #1367422)
    + Dropped:
      - debian/patches/ovs-delete-port-if-exists-while-adding-new-one
    + Refreshed:
      - debian/patches/add-cgmanager-support.patch
      - debian/patches/storage-default-permission-mode-to-0711

  [ Serge Hallyn ]
  * d/apparmor
    - install TEMPLATE.qemu and TEMPLATE.lxc
    - add libvirt-lxc abstraction, add permissions to it needed for
      a ubuntu container to start.
    - libvirt-qemu - add qemu-bridge-helper policy from upstream
    - libvirt-qemu - add qemu-microblaze allows from upstream
    - edit lxc.conf to enable apparmor by default (LP: #914716)
      (LP: #1008393) (LP: #1088295)
  * d/apparmor/libvirt-qemu: add /dev/shm as path to spice.* nodes
    for systemd case. (LP: #1365163)
  * d/p/9030-create-socket-dir - create session socket dir if
    needed (Should be replaced eventually by the upstream fix)
  * d/p/9032-lxc-allow-no-security-driver: don't fail if apparmor
    driver is not available (else the qa-regression-tests fail with
    skip_apparmor)
 -- Serge Hallyn <email address hidden> Mon, 15 Sep 2014 18:30:06 -0500
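
Whether the apparmor driver is actually selected for libvirt-lxc can be checked by reading the security_driver setting. A hedged sketch: it assumes the setting lives in /etc/libvirt/lxc.conf in the form security_driver = "apparmor" (the changelog above only says "lxc.conf"), and lxc_security_driver is a hypothetical helper name.

```shell
#!/bin/sh
# lxc_security_driver [CONF_FILE]   (hypothetical helper)
# Prints the value of the last uncommented security_driver = "..." line.
# The default path is an assumption; pass another file to override it.
lxc_security_driver() {
    conf=${1:-/etc/libvirt/lxc.conf}
    # Lines starting with "#" do not match, so commented defaults are ignored.
    sed -n 's/^[[:space:]]*security_driver[[:space:]]*=[[:space:]]*"\([^"]*\)".*$/\1/p' "$conf" | tail -n 1
}
```

It prints nothing when the setting is absent or commented out.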

Changed in libvirt (Ubuntu):
status: Triaged → Fix Released