Unprivileged LXC will not start after today's updates

Bug #1549363 reported by Christopher Townsend
This bug affects 4 people
Affects: lxc (Ubuntu)
Status: Fix Released
Importance: High
Assigned to: Unassigned

Bug Description

After today's (Feb. 24, 2016) updates, unprivileged LXC containers no longer start. The 'start_lxc.out' debug log is attached as well.

ProblemType: Bug
DistroRelease: Ubuntu 16.04
Package: lxc 2.0.0~rc2-0ubuntu2
ProcVersionSignature: Ubuntu 4.4.0-7.22-generic 4.4.2
Uname: Linux 4.4.0-7-generic x86_64
ApportVersion: 2.20-0ubuntu3
Architecture: amd64
CurrentDesktop: Unity
Date: Wed Feb 24 11:16:57 2016
InstallationDate: Installed on 2013-03-18 (1072 days ago)
InstallationMedia: Ubuntu 12.10 "Quantal Quetzal" - Release amd64 (20121017.5)
PackageArchitecture: all
SourcePackage: lxc
UpgradeStatus: Upgraded to xenial on 2015-10-28 (119 days ago)
modified.conffile..etc.apparmor.d.lxc.lxc.default.with.nesting: [modified]
modified.conffile..etc.default.lxc: [modified]
mtime.conffile..etc.apparmor.d.lxc.lxc.default.with.nesting: 2016-02-22T17:39:08
mtime.conffile..etc.default.lxc: 2016-02-22T17:39:08

Revision history for this message
Christopher Townsend (townsend) wrote :
description: updated
Revision history for this message
Christopher Townsend (townsend) wrote :

After the latest lxc updates (2.0.0~rc3-0ubuntu1), a new failure is now occurring. See attached debug log.

Revision history for this message
Launchpad Janitor (janitor) wrote :

Status changed to 'Confirmed' because the bug affects multiple users.

Changed in lxc (Ubuntu):
status: New → Confirmed
Revision history for this message
William Grant (wgrant) wrote :

The cgroup errors go away if I manually create and chown a few parent cgroups, but then the apparmor error returns.
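For anyone wanting to try the same workaround, here is a rough sketch of what "create and chown a few parent cgroups" could look like. The comment above does not say which cgroups were created, so the function name make_user_cgroups, the cgroup name and the loop over every controller are all assumptions made for illustration. The hierarchy root is passed as an argument so the function can be tried on a scratch directory first.

```shell
# Sketch only: create a child cgroup named NAME under every controller
# directory below ROOT and hand it to OWNER.
# make_user_cgroups, NAME and OWNER are hypothetical, not taken from the bug.
make_user_cgroups() {
    root=$1
    name=$2
    owner=$3
    for ctrl in "$root"/*/; do
        ctrl=${ctrl%/}
        # Skip anything that is not a directory (including an unmatched glob).
        [ -d "$ctrl" ] || continue
        # Skip symlinked aliases such as cpu -> cpu,cpuacct.
        [ -L "$ctrl" ] && continue
        mkdir -p "$ctrl/$name" && chown "$owner" "$ctrl/$name" || return 1
    done
}
```

On a real system this would be run as root with something like make_user_cgroups /sys/fs/cgroup lxc-test 1000, where lxc-test and 1000 are placeholders. It only addresses the cgroup permission errors, not the apparmor error.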

Revision history for this message
William Grant (wgrant) wrote :

lxc-container-default-cgns doesn't show up in apparmor_status, despite the other three being there.

If I hack a container's config to use lxc-container-default rather than lxc-container-default-cgns, hack that profile to allow cgfs, and start that container, lxc-container-default-cgns shows up and other unmodified containers can now start. Weird.
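Since lxc-container-default is a prefix of lxc-container-default-cgns, a plain grep on the apparmor_status output can give a false positive. A minimal sketch of an exact-match check follows. The helper name profile_loaded is made up here, and it assumes the output lists one profile name per indented line, optionally followed by a parenthesised mode.

```shell
# Sketch only: read apparmor_status output on stdin and succeed if a
# profile named exactly $1 is listed. profile_loaded is a hypothetical helper.
profile_loaded() {
    awk -v p="$1" '
        {
            sub(/^[ \t]+/, "")     # drop leading indentation
            sub(/ \(.*\)$/, "")    # drop a trailing "(mode)" suffix, if any
        }
        $0 == p { found = 1 }
        END { exit !found }
    '
}
```

Usage would be along the lines of: sudo apparmor_status | profile_loaded lxc-container-default-cgns && echo loaded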

Revision history for this message
Stéphane Graber (stgraber) wrote :

Hmm, does /etc/init.d/apparmor reload fix the profile not being loaded?

We've not been introducing new profiles very often, and those profiles are loaded through apparmor includes, so I can certainly see a standard dh_apparmor being confused by it and not reloading everything properly on upgrade.

But forcing an apparmor reload or rebooting the machine sure should fix that.

Revision history for this message
Stéphane Graber (stgraber) wrote :

As for the cgroups, we've noticed at least one issue in the cgfs logic of LXC, which was fixed earlier today. The package is still going through QA (currently in proposed) and should make it to the release pocket within a couple of hours.

The fix specifically addresses unprivileged but root-owned containers failing to start because they used the wrong cgroup paths. I'm not sure if that covers this bug's specific case. If it doesn't, then we'll have to look at this more closely.

Note that these regressions are showing up as a result of us removing cgmanager and switching to straight cgroupfs. Things also got slightly messier because that particular LXC change ended up landing right around the same time as the first cgns-enabled kernel, which also happened to be broken when used in unprivileged containers.

So what we know right now is:
 - lxc prior to 2.0.0~rc3-0ubuntu2 will fail to set up cgroups for unprivileged containers spawned by the root user, leading to container startup failures.
 - linux prior to 4.4.0-8-generic will fail to mount cgroupfs inside unprivileged containers. The container starts, but pid1 fails immediately and no other processes get spawned.
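
The two thresholds above can be checked with a short script. This is only a sketch: the function names are invented for illustration, the kernel check only understands 4.4.0-N style release strings, and the lxc check covers only the root-owned case named in the first bullet. The lxc comparison relies on dpkg --compare-versions, which orders the ~rc suffixes the Debian way.

```shell
# Sketch only: classify a kernel release string against the 4.4.0-8 threshold.
# Prints "ok", "affected", or "unknown". check_kernel is a hypothetical helper.
check_kernel() {
    case $1 in
        *-*) ;;
        *) echo unknown; return ;;
    esac
    base=${1%%-*}      # "4.4.0" from "4.4.0-7-generic"
    rest=${1#*-}       # "7-generic"
    abi=${rest%%-*}    # "7"
    abi=${abi%%.*}     # also accept package-style "9.24" -> "9"
    if [ "$base" != "4.4.0" ]; then
        echo unknown
        return
    fi
    case $abi in
        ''|*[!0-9]*) echo unknown; return ;;
    esac
    if [ "$abi" -ge 8 ]; then
        echo ok
    else
        echo affected
    fi
}

# Sketch only: succeed if the given lxc package version is at least
# 2.0.0~rc3-0ubuntu2. Requires dpkg; the function name is hypothetical.
lxc_has_root_unpriv_fix() {
    dpkg --compare-versions "$1" ge "2.0.0~rc3-0ubuntu2"
}
```

Typical use would be check_kernel "$(uname -r)" and lxc_has_root_unpriv_fix "$(dpkg-query -W -f='${Version}' lxc)".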

Revision history for this message
William Grant (wgrant) wrote :

I grabbed 2.0.0~rc3-0ubuntu2 before I wrote those comments, as I noticed the cgfs changes. My last reboot was this morning, between upgrading to -0ubuntu1 and -0ubuntu2, so I think the apparmor stuff is still broken. The profile doesn't even appear if I explicitly apparmor_parser -r, -a, -R in any combination, though --debug definitely shows it parsing that file. It's very, very weird.

This is all with Linux 4.4.0-7, since I hadn't noticed 4.4.0-8 sitting in -proposed. I might retry with that.

Revision history for this message
Stéphane Graber (stgraber) wrote :

That's very weird. All my xenial test systems show lxc-container-default-cgns in the apparmor_status output.

Revision history for this message
Christopher Townsend (townsend) wrote :

I have rebooted my machine multiple times and the issue(s) still occur.

Revision history for this message
Christopher Townsend (townsend) wrote :

I updated to the latest LXC (2.0.0~rc3-0ubuntu2) and the cgroup permission issue still occurs. The unprivileged containers are created by the user and started by the user, i.e., not root.

Changed in lxc (Ubuntu):
importance: Undecided → High
Revision history for this message
Christopher Townsend (townsend) wrote :

The containers now start after today's updates (2.0.0~rc4-0ubuntu1). A new kernel was also installed, which may have fixed something too.

At any rate, looks like it's fixed for now.

Thanks!

Revision history for this message
Jesse Sung (wenchien) wrote :

Unprivileged LXC still fails to start here...

kernel: 4.4.0-9.24
lxc: 2.0.0~rc4-0ubuntu1

$ lxc-start -d -n i386-trusty-gui
lxc-start: lxc_start.c: main: 344 The container failed to start.
lxc-start: lxc_start.c: main: 346 To get more details, run the container in foreground mode.
lxc-start: lxc_start.c: main: 348 Additional information can be obtained by setting the --logfile and --logpriority options.

$ lxc-start -F -n i386-trusty-gui
lxc-start: cgfs.c: lxc_cgroupfs_create: 882 Could not set clone_children to 1 for cpuset hierarchy in parent cgroup.
lxc-start: cgfs.c: cgroup_rmdir: 208 Permission denied - cgroup_rmdir: failed to delete /sys/fs/cgroup/memory/user.slice
lxc-start: cgfs.c: cgroup_rmdir: 208 Read-only file system - cgroup_rmdir: failed to delete /sys/fs/cgroup/perf_event/
lxc-start: cgfs.c: cgroup_rmdir: 208 Read-only file system - cgroup_rmdir: failed to delete /sys/fs/cgroup/net_cls,net_prio/
lxc-start: cgfs.c: cgroup_rmdir: 208 Read-only file system - cgroup_rmdir: failed to delete /sys/fs/cgroup/freezer/
lxc-start: cgfs.c: cgroup_rmdir: 208 Permission denied - cgroup_rmdir: failed to delete /sys/fs/cgroup/cpu,cpuacct/user.slice
lxc-start: cgfs.c: cgroup_rmdir: 208 Permission denied - cgroup_rmdir: failed to delete /sys/fs/cgroup/blkio/user.slice
lxc-start: cgfs.c: cgroup_rmdir: 208 Read-only file system - cgroup_rmdir: failed to delete /sys/fs/cgroup/cpuset/
lxc-start: cgfs.c: cgroup_rmdir: 208 Read-only file system - cgroup_rmdir: failed to delete /sys/fs/cgroup/hugetlb/
lxc-start: cgfs.c: cgroup_rmdir: 208 Permission denied - cgroup_rmdir: failed to delete /sys/fs/cgroup/devices/user.slice
lxc-start: cgfs.c: cgroup_rmdir: 208 Permission denied - cgroup_rmdir: failed to delete /sys/fs/cgroup/pids/user.slice/user-1000.slice/session-c2.scope
lxc-start: cgfs.c: cgroup_rmdir: 208 Permission denied - cgroup_rmdir: failed to delete /sys/fs/cgroup/systemd/user.slice/user-1000.slice/session-c2.scope
lxc-start: start.c: lxc_spawn: 1033 failed creating cgroups
lxc-start: start.c: __lxc_start: 1276 failed to spawn 'i386-trusty-gui'
lxc-start: lxc_start.c: main: 344 The container failed to start.
lxc-start: lxc_start.c: main: 348 Additional information can be obtained by setting the --logfile and --logpriority options.
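
To make a log like the one above easier to compare between machines, the cgroup_rmdir failures can be grouped by error string. The following is a small sketch that assumes the message layout shown in this comment. The function name summarize_rmdir_failures is made up for illustration.

```shell
# Sketch only: read lxc-start output on stdin and print a count per error
# string for "cgroup_rmdir: failed to delete" lines, most frequent first.
summarize_rmdir_failures() {
    awk '
        /cgroup_rmdir: failed to delete / {
            reason = $0
            # Keep only the text between the source line number and " - ".
            sub(/ - cgroup_rmdir: failed to delete .*/, "", reason)
            sub(/^.*cgroup_rmdir: [0-9]+ /, "", reason)
            count[reason]++
        }
        END { for (r in count) printf "%d\t%s\n", count[r], r }
    ' | sort -rn
}
```

Fed the foreground output above (for example lxc-start -F -n i386-trusty-gui 2>&1 | summarize_rmdir_failures), it would report six "Permission denied" failures and five "Read-only file system" failures.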

Revision history for this message
Launchpad Janitor (janitor) wrote :

This bug was fixed in the package lxc - 2.0.0~rc5-0ubuntu1

---------------
lxc (2.0.0~rc5-0ubuntu1) xenial; urgency=medium

  * New upstream release (2.0.0~rc5)
    - Fix a number of cgfs issues (LP: #1549363, LP: #1543697, LP: #1552355)
    - Fix attach failing to allocate a tty (LP: #1551960)
    - Fix LXC rebooting the container despite post-stop failure
    - Fix lxc-copy output (LP: #1551935)
    - Documentation, manpages and manpage translations update
    - Update to the plamo template

 -- Stéphane Graber <email address hidden> Thu, 03 Mar 2016 11:05:25 -0500

Changed in lxc (Ubuntu):
status: Confirmed → Fix Released