mountall loops on pipe2() while mounting /sys/kernel/debug when running on Xen

Bug #469985 reported by William Pitcock
This bug affects 8 people
Affects: mountall (Ubuntu)
Status: Fix Released
Importance: High
Assigned to: Scott James Remnant (Canonical)
Declined for Karmic by Scott James Remnant (Canonical)
Declined for Lucid by Scott James Remnant (Canonical)

Bug Description

Binary package hint: mountall

Ubuntu 9.10, which introduces the mountall utility, is unable to boot under a Debian 2.6.26 Xen domU kernel.

With the boot process edited to show what mountall is doing, mountall gets stuck at "mounting /sys/kernel/debug".

Here is the full boot log:

track_usplash: will not start usplash
parse_filesystems: reading filesystems
parse_filesystems: sysfs (nodev)
parse_filesystems: rootfs (nodev)
parse_filesystems: bdev (nodev)
parse_filesystems: proc (nodev)
parse_filesystems: cgroup (nodev)
parse_filesystems: cpuset (nodev)
parse_filesystems: debugfs (nodev)
parse_filesystems: securityfs (nodev)
parse_filesystems: sockfs (nodev)
parse_filesystems: usbfs (nodev)
parse_filesystems: pipefs (nodev)
parse_filesystems: anon_inodefs (nodev)
parse_filesystems: tmpfs (nodev)
parse_filesystems: inotifyfs (nodev)
parse_filesystems: devpts (nodev)
parse_filesystems: ramfs (nodev)
parse_filesystems: mqueue (nodev)
parse_filesystems: ext3
new_mount: /: /dev/root rootfs defaults check
new_mount: /proc: - proc nodev,noexec,nosuid
new_mount: /proc/sys/fs/binfmt_misc: - binfmt_misc nodev,noexec,nosuid,optional
new_mount: /sys: - sysfs nodev,noexec,nosuid
new_mount: /sys/fs/fuse/connections: - fusectl optional
new_mount: /sys/kernel/debug: - debugfs optional
new_mount: /sys/kernel/security: - securityfs optional
new_mount: /spu: - spufs gid=spu,optional
new_mount: /dev: - tmpfs mode=0755 hook
new_mount: /dev/pts: - devpts noexec,nosuid,gid=tty,mode=0620
new_mount: /dev/shm: - tmpfs nosuid,nodev
new_mount: /tmp: - - - hook
new_mount: /var/run: - tmpfs mode=0755,nosuid,showthrough hook
new_mount: /var/lock: - tmpfs nodev,noexec,nosuid,showthrough
new_mount: /lib/init/rw: - tmpfs mode=0755,nosuid,optional
parse_fstab: updating mounts
update_mount: /proc: proc proc defaults
new_mount: none: /dev/sda1 swap sw
update_mount: /: /dev/sda2 ext3 noatime,nodiratime,errors=remount-ro,relatime check
parse_mountinfo_file: updating mounts
update_mount: /sys: - sysfs nodev,noexec,nosuid
update_mount: /proc: proc proc defaults
update_mount: /dev: udev tmpfs mode=0755 hook
update_mount: /: /dev/sda2 ext3 noatime,nodiratime,errors=remount-ro,relatime check
new_mount: /dev/.static/dev: /dev/sda2 ext3 -
mount_policy: /proc/sys/fs/binfmt_misc: dropping unknown filesystem
mount_policy: /sys/fs/fuse/connections: dropping unknown filesystem
mount_policy: /spu: dropping unknown filesystem
mount_policy: / is local (root)
mount_policy: /proc can be mounted while root readonly
mount_policy: /proc prior fstab entry /
mount_policy: /proc is virtual
mount_policy: /sys can be mounted while root readonly
mount_policy: /sys is virtual
mount_policy: /sys/kernel/debug parent is /sys
mount_policy: /sys/kernel/debug is virtual
mount_policy: /sys/kernel/security parent is /sys
mount_policy: /sys/kernel/security is virtual
mount_policy: /dev can be mounted while root readonly
mount_policy: /dev is virtual
mount_policy: /dev/pts parent is /dev
mount_policy: /dev/pts is virtual
mount_policy: /dev/shm parent is /dev
mount_policy: /dev/shm is virtual
mount_policy: /tmp parent is /
mount_policy: /tmp is local
mount_policy: /var/run can be mounted while root readonly
mount_policy: /var/run is virtual
mount_policy: /var/lock can be mounted while root readonly
mount_policy: /var/lock is virtual
mount_policy: /lib/init/rw can be mounted while root readonly
mount_policy: /lib/init/rw is virtual
mount_policy: /dev/sda1 is swap
mount_policy: /dev/.static/dev parent is /dev
mount_policy: /dev/.static/dev is other (default)
mounted: /proc
mounted: local 0/2 remote 0/0 virtual 1/10 swap 0/1
mounted: /sys
mounted: local 0/2 remote 0/0 virtual 2/10 swap 0/1
mounted: /dev
dev_hook: populating /dev
mounted: local 0/2 remote 0/0 virtual 3/10 swap 0/1
mounted: /dev/.static/dev
mounted: local 0/2 remote 0/0 virtual 3/10 swap 0/1
try_mount: / waiting for device /dev/sda2
queue_fsck: /sys/kernel/debug: no check required
mounting /sys/kernel/debug
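For readers following the log: the parse_filesystems lines match the layout of /proc/filesystems, where each line holds an optional "nodev" marker, a tab, and a filesystem type name. Below is a minimal C sketch of parsing one such line. The function name parse_filesystem_line is hypothetical, and the code is an illustration, not an excerpt from the mountall source.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Parse one line in the /proc/filesystems layout, e.g. "nodev\tsysfs\n"
 * or "\text3\n". On success the type name is copied into `name` and
 * `*nodev` says whether the type needs no backing block device.
 * Hypothetical helper for illustration only. */
static bool parse_filesystem_line(const char *line, char *name,
                                  size_t name_len, bool *nodev)
{
    const char *tab = strchr(line, '\t');
    if (tab == NULL)
        return false;               /* not in the expected layout */

    /* The field before the tab is either empty or exactly "nodev". */
    *nodev = (tab - line == 5 && strncmp(line, "nodev", 5) == 0);

    const char *start = tab + 1;
    size_t len = strcspn(start, "\n");
    if (len == 0 || len >= name_len)
        return false;               /* empty name, or buffer too small */

    memcpy(name, start, len);
    name[len] = '\0';
    return true;
}
```

Feeding it the lines of /proc/filesystems one at a time (for example via fgets) would yield the same type names and "(nodev)" markers that appear in the log above.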

Please fix this as soon as possible before more people who run Ubuntu Server on Xen virtual machines shoot themselves in the foot. This bug is a critical issue.

William Pitcock (nenolod) wrote :

It should also be mentioned that this bug causes the Ubuntu 9.10 VM to use 100% CPU until it is stopped, which is just annoying.

srb (stock) wrote :

I'm running into a similar issue with Xen using 9.10 as a domU with kernel version 2.6.18. Adding a note about this to https://help.ubuntu.com/community/KarmicUpgrades would be hugely helpful to Xen users who have not yet upgraded and are unaware of mountall's kernel requirements.
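As a rough pre-upgrade check, the running kernel's release string can be compared against 2.6.27, the Linux version in which pipe2() first appeared. The C sketch below is only a heuristic: the helper names release_at_least and running_kernel_has_pipe2 are made up for this example, and a distribution kernel may carry backports that a version comparison cannot see.

```c
#include <stdbool.h>
#include <stdio.h>
#include <sys/utsname.h>

/* Return true if a release string such as "2.6.26-2-xen-amd64" is at
 * least major.minor.patch. Anything after the numeric prefix is ignored.
 * Hypothetical helper for illustration only. */
static bool release_at_least(const char *release,
                             int major, int minor, int patch)
{
    int a = 0, b = 0, c = 0;

    /* Accept "X.Y" as well as "X.Y.Z"; a missing patch level counts as 0. */
    if (sscanf(release, "%d.%d.%d", &a, &b, &c) < 2)
        return false;
    if (a != major)
        return a > major;
    if (b != minor)
        return b > minor;
    return c >= patch;
}

/* Heuristic: does the running kernel's version suggest pipe2() exists? */
static bool running_kernel_has_pipe2(void)
{
    struct utsname u;

    if (uname(&u) != 0)
        return false;
    return release_at_least(u.release, 2, 6, 27);
}
```

With the kernels named in this report, release_at_least("2.6.26-2-xen-amd64", 2, 6, 27) and release_at_least("2.6.18", 2, 6, 27) are both false.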

Scott James Remnant (Canonical) (canonical-scott) wrote :

Which process is consuming the CPU: mountall or mount? Please strace the process and provide the output so we can see what it's doing.

Changed in mountall (Ubuntu):
status: New → Incomplete
importance: Undecided → High

Berteun (berteun) wrote :

I see the same behaviour with an Ubuntu domU under a Debian Lenny system; the kernel used is Debian's 2.6.26-2-xen-amd64. I provide an strace file of mountall as called by mountall.conf. The whole strace file was over 100 MB, but the last lines were all the same: the final line just keeps repeating, so mountall seems to be stuck in a loop there.

Berteun (berteun) wrote :

Long story short: it won't work directly, because the Debian kernel is too old. mountall uses pipe2 (http://www.kernel.org/doc/man-pages/online/pages/man2/pipe.2.html) when mounting. In mountall.c, in the function 'spawn', the line NIH_ZERO (pipe2 (fds, O_CLOEXEC)); causes an infinite loop, since pipe2 is not available in 2.6.26 (it was added in Linux 2.6.27). A newer kernel is required to make this work.

Hence the non-working syscall is pipe2.
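To illustrate why such a loop never ends, here is a small C sketch. It assumes a retry-until-zero wrapper around a call that fails the same way every time; the libnih macro definition is not reproduced here, and missing_pipe2 and retry_until_zero are hypothetical names. The retry count is capped so the example terminates, whereas an uncapped retry would spin at 100% CPU as reported above.

```c
#include <errno.h>

/* Hypothetical stand-in for pipe2() on a kernel without the syscall:
 * every call fails with ENOSYS, no matter how often it is repeated. */
static int missing_pipe2(int fds[2], int flags)
{
    (void) fds;
    (void) flags;
    errno = ENOSYS;
    return -1;
}

/* Call `fn` until it returns zero, giving up after `limit` attempts.
 * Returns the number of attempts made. Without the limit, a call that
 * can never succeed would be retried forever. */
static long retry_until_zero(int (*fn)(int[2], int), int fds[2],
                             int flags, long limit)
{
    long attempts = 0;

    while (attempts < limit) {
        attempts++;
        if (fn(fds, flags) == 0)
            break;
    }
    return attempts;
}
```

Calling retry_until_zero(missing_pipe2, fds, 0, 1000) uses up all 1000 attempts, and errno is still ENOSYS afterwards: retrying cannot help when the kernel does not implement the call.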

Scott James Remnant (Canonical) (canonical-scott) wrote :

Right, there should be no reason to loop on pipe2() like that; this should fail a bit more gracefully. (I'm treating the looping as the bug, not the failure on the older kernel.)
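One way to fail more gracefully is to attempt the call once and hand any error back to the caller. The C sketch below does that and, as an extra that is purely an assumption of this example, falls back to pipe() plus fcntl() when the kernel reports ENOSYS. It is not the fix that went into mountall; the helper names cloexec_pipe and set_cloexec are hypothetical.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE             /* for syscall() */
#endif
#include <errno.h>
#include <fcntl.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Mark a descriptor close-on-exec. Returns 0 on success, -1 on error. */
static int set_cloexec(int fd)
{
    int flags = fcntl(fd, F_GETFD);

    if (flags < 0)
        return -1;
    return fcntl(fd, F_SETFD, flags | FD_CLOEXEC);
}

/* Create a pipe whose ends are both close-on-exec.
 * Makes a single pipe2 attempt through the raw syscall, so the sketch
 * does not depend on the C library shipping a pipe2() wrapper. If the
 * kernel lacks the call (ENOSYS), falls back to pipe() + fcntl().
 * Returns 0 on success, -1 with errno set on failure; never retries. */
static int cloexec_pipe(int fds[2])
{
    if (syscall(SYS_pipe2, fds, O_CLOEXEC) == 0)
        return 0;
    if (errno != ENOSYS)
        return -1;              /* a real error: report it, do not loop */

    if (pipe(fds) != 0)
        return -1;
    if (set_cloexec(fds[0]) != 0 || set_cloexec(fds[1]) != 0) {
        int saved = errno;

        close(fds[0]);
        close(fds[1]);
        errno = saved;
        return -1;
    }
    return 0;
}
```

A caller would check the return value once and then either continue or print strerror(errno) and exit. Note that the fallback leaves a short window between pipe() and fcntl() in which another thread could fork and exec with the descriptors still inheritable; closing that window is the reason pipe2() exists.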

summary: - mountall gets stuck in a busywait while mounting /sys/kernel/debug when
- running on Xen
+ mountall loops on pipe2() while mounting /sys/kernel/debug when running
+ on Xen
Changed in mountall (Ubuntu):
status: Incomplete → Triaged

Scott James Remnant (Canonical) (canonical-scott) wrote :

Unlikely to backport this to karmic, so declined. Also no reason to make this release-critical for lucid; it will be fixed long before any beta.

Changed in mountall (Ubuntu):
status: Triaged → Fix Committed
assignee: nobody → Scott James Remnant (scott)
Changed in mountall (Ubuntu):
milestone: none → lucid-alpha-2

Launchpad Janitor (janitor) wrote :

This bug was fixed in the package mountall - 2.0

---------------
mountall (2.0) lucid; urgency=low

  [ Scott James Remnant ]
  * "mount" event changed to "mounting", to make it clear it happens
    before the filesystem is mounted. Added "mounted" event which
    happens afterwards.
  * Dropped the internal hooks, these are now better handled by Upstart
    jobs on the "mounted" event.
  * Dropped the call to restorecon for tmpfs filesystems, this can also be
    handled by an Upstart job supplied by SELinux now.
    - mounted-dev.conf replaces /dev hook, uses MAKEDEV to make devices.
    - mounted-varrun.conf replaces /var/run hook
    - mounted-tmp.conf replaces /tmp hook.
      + Hook will be run for any /tmp mountpoint. LP: #478392.
      + Switching back to using "find" fixes $TMPTIME to be in days again,
        rather than hours. LP: #482602
  * Try and make mountpoints, though we only care about failure if the
    mountpoint is marked "optional" since otherwise the filesystem might
    make the mountpoint or something.
  * Rather than hiding the built-in mountpoints inside the code, put them
    in a new /lib/init/fstab file; that way users can copy the lines into
    /etc/fstab if they wish to override them in some interesting way.
  * Now supports multiple filesystem types listed in fstab, the whole
    comma-separated list is passed to mount and then /proc/self/mountinfo
    is reparsed to find out what mount actually did.
    * /dev will be mounted as a devtmpfs filesystem if supported by the
      kernel (which then does not need to run the /dev hook script).
  * Filesystem checks may be forced by adding force-fsck to the kernel
    command-line.
  * Exit gracefully with an error on failed system calls, don't infinite
    loop over them. LP: #469985.
  * Use plymouth for all user communication, replacing existing usplash and
    console code;
    * When plymouth is running, rather than exiting on failures, prompt the
      user as to whether to fix the problem (if possible), ignore the problem,
      ignore the mountpoint or drop to a maintenance shell. LP: #489474.
    * If plymouth is not running for whatever reason, the fallback action
      is always to start the recovery shell.
  * Adjust the set of filesystems that we wait for by default: LP: #484234.
    * Wait for all local filesystems, except those marked with the
      "nobootwait" option.
    * Wait for remote filesystems mounted as, or under, /usr or /var, and
      those marked with the "bootwait" option.
  * Always try network mount points, since we allow them to fail silently;
    SIGUSR1 now simply retries them once more. LP: #470776.
  * Don't retry devices repeatedly. LP: #480564.
  * Added manual pages for the events emitted by this tool.

  [ Johan Kiviniemi ]
  * Start all fsck instances in parallel, but set their priorities so that
    thrashing is avoided. LP: #491389.
 -- Scott James Remnant <email address hidden> Mon, 21 Dec 2009 23:09:23 +0000
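The force-fsck entry in the changelog above refers to the kernel command line, which Linux exposes in /proc/cmdline. Below is a minimal C sketch of checking a command-line string for a whole-word option. How mountall itself performs this check is not shown in this report, so the function name cmdline_has_flag and the matching rule (whitespace-separated, exact word) are assumptions of this example.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Return true if `flag` appears in `cmdline` as a complete
 * whitespace-separated word. "force-fsck" therefore does not match
 * "noforce-fsck" or "force-fsck=1". Hypothetical helper for
 * illustration only. */
static bool cmdline_has_flag(const char *cmdline, const char *flag)
{
    size_t flag_len = strlen(flag);
    const char *p = cmdline;

    if (flag_len == 0)
        return false;

    while (*p != '\0') {
        p += strspn(p, " \t\n");                /* skip separators */
        size_t word_len = strcspn(p, " \t\n");  /* length of next word */

        if (word_len == flag_len && strncmp(p, flag, flag_len) == 0)
            return true;
        p += word_len;
    }
    return false;
}
```

In practice the string would come from reading /proc/cmdline into a buffer; for instance, cmdline_has_flag("root=/dev/sda2 ro force-fsck", "force-fsck") is true.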

Changed in mountall (Ubuntu):
status: Fix Committed → Fix Released

Alex Mitchell (alexm-nus) wrote :

Any suggestions as to what people such as myself should do who just upgraded to 9.10 without knowing about this gotcha, and now can't boot? I can get my server up manually, but that's a temporary fix. Waiting until April for Lucid seems a bit unreasonable, and I assume trying to downgrade to 9.04 would be messy.

Also, shouldn't this problem be added here https://help.ubuntu.com/community/KarmicUpgrades as srb suggested? Would have saved me a lot of hassle... :(

Any suggestions would be greatly appreciated!
