boot fails when a kernel filesystem can't be mounted (e.g., due to a dangling symlink)

Bug #1096079 reported by LaMont Jones
This bug affects 2 people
Affects: mountall (Ubuntu)
Status: Triaged
Importance: Medium
Assigned to: Unassigned
Milestone: (not set)

Bug Description

In 2.15.3 (lucid), and probably later, if I have this in fstab, then the boot locks up:
none /srv/chroots/raring-amd64/dev/shm tmpfs defaults 0 0

Looking at the target for the mount:
lrwxrwxrwx 1 root root 8 2012-11-07 23:05 /srv/chroots/raring-amd64/dev/shm -> /run/shm
On the lucid machine, /run/shm does not exist, of course.

Changing that to:
none /srv/chroots/raring-amd64/run/shm tmpfs defaults 0 0
works around the issue.

A dangling symlink in fstab should not cause a completely silent failure to boot.
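
A check along the following lines could be run before rebooting to catch the situation described above. This is only a minimal sketch and not part of mountall: the function names `parse_fstab` and `dangling_mount_points` are hypothetical, and the check only looks at whether the mount point itself is a symlink whose target does not exist.

```python
import os


def parse_fstab(text):
    """Yield (device, mount_point, fs_type, options) for each fstab entry."""
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        fields = line.split()
        if len(fields) < 3:
            continue  # malformed entry, ignore it in this sketch
        options = fields[3] if len(fields) > 3 else "defaults"
        yield fields[0], fields[1], fields[2], options


def dangling_mount_points(fstab_text):
    """Return mount points that are symlinks pointing at a missing target."""
    broken = []
    for _dev, mount_point, _fs_type, _opts in parse_fstab(fstab_text):
        # islink() inspects the link itself; exists() follows the link,
        # so a link with a missing target is "islink and not exists".
        if os.path.islink(mount_point) and not os.path.exists(mount_point):
            broken.append(mount_point)
    return broken
```

Passing the contents of /etc/fstab to `dangling_mount_points` returns the list of suspect mount points. On a machine set up like the one in this report, the /srv/chroots/raring-amd64/dev/shm entry would be expected to appear in that list.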

Tags: lucid
Steve Langasek (vorlon) wrote :

Hi LaMont,

This problem is entirely specific to the fact that this is a tmpfs mount being affected. A tmpfs mount meets mountall's definition of a "virtual" filesystem, and mountall (wisely) does not encode policy about which mount points for virtual filesystems are required at boot or not because this changes over time. And until the virtual filesystems are mounted, nothing else on the system can start up - including the plymouth splash - so there's no sane way for mountall to give you an option to skip over this mount.

If this had been anything *other* than a virtual filesystem, you would certainly have been given an option to skip the missing mount. As it is, the missing mount point should probably be treated as a hard failure by mountall and cause the system to drop to a root shell. It would be helpful if you could confirm that this problem actually happens with later versions of mountall.

Other configurations that would have avoided this problem:
 - marking the mount 'nobootwait', so mountall knows not to block boot events waiting for it
 - mounting as part of the chroot setup rather than in the system /etc/fstab, which I think is more typical (though from context it appears this was a static chroot, so that's probably not an option here).

Changed in mountall (Ubuntu):
importance: Undecided → Low
Launchpad Janitor (janitor) wrote :

Status changed to 'Confirmed' because the bug affects multiple users.

Changed in mountall (Ubuntu):
status: New → Confirmed
Steve Langasek (vorlon)
Changed in mountall (Ubuntu):
importance: Low → Medium
Steve Langasek (vorlon) wrote :

Looking over this bug, I don't think there's anything we can do here to improve mountall's handling without compromising the guarantee of a consistent, race-free boot. We simply have no way of knowing that this particular tmpfs mount is not "required" for boot unless the admin marks it so in the /etc/fstab (with the 'nobootwait' option). And we can't reliably interact with the user on console until after the "virtual-filesystems" event is sent, because this is a prerequisite for udev, and udev needs to poke the hardware to make sure we wind up with the right console drivers. So the system is at an impasse.

I certainly agree that we don't want to leave the machine deadlocked with no output on the screen; but architecturally, I just don't see a way to address this corner case without introducing substantial bugs in more common scenarios.
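
To find entries that fall into this category ahead of time, an admin could scan /etc/fstab for virtual filesystem mounts that are not marked 'nobootwait'. The sketch below is illustrative only: the `VIRTUAL_TYPES` set is an assumption about which filesystem types count as "virtual" and is not copied from mountall's source, and `boot_blocking_entries` is a hypothetical helper name.

```python
# Assumed set of "virtual" filesystem types; mountall's real list may differ.
VIRTUAL_TYPES = {"tmpfs", "proc", "sysfs", "devpts", "devtmpfs"}


def boot_blocking_entries(fstab_text):
    """Return (mount_point, fs_type) pairs for virtual filesystems
    that lack the 'nobootwait' option."""
    flagged = []
    for line in fstab_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        fields = line.split()
        if len(fields) < 4:
            continue  # need device, mount point, type and options
        mount_point, fs_type, options = fields[1], fields[2], fields[3]
        if fs_type in VIRTUAL_TYPES and "nobootwait" not in options.split(","):
            flagged.append((mount_point, fs_type))
    return flagged
```

Given the fstab line from the bug description, this flags the /srv/chroots/raring-amd64/dev/shm entry; once 'nobootwait' is added to its options field, the entry no longer shows up in the result.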

Changed in mountall (Ubuntu):
status: Confirmed → Won't Fix
Steve Langasek (vorlon) wrote : Re: boot fails when a tmpfs can't be mounted due to a dangling symlink

I'm still going back and forth on this. There's also bug #1152274, which I'm marking as a duplicate. We really should do something better here; I'm just not sure what.

FWIW, printing messages to the console just got harder, because mountall has just been changed to log to /var/log/upstart instead of being bound to the console at all. Maybe we need to revisit that decision.

summary: - boot fails when a mount is a dangling symlink
+ boot fails when a tmpfs can't be mounted due to a dangling symlink
Changed in mountall (Ubuntu):
status: Won't Fix → Triaged
summary: - boot fails when a tmpfs can't be mounted due to a dangling symlink
+ boot fails when a kernel filesystem can't be mounted (e.g., due to a dangling symlink)
Adam Conrad (adconrad) wrote :

Console logging this early in the boot should be trivial via /dev/kmsg, if nothing else? Maybe that's mildly abusive of the interface, but any attempt to give people a useful hint about why their boot is failing sure beats just sitting there.
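
As a rough illustration of the /dev/kmsg idea: the kernel accepts lines written to /dev/kmsg, optionally prefixed with a syslog-style priority in angle brackets, and adds them to the kernel log. The sketch below is in Python for brevity and is not how mountall (a C program) would implement it; `format_kmsg_line` and `log_early` are hypothetical names, and writing to the real device requires root.

```python
import os

KERN_ERR = 3  # syslog priority level for "error"


def format_kmsg_line(tag, message, priority=KERN_ERR):
    """Build one record: '<priority>tag: message' terminated by a newline."""
    # A record must be a single line, so fold embedded newlines into spaces.
    text = " ".join(message.splitlines())
    return "<%d>%s: %s\n" % (priority, tag, text)


def log_early(tag, message, path="/dev/kmsg"):
    """Write one record to the kernel log (or to `path` when testing)."""
    data = format_kmsg_line(tag, message).encode("utf-8")
    fd = os.open(path, os.O_WRONLY)
    try:
        os.write(fd, data)  # one write() call per record
    finally:
        os.close(fd)
```

A call such as `log_early("mountall", "mount point is a dangling symlink")` would then be one way to leave a hint in the kernel log. Whether that log reaches the screen this early in boot depends on the console configuration, so this is a hint mechanism rather than a guarantee.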

Ken Sharp (kennybobs)
tags: added: lucid