NFS mounts at boot time prevent boot or print spurious errors

Bug #504224 reported by Alvin
This bug affects 22 people
Affects            Status        Importance  Assigned to  Milestone
mountall (Ubuntu)  Fix Released  Medium      Unassigned
Lucid              Fix Released  Medium      Unassigned

Bug Description

Binary package hint: mountall

karmic: mountall 1.0

When mounting NFS shares at boot, the first mount attempt usually fails because rpc.statd or portmap is not yet running, or because the network itself is not up yet.

These errors appear even though the filesystem will be mounted successfully later on, so the user cannot distinguish them from "real" mount failures that need intervention by the admin.

Even worse, when an NFS mount is for a mountpoint that mountall considers essential for boot (like /home), the user is dropped to an emergency shell and the system refuses to boot.

A temporary workaround is to add "nobootwait" to the affected mountpoints, but this causes the system to continue booting even when the mountpoint is still unavailable after the network has come up correctly, i.e. when there is a real problem with the NFS server.
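
For illustration, an fstab entry using this workaround might look like the following (the server name and export path are hypothetical placeholders):

 # hypothetical example; "nfsserver" and the export path are placeholders
 nfsserver:/export/home /home nfs rw,hard,intr,nobootwait 0 0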

Proposed fix:

mountall should not try to mount any network file systems when called the first time. Only when the network is up and mountall receives SIGUSR1 should it try to do so. If that does not work at *that* point, mountall should print an error (and stop the boot for essential mount points) as before.
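
A minimal C sketch of the proposed logic (an illustration only, not mountall's actual code; all names here are invented):

  /* Sketch: defer remote filesystems until SIGUSR1 says the network is up. */
  #include <signal.h>
  #include <stdbool.h>
  #include <string.h>

  static volatile sig_atomic_t network_up = 0;

  /* The network job would send SIGUSR1 once interfaces are configured. */
  static void on_sigusr1(int signum) { (void)signum; network_up = 1; }

  static bool is_remote(const char *fstype)
  {
      return !strcmp(fstype, "nfs") || !strcmp(fstype, "nfs4")
          || !strcmp(fstype, "cifs");
  }

  /* Called on every mount pass: remote filesystems are skipped silently
   * until the network is up; failures after that point are real errors. */
  static bool should_try_mount(const char *fstype)
  {
      return !is_remote(fstype) || network_up;
  }

  int main(void)
  {
      signal(SIGUSR1, on_sigusr1);
      /* ... event loop walking fstab and calling should_try_mount() ... */
      return 0;
  }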

Revision history for this message
Johan Walles (walles) wrote :

For clarification, if this is the same thing I'm having it's preventing my machine from booting.

I have about 10 NFS mounts, and my machine hangs on boot with the above printouts.

To work around this, and boot into X so I can log in, I can enter the recovery shell, do "mount -a", and exit the recovery shell.

This enables me to boot into X. I still cannot access my virtual text consoles after this; I assume that the boot process is still waiting for something else...

Revision history for this message
Alvin (alvind) wrote : Re: [Bug 504224] Re: Inconsistent error message: Filesystem could not be mounted

On Thursday 07 January 2010 14:47:33 Johan Walles wrote:
> For clarification, if this is the same thing I'm having it's preventing
> my machine from booting.

This bug is only about the error messages: they are the same for spurious and
real failures, which makes it difficult to know what is going on. (However, I
do have a lot of machines that suffer from the same problem as yours and can't
boot.)

> I have about 10 NFS mounts, and my machine hangs on boot with the above
> printouts.
>
> To work around this, and boot into X so I can log in, I can enter the
> recovery shell, do "mount -a", and exit the recovery shell.

The same is the case on my servers. I think the problem you describe is caused
by bug #470776. The situation is not very consistent. Some machines simply
can't boot and others boot, but don't mount NFS. Still others (like the
example) do boot and mount the shares, but tell me it didn't work. (That's
what this bug is about.) There is also bug #431248 with some discussion, but
that one is considered fixed.

> This enables me to boot into X. I still cannot access my virtual text
> consoles after this; I assume that the boot process is still waiting for
> something else...

That might be something else, like a problem with the video driver.

Revision history for this message
Johan Walles (walles) wrote : Re: Inconsistent error message: Filesystem could not be mounted

I'm probably suffering from bug 470776, thanks for the reference! It has been fixed for Lucid, but I filed bug 504271 about getting it fixed for Karmic as well. Unattended boots would be nice.

My text consoles work "fine"; they just don't have any login prompts. So it's probably not video-driver related. Since the whole boot process has more or less broken down for me with Karmic, I tend to blame it for everything...

OT: I have also revived bug 466693 about the really slow boot process in Karmic.

Alvin (alvind)
tags: added: ubuntu-boot-experience
tags: added: boot-experience
removed: ubuntu-boot-experience
tags: added: ubuntu-boot-experience
removed: boot-experience
Revision history for this message
Nikolaus Rath (nikratio) wrote :

Note that you can get around the recovery shell by using the "nobootwait" option in /etc/fstab for /home.

Changed in mountall (Ubuntu):
status: New → Confirmed
Nikolaus Rath (nikratio)
description: updated
summary: - Inconsistent error message: Filesystem could not be mounted
+ NFS mounts at boot time prevent boot or print spurious errors
Revision history for this message
Alvin (alvind) wrote :

The undocumented 'nobootwait' option doesn't change anything here. It should be _netdev (see mount(8)), because I'd like to wait until the network is up.
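
For reference, an entry using _netdev would look something like this (server name and paths are hypothetical):

 # hypothetical example; _netdev marks the filesystem as needing the network
 nfsserver:/export/data /data nfs rw,hard,intr,_netdev 0 0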

Revision history for this message
Paul Elliott (omahn) wrote :

I also have this issue on a test box upgraded from 8.04 LTS to 10.04 LTS alpha. Boot hangs completely, no X (not installed on our servers) and no login shell. VT1 shows the following:

mount.nfs: rpc.statd is not running but is required for remote locking.
mount.nfs: Either use '-o nolock' to keep locks local, or start statd.

Repeated several times followed by the following for each NFS entry (we have many) in our fstab:

mountall: mount /usr/systems [849] terminated with status 32
mountall: Filesystem could not be mounted: /usr/systems

Booting with the rescue option from the boot menu makes no difference. If I boot from a live CD and disable the NFS mounts in /etc/fstab then the server boots successfully.

Timo Aaltonen (tjaalton)
Changed in mountall (Ubuntu):
importance: Undecided → Medium
milestone: none → ubuntu-10.04
Changed in mountall (Ubuntu Lucid):
status: Confirmed → Fix Committed
Revision history for this message
Launchpad Janitor (janitor) wrote :

This bug was fixed in the package mountall - 2.10

---------------
mountall (2.10) lucid; urgency=low

  * Rework the Plymouth connection logic; one needs to attach the client to
    the event loop *after* connection otherwise you don't get disconnection
    notification, and one needs to actually actively disconnect in the
    disconnection handler.
  * For safety and sanity reasons it becomes much simpler to create the
    ply_boot_client when we connect, and free it on disconnection. Thus the
    presence or not of this struct tells us whether we're connected or not.
    LP: #524708.
  * Flush the plymouth connection before closing it and exiting, otherwise
    updates may be pending and the screen have messages that confuse people
    while X is starting (like fsck at 90%). LP: #487744.

  * Replace the modal plymouth prompt for error conditions with code that
    continues working in the background while prompting. This most benefits
    the old "Waiting for" message, which can now allow you to continue to
    wait and it can solve itself. LP: #527666, #545435.
  * Integrate fsck progress updates into the same mechanism.
  * Allow fsck messages to be translated. LP: #390740.
  * Change fsck message to be a little less alarming. LP: #545267.
  * Add hard dependency on Plymouth; without it running, mountall will
    ignore any filesystem which doesn't show up within a few seconds or that
    fails to fsck or mount. If you don't want graphical splash, you simply
    need not install themes.

  * Improve set of messages seen with --verbose, and ensure all visible
    messages are marked for translation. LP: #446592.
  * Reduce priority of failed to mount error for remote filesystems since
    we try again, and this just spams the console. LP: #504224.

  * Keep hold of the dev_t when parsing /proc/self/mountinfo, then after
    mounting /dev (or seeing that it's mounted) create a quick udev rules
    file that adds the /dev/root symlink to this device. LP: #527216.
  * Do not try and update /etc/mtab when it's a symbolic link. LP: #529993.
  * Remove odd -a option from mount calls, probably a C&P error from the
    fsck code long ago. LP: #537135.
  * Wait for Upstart to acknowledge receipt of events, even if we don't
    hang around for them to be handled.
  * Always run through try_mounts() at least once. LP: #537136.
  * Don't keep mountall running if the only remaining unmounted filesystems
 -- Scott James Remnant <email address hidden> Wed, 31 Mar 2010 19:37:31 +0100

Changed in mountall (Ubuntu Lucid):
status: Fix Committed → Fix Released
Revision history for this message
Paul McEnery (pmcenery) wrote :

I've built and installed version 2.10 and have been through a couple of reboot cycles and everything on the NFS front appears to be working correctly now.

On the first boot however, I noticed that the boot stalled while it was supposedly doing an fsck. To my surprise, I was able to ssh to the system while it was in this state, and found that all filesystems including nfs ones were in fact mounted. I ran a reboot, and it has since booted up correctly. I've attached a picture of where it got stuck.

I think there may be an issue with how routine disk checks are handled.

Regards,
Paul.

Revision history for this message
Brett Gardner (brett-gardner) wrote :

I am seeing this bug in Lucid as well.

Revision history for this message
Ethan Baldridge (ethan-superiordocumentservices) wrote :

Yes, it's still happening in Lucid; also with CIFS volumes mounted in /etc/fstab. All pertinent mounts use the _netdev option, which should theoretically signal mountall not to do these until the network is available, but...

Most annoying is that about every other boot or so (probably doesn't happen every time due to timing issues in the parallel startup) it can't mount my CIFS volumes and gives me a recovery console.

My guess is that mountall isn't honoring _netdev.

(also mount.cifs throws an annoying warning message about not understanding _netdev, but that's just a papercut)

Revision history for this message
Steve Langasek (vorlon) wrote :

_netdev is irrelevant on cifs and nfs shares. And mountall understands _netdev.

What, in your own words, is the problem you're seeing? I.e., don't say "it's still happening" - that doesn't tell us what you think "it" is, since in all our tests, there are no problems with CIFS/NFS mounts at boot time due to this bug.

Revision history for this message
paolo (paolo-faverio) wrote :

Hi all,
I'm trying to get autofs running.
It seems to suffer from the same "rpc.statd is not running but is required for remote locking" problem.
Forcing statd to run manually solves the issue.
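
(For reference, under Upstart that manual start would be something like

 sudo start statd

assuming statd is managed by an /etc/init/statd.conf job, as the "statd pre-start process" messages elsewhere in this report suggest.)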

The same autofs configuration was working fine in 9.10.

Revision history for this message
Chris (bridgeriver) wrote :

I also have these problems with nfs-mounted /home on Lucid. Sometimes GDM starts and lets me log in before /home is mounted, which leads to what amounts to a crash.

I made a partial workaround by hacking /etc/init/gdm.conf to make GDM wait on /home being mounted. This is an improvement but not a fix. Sometimes I'm left looking at a text screen for a minute or two before /home mounts and GDM starts; other times the wait is long and I have to log in, gain root, manually issue 'mount /home', and manually restart GDM.
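
The kind of stanza involved might look like this (a sketch of the described hack, not the stock gdm.conf, whose start condition is more elaborate):

 # sketch: hold gdm until mountall emits the mounted event for /home
 start on (filesystem and mounted MOUNTPOINT=/home)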

All this works perfectly in Karmic; the problem is new to Lucid.

So can I just copy over Karmic's /etc/init directory to the Lucid install and have a working system?

Revision history for this message
Steve Langasek (vorlon) wrote : Re: [Bug 504224] Re: NFS mounts at boot time prevent boot or print spurious errors

On Sat, Jun 12, 2010 at 07:32:23PM -0000, Chris wrote:
> I also have these problems with nfs-mounted /home on Lucid. Sometimes
> GDM starts and lets me log in before /home is mounted, which leads to
> what amounts to a crash.

That is unrelated to this bug, which has been fixed. If gdm is starting
before your /home is mounted, either you have a modified /etc/init/gdm.conf,
or you have something wrong in your /etc/fstab.

--
Steve Langasek                   Give me a lever long enough and a Free OS
Debian Developer                 to set it on, and I can move the world.
Ubuntu Developer                 http://www.debian.org/
<email address hidden> <email address hidden>

Revision history for this message
Chris (bridgeriver) wrote :

@Steve Langasek:

My gdm.conf was not hacked until after the problem occurred, so that's presumably not it.

The relevant line in /etc/fstab is:

192.168.0.1:/home /home nfs rw,auto,intr,hard,bg,exec 0 2

This works fine under Karmic, but I don't have a huge amount of experience with NFS and it's possible there is some subtle error.

Revision history for this message
Chris (bridgeriver) wrote :

@Steve Langasek:

Just to update: I tried changing the fstab lines to defaults:

192.168.0.1:/home /home nfs defaults 0 1

reverting the gdm.conf to remove the check for /home being mounted, and rebooting. This time the machine came up fine. Not sure if it'll do that reliably, but it looks promising.

Thanks!

Revision history for this message
graemev (graeme-launchpad) wrote :

I'm a little concerned about the phrase "Only when the network is up"

I'm experiencing these hangs on my laptop (Acer Aspire One) when using wireless, but not when wired. I assume this is because the network is 'up' but the wireless does not come up unless and until I start a GUI? (Actually, I've not tracked down where the WiFi is started, but it seems to be user-specific; that's why I'm assuming a user needs to be logged on.)

Revision history for this message
ubuntuforum-bisi (ubuntuforum-bisi) wrote :

This appears to be happening in Lucid 10.04.1.

 cat boot.log
 Begin: Loading essential drivers... ...
 Done.
 Begin: Running /scripts/init-premount ...
 Done.
 Begin: Mounting root file system... ...
 Begin: Running /scripts/local-top ...
 Done.
 Begin: Running /scripts/local-premount ...
 Done.
 Begin: Running /scripts/local-bottom ...
 Done.
 Done.
 Begin: Running /scripts/init-bottom ...
 Done.
 fsck from util-linux-ng 2.17.2
 fsck from util-linux-ng 2.17.2
 init: statd pre-start process (800) terminated with status 1
 /dev/sda1: clean, 244437/14647296 files, 14616186/58564949 blocks
 mount.nfs: DNS resolution failed for 192.168.xxx.3: Name or service not known
 mount.nfs: DNS resolution failed for 192.168.xxx.8: Name or service not known
 mountall: mount /1backup [866] terminated with status 32
 mountall: mount /0data/elf [864] terminated with status 32
 init: ureadahead-other main process (872) terminated with status 4
 mount.nfs: DNS resolution failed for 192.168.xxx.8: Name or service not known
 mountall: mount /0data/elf [890] terminated with status 32
 mount.nfs: DNS resolution failed for 192.168.xxx.3: Name or service not known
 mountall: mount /1backup [893] terminated with status 32
  * Starting AppArmor profiles
 Skipping profile in /etc/apparmor.d/disable: usr.bin.firefox

Running "mount -a" causes the NFS mounts to occur as desired/expected.

Relevant contents of /etc/fstab:
 # nfs mount of elf's data
 192.168.xxx.8:/0data /0data/elf nfs nfsvers=3,rw 0 0
 # nfs mount of public folder on qnap
 192.168.xxx.3:/Public /1backup nfs nfsvers=3,rw 0 0

Revision history for this message
astrostl (astrostl) wrote :

I have also seen this on recent 10.04 LTS *VMs* (apparently tried to mount NFS prior to portmap running).

Revision history for this message
Sebastiaan Breedveld (s-breedveld) wrote :

Can confirm this, 10.04 LTS suffers from this bug.

Revision history for this message
Damiön la Bagh (kat-amsterdam) wrote :

Proof of this bug in a screenshot. I have had my Ubuntu customer in a panic twice now due to this issue. It happens when the machine has been turned off for the night. The next day he starts the machine (the NAS runs 24/7 but goes into sleep mode), but the NAS is still sleeping, so it doesn't react right away to the mount request. The system hangs with the message in the screenshot.

Revision history for this message
Steve Langasek (vorlon) wrote :

On Sat, Jul 30, 2011 at 03:21:33PM -0000, Kat Amsterdam wrote:
> Proof of this bug in a screenshot. I have had my ubuntu customer in a
> panic now twice due to this matter. It happens when the machine has been
> turned off for the night. The next day he starts the machine (the nas
> runs 24/7 but goes into sleep mode), but the nas is still sleeping, so
> it doesn't react right away to the request to mount. The system hangs
> with the message in the screenshot.

This is not the same bug as originally described, despite having
superficially similar symptoms. Please file a separate bug report for the
problem you're experiencing.

For me, the obvious first question is: why is plymouth exiting before
mountall has finished? Nothing in the /etc/init/plymouth-stop.conf job
should stop plymouth until the filesystem mounting is finished. So even if
the mount triggers too early the first time around, plymouth should still be
waiting for mountall-net to trigger and retry the network mount (and you
should never see the error message in normal operation).
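
(The mountall-net job referred to here is, in rough sketch form, an Upstart task along these lines; this is an illustration of the mechanism, not the shipped file:

 # sketch: nudge mountall to retry network mounts when an interface comes up
 start on net-device-up
 task
 exec kill -USR1 $(pidof mountall) || true
)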

The second question is why dbus is being killed with SIGTERM.

Is this screenshot taken after pressing Ctrl+Alt+Delete on the console?

--
Steve Langasek                   Give me a lever long enough and a Free OS
Debian Developer                 to set it on, and I can move the world.
Ubuntu Developer                 http://www.debian.org/
<email address hidden> <email address hidden>

Changed in mountall (Ubuntu):
assignee: nobody → susmita ghosh (surja-bi-das)
Steve Langasek (vorlon)
Changed in mountall (Ubuntu):
assignee: susmita ghosh (surja-bi-das) → nobody