With Lucid Lynx after upgrade or install mounts mounted with automounter daemon don't mount

Bug #571972 reported by James Sparenberg
This bug affects 8 people
Affects: am-utils (Ubuntu)
Status: Confirmed
Importance: Undecided
Assigned to: Unassigned

Bug Description

Binary package hint: am-utils

1. This occurs both in upgrades from 9.10 and Fresh Installs.

2. Recreation method: Start am-utils

3. Significant error messages (similar to):

  Apr 29 17:42:02 temp6 amd[4285]/error: '/net': mount: No locks available

or

  Apr 29 17:42:03 temp6 amd[4283]/fatal: amfs_toplvl_mount: amfs_mount failed: No locks available

This occurs with configuration files that worked fine before the upgrade/install. There is, however, a workaround that has proven 100% effective, at least for me: add this line to the [global] section of /etc/am-utils/amd.conf:

  mount_type = autofs

It seems that this is actually an upstream kernel problem with kernels above 2.6.24; however, I felt it would be best to report it here, both for tracking and to find a way to alert people to how to manage this problem. Additionally, I've tested putting the line into the amd.conf of 9.10 systems, and it has had no negative effect on performance.
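
For reference, a minimal sketch of where the workaround line goes, assuming an otherwise default /etc/am-utils/amd.conf. Every other [global] setting and all map sections are omitted here and should be left as they are on the system.

```
[global]
# Workaround from this report: use the kernel autofs interface
# rather than amd's traditional NFS-style mount points.
mount_type = autofs
```

amd has to be restarted (for example with /etc/init.d/am-utils restart) before the setting takes effect.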

Alex Sorokine (sorokine) wrote :

Installing am-utils on a new Lucid install causes the following error message after starting am-utils:

root@ole:~# /etc/init.d/am-utils restart
Requesting amd unmount filesystems: Usage: amq [-fmpsvwHTU] [-h hostname] [-l log_file|"syslog"]
 [-x log_options] [-D debug_options]
 [-P program_number] [[-u] directory ...]
 done.
Stopping automounter: amd . done.
Starting automounter: amd.
root@ole:~# dmesg | tail -4
[ 2847.128634] svc: failed to register lockdv1 RPC service (errno 97).
[ 2847.128677] Invalid hostname "pid3207@ole:/net" in NFS lock request
[ 2848.128607] svc: failed to register lockdv1 RPC service (errno 97).
[ 2848.128649] Invalid hostname "pid3207@ole:/net" in NFS lock request

Mounting NFS volumes through am-utils does not work, but mounting them manually does.

Phil Kaslo (phil) wrote :

I found that the line in the amd.conf file
   mount_type = autofs

 made the problem go away, for me, in ubuntu 9.10, but does not in 10.04. In 10.04,
 I now get the dmesg output

  [ 4027.926581] Invalid hostname "pid8529@lectura-ub:/net" in NFS lock request

as above, with or without 'mount_type = autofs'.

The bug report discussion at
    http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=479884

indicates that it is believed that this is fixed in the am-utils source, 6.1.5.10, 11 Jun 2008.

If that is the case, I don't understand why we are still seeing the problem in Ubuntu 10.04.
Did those changes get into the ubuntu version of am-utils?

Phil Kaslo (phil) wrote :

I found that the following change to the file libamu/mount_fs.c makes it work, for me:

# diff -c libamu/mount_fs.c.save libamu/mount_fs.c
*** libamu/mount_fs.c.save 2006-05-11 10:25:47.000000000 -0700
--- libamu/mount_fs.c 2010-05-18 17:41:13.000000000 -0700
***************
*** 528,534 ****
     * struct nfs_args, or truncate our concocted "hostname:/path"
     * string prematurely.
     */
! NFS_HN_DREF(nap->hostname, host_name);
  #ifdef MNT2_NFS_OPT_HOSTNAME
    nap->flags |= MNT2_NFS_OPT_HOSTNAME;
  #endif /* MNT2_NFS_OPT_HOSTNAME */
--- 528,534 ----
     * struct nfs_args, or truncate our concocted "hostname:/path"
     * string prematurely.
     */
! NFS_HN_DREF(nap->hostname, "localhost");
  #ifdef MNT2_NFS_OPT_HOSTNAME
    nap->flags |= MNT2_NFS_OPT_HOSTNAME;
  #endif /* MNT2_NFS_OPT_HOSTNAME */

--------

That is, pass "localhost" instead of host_name, so that nap->hostname is set to "localhost".

Phil

Gabriele Greco (gabrielegreco) wrote :

I've tried to apply the patch, with partial success: the patched version is able to mount the NFS filesystems, BUT it's not able to go past listing the files. I still get this error:

[85349.668258] svc: failed to register lockdv1 RPC service (errno 97).

Then when I try to access the files I get:

ls -lrt /net/wyoming/tmp/
ls: cannot access /net/wyoming/tmp/testsuite06052010.tar: Input/output error
ls: cannot access /net/wyoming/tmp/RIEPILOGO PROGETTI.doc: Input/output error
ls: cannot access /net/wyoming/tmp/MCT2_2.15.3.run: Input/output error

I also tried the patch from the Debian bug tracker, without any success (same behaviour as the unpatched version).

Tim Cutts (timc) wrote :

am-utils is getting harder and harder to keep working. The upstream
project is dead, to all intents and purposes. There hasn't been any
release from them for at least two years, possibly more. The code has
not kept up with changes in the Linux kernel lately. It got so
painful for us at work that we switched to autofs instead. Not as
feature-rich, but more active as a project.

I'll have one more go at contacting upstream. If no timeline appears
for further development, I will probably orphan the debian packaging.

Tim


Phil Kaslo (phil) wrote :

We had been using amd instead of autofs on a large server, with possibly 100 - 200 concurrent login sessions.
With autofs, each home directory for a logged in user would be a separate mount point. We found that things
started to get ugly, when the number of mount points grew to a number somewhat less than 256, maybe 200
to 220. Using instead static mounts from the file servers, and then using amd to manage symlinks, via map
entries for /home of the form
     username fs:=/directory/path;type:=link
keeps the number of nfs mounts down to the number of static mounts.
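
As an illustration of the map form above, a sketch of what such a /home map might look like. The user names and the /export/... paths are hypothetical; the only assumption is that each target directory lives on a filesystem that is already statically mounted.

```
# Hypothetical amd map for /home: every entry is a symlink into a
# filesystem that is already statically mounted, so looking up a
# user's home directory never creates a new NFS mount.
alice   fs:=/export/fs1/home/alice;type:=link
bob     fs:=/export/fs1/home/bob;type:=link
carol   fs:=/export/fs2/home/carol;type:=link
```

With entries like these, the number of NFS mounts depends on the number of file servers, not on how many users are logged in.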

We have been using autofs on all our other linux boxes (Ubuntu and Fedora). But due to this issue
we had been using amd on our main instructional server, now running Ubuntu.

Have other people had success with using autofs for /home, with large numbers of concurrent logins?
With more than say 256 nfs mount points?

Phil

Jere Frost (jere-frost) wrote :

FYI.....

We are also seeing a problem with AMD getting the error "fatal: amfs_toplvl_mount: amfs_mount failed: No locks available" when using version 6.1.5-12ubuntu2 of packages am-utils and libamu4 with Ubuntu 10.04. We are trying to transition all of our RHEL and SuSE systems (~500) to Ubuntu Lucid (10.04) and would like to keep using AMD instead of having to convert to autofs.

Tim Cutts (timc) wrote :

On 24 May 2010, at 4:08 am, Phil Kaslo wrote:

>
> We had been using amd instead of autofs on a large server, with possibly 100 - 200 concurrent login sessions.
> With autofs, each home directory for a logged in user would be a separate mount point. We found that things
> started to get ugly, when the number of mount points grew to a number somewhat less than 256, maybe 200
> to 220. Using instead static mounts from the file servers, and then using amd to manage symlinks, via map
> entries for /home of the form
> username fs:=/directory/path;type:=link
> keeps the number of nfs mounts down to the number of static mounts.
>
> We have been using autofs on all our other linux boxes (Ubuntu and Fedora). But due to this issue
> we had been using amd on our main instructional server, now running Ubuntu.
>
> Have other people had success with using autofs for /home, with large numbers of concurrent logins?
> With more than say 256 nfs mount points?

We avoided the issue by giving users home directories of the form, say, /home/t/tjrc, with the automount at the t level, so there are only 26 mount points regardless of the number of logins.

We saw a similar thing to you with our automounted /software directory, and used a similar solution to yours: we statically mounted the fileserver elsewhere, and used autofs to manage symlinks to it. This was not due to the number of filesystems mounted, but more because it copes better when the number of simultaneous mount requests gets large (such as when we reboot the entire 500-node compute cluster).

Tim

James Sparenberg (james-linuxrebel) wrote :

I wish to retract my statement that adding the line mount_type = autofs is 100% effective: I've run into two upgraded systems that did not work right, so much so that it eventually borked the install on one. (No problem: thanks to amd I've got disposable installs, however ........ ) ;)

Has anyone tried pinning the am-utils version in 10.04 back to the version from 9.10 or 9.04? I'm wondering if that will give a clue as to what changed. I'm planning on doing some experimentation in this direction; I'll report what I find.

James Sparenberg (james-linuxrebel) wrote :

Found this bug at Mandriva https://qa.mandriva.com/show_bug.cgi?id=45007

It seems to be a bug affecting kernels after Linux 2.6.25.

Tim Cutts (timc) wrote :

On 25 May 2010, at 12:49 am, James Sparenberg wrote:

> I wish to retract my statement about 100% effective on adding the line
> mount_type = autofs as I've run into two upgrade systems that did not
> work right, so much so that it eventually borked the install on one.
> (No problem thanks to amd I've got disposable installs however ........
> ) ;)
>
> Has anyone tried pinning the am-utils version in 10.04 back to the
> version from 9.10 or 9.04? I'm wondering if that will give a clue as to
> what changed. I'm planning on doing some experimentation in this
> direction I'll report what I find.

I suspect you'll find it's not that am-utils has changed (the version of the package will probably be the same, at least at the upstream version, 6.1.5, which came out in 2006 or so) but that the Linux kernel has changed, and am-utils has failed to keep up with it. I've been talking on the am-utils mailing list over the past 24 hours, and there is still one maintainer of the upstream package. There hasn't been a release for almost five years now, but there have been some changes going on in a pre-release capacity, and I'm going to attempt building a package from the current git sources at some point in the next couple of weeks, if I find the time. Keep an eye out for an upload to Debian sid (I don't target Ubuntu directly)

Tim

Tim Cutts (timc) wrote :

Note that switching to mount_type = autofs is not a transparent change to your users, because it causes the mounts to happen in place rather than as symlinks to the real mount location elsewhere, as it does in the traditional mount_type = nfs. This means that the output of some routines which determine the current working directory will change.

mount_type=autofs actually is rather more sensible at this than the old system, but nevertheless there will be a change in behaviour which might affect your users or applications.
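
The change in reported working directory can be illustrated without amd at all. The sketch below is only a stand-in under stated assumptions: a temporary directory plays the real mount location, and a symlink to it plays the path that amd hands to users under mount_type = nfs. The function name demo_symlink_cwd is hypothetical.

```shell
#!/bin/sh
# Stand-in for amd's symlink style: "link" plays the user-visible path,
# "real" plays the directory where the filesystem is actually mounted.
demo_symlink_cwd() {
    real=$(mktemp -d) || return 1
    link="$real.link"
    ln -s "$real" "$link" || return 1
    (
        cd "$link" || exit 1
        echo "logical:  $(pwd -L)"   # path as the user entered it
        echo "physical: $(pwd -P)"   # path with symlinks resolved
    )
    # Clean up the temporary symlink and directory.
    rm -f "$link"
    rmdir "$real"
}

demo_symlink_cwd
```

Programs that resolve the physical path report the second form. With in-place mounts there is no symlink to resolve, so both forms agree, which is the behaviour change users may notice.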

Tim

robert@smithpierce.net (robert-smithpierce) wrote :

I believe this bug has to do with an am-utils configuration problem on Ubuntu Lucid when looking for linux/nfs_mount.h. Looking at the Fedora lists, it appears that fc13 may have a similar issue. The problem appears to be that configure cannot compile its test program for linux/nfs_mount.h, so that it configures as if the file were missing (verify by grepping configure output for nfs_mount). Oddly, this does not break the build, but silently produces a broken amd.
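
The verification step mentioned above could be scripted roughly as follows. This is a sketch under assumptions: the function name check_nfs_mount and the log file name are hypothetical, and the "checking ... yes/no" line format is the usual autoconf style rather than something quoted in this report.

```shell
#!/bin/sh
# Report what a saved configure log says about linux/nfs_mount.h.
# Returns 0 if no nfs_mount check ended in "no", 1 if one did,
# and 2 if the log does not mention nfs_mount at all.
check_nfs_mount() {
    log=$1
    # Print every line that mentions the header.
    if ! grep 'nfs_mount' "$log"; then
        echo "no nfs_mount lines found in $log"
        return 2
    fi
    # Assumed autoconf-style result suffix: "... no" at end of line.
    if grep 'nfs_mount' "$log" | grep -q '\.\.\. no$'; then
        echo "WARNING: configure treated linux/nfs_mount.h as missing"
        return 1
    fi
    return 0
}

# Typical use, from the unpacked source tree:
#   ./configure 2>&1 | tee configure.log
#   check_nfs_mount configure.log
```

A warning here would match the symptom described: the build still succeeds, but amd is compiled without the definitions from that header.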

The fix that has (so far) worked for me is to add #include<bits/sockaddr.h> to the configure test. This seems to result in a clean configure and build. There is no need to use the autofs hack, and this resolves both the "missing lock" and the "Input/Output Error" problems that have been reported.

The following patch does the trick. It should probably put a proper test around the new include statement. Note that this is applied to the standard am-utils-6.1.5 release sources after application of the http://archive.ubuntu.com/ubuntu/pool/universe/a/am-utils/am-utils_6.1.5-12ubuntu2.diff.gz patch and three of the four patches created in debian/patches. (The mk-amd-map_tmpfile.patch appears to have been applied in the primary ubuntu patch.)

_____________

diff -Naur am-utils-6.1.5.orig/configure am-utils-6.1.5/configure
--- am-utils-6.1.5.orig/configure 2010-06-30 19:49:47.000000000 +0000
+++ am-utils-6.1.5/configure 2010-06-30 19:52:10.000000000 +0000
@@ -25700,6 +25700,7 @@
 # ifndef __KERNEL__
 # define __KERNEL__
 # endif /* __KERNEL__ */
+#include<bits/sockaddr.h>
 #ifdef HAVE_LINUX_SOCKET_H
 # include <linux/socket.h>
 #endif /* HAVE_LINUX_SOCKET_H */
diff -Naur am-utils-6.1.5.orig/configure.in am-utils-6.1.5/configure.in
--- am-utils-6.1.5.orig/configure.in 2010-06-30 19:49:47.000000000 +0000
+++ am-utils-6.1.5/configure.in 2010-06-30 19:52:20.000000000 +0000
@@ -620,6 +620,7 @@
 # ifndef __KERNEL__
 # define __KERNEL__
 # endif /* __KERNEL__ */
+#include<bits/sockaddr.h>
 #ifdef HAVE_LINUX_SOCKET_H
 # include <linux/socket.h>
 #endif /* HAVE_LINUX_SOCKET_H */

______________

James Sparenberg (james-linuxrebel) wrote :

I've attempted to build with this patch applied; what I get is below. This seems to be related to Debian bug #427260, which also concerns am-utils.

configure: *** INITIALIZATION ***
configure: *** SYSTEM TYPES ***
checking build system type... x86_64-unknown-linux-gnu
checking host system type... x86_64-unknown-linux-gnu
checking host cpu... x86_64
checking vendor... unknown
checking host full OS name and version... linux
checking host OS name... linux
checking host OS version... 2.6.32-25-generic
checking host OS architecture... x86_64
checking OS system distribution... deb
checking host name... james-pc
checking user name... james
checking configuration date... Fri Oct 15 12:30:19 PDT 2010
configure: *** PACKAGE NAME AND VERSION ***
checking package name... "am-utils"
checking version of package... "6.1.5"
checking bug-reporting address... "https://bugzilla.am-utils.org/ or <email address hidden>"
checking for a BSD-compatible install... /usr/bin/install -c
checking whether build environment is sane... yes
checking for gawk... gawk
checking whether make sets $(MAKE)... yes
configure: *** PARTICULAR PROGRAMS (part 1) ***
checking for gcc... gcc
checking for C compiler default output file name... a.out
checking whether the C compiler works... yes
checking whether we are cross compiling... no
checking for suffix of executables...
checking for suffix of object files... o
checking whether we are using the GNU C compiler... yes
checking whether gcc accepts -g... yes
checking for gcc option to accept ANSI C... none needed
checking for style of include used by make... GNU
checking dependency style of gcc... gcc3
checking how to run the C preprocessor... gcc -E
checking whether gcc and cc understand -c and -o together... yes
checking for egrep... grep -E
checking whether gcc needs -traditional... no
checking whether make sets $(MAKE)... (cached) yes
checking for library containing strerror... none required
checking for AIX... no
configure: *** OPTION PROCESSING ***
checking if ldap is wanted... yes, will enable if all libraries are found
checking if hesiod is wanted... yes, will enable if all libraries are found
checking if ndbm is wanted... yes, will enable if all libraries are found
checking for debugging options... no
checking for configuration/compilation (-I) preprocessor flags... none
checking for configuration/compilation (-l) library flags... none
checking for configuration/compilation (-L) library flags... none
checking for additional C option compilation flags... none
checking a local configuration file... yes
checking whether to enable maintainer-specific portions of Makefiles... no
configure: *** LIBTOOL ***
checking for a sed that does not truncate output... /bin/sed
checking for ld used by gcc... /usr/bin/ld
checking if the linker (/usr/bin/ld) is GNU ld... yes
checking for /usr/bin/ld option to reload object files... -r
checking for BSD-compatible nm... /usr/bin/nm -B
checking whether ln -s works... yes
checking how to recognise dependent libraries... pass_all
checking for ANSI C header files... yes
checking for sys/types.h... yes
checking for sys/stat.h... yes
checking for stdlib.h... yes
checking for string.h... yes
checking for memory...


James Sparenberg (james-linuxrebel) wrote :

Looking into it further, it seems that UTS_RELEASE is now kept in a file called utsrelease.h, not in version.h. So I copied this file (sudo cp /usr/src/linux-headers-2.6.25-2-386/include/linux/utsrelease.h) into /usr/include/linux and made the following modification to configure.

*** configure.orig 2010-10-15 13:04:48.000000000 -0700
--- configure 2010-10-15 13:05:43.000000000 -0700
***************
*** 20953,20959 ****

  #include <stdio.h>
! #include <linux/version.h>

  main(argc)
  int argc;
--- 20953,20959 ----

  #include <stdio.h>
! #include <linux/utsrelease.h>

  main(argc)
  int argc;

I can now configure, and make starts; however, I run into this later on:

conf_parse.o: In function `fprintf':
/usr/include/bits/stdio2.h:98: undefined reference to `ayylineno'
conf_parse.o: In function `yyparse':
/home/james/am-utils/am-utils-6.1.5/amd/conf_parse.c:1294: undefined reference to `yylex'
get_args.o: In function `get_args':
/home/james/am-utils/am-utils-6.1.5/amd/get_args.c:334: undefined reference to `yyin'
collect2: ld returned 1 exit status
make[2]: *** [amd] Error 1
make[2]: Leaving directory `/home/james/am-utils/am-utils-6.1.5/amd'
make[1]: *** [all-recursive] Error 1
make[1]: Leaving directory `/home/james/am-utils/am-utils-6.1.5'
make: *** [all] Error 2

This would indicate that, even though flex is installed, something is missing.

James Sparenberg (james-linuxrebel) wrote :

It seems the guys with am-utils have the bug pegged. #612 https://bugzilla.am-utils.org/show_bug.cgi?id=612

James Sparenberg (james-linuxrebel) wrote :

I've found a workaround: install am-utils and libamu4 from Debian testing. So far, on 10.04 LTS, this has worked without fail for me.

Tim Cutts (timc) wrote :

On 27 Apr 2011, at 21:33, James Sparenberg wrote:

> I've found a work around. Install am-utils and libamu4 from debian
> testing. So far for me on 10.4LTS this has worked without fail for me.

That makes sense. You need the more recent version for it to work properly with recentish kernels (I think the locking problem emerged with 2.6.26)

Regards,

Tim

James Sparenberg (james-linuxrebel) wrote :

Tim,

   First, thanks. Second, will this be rolled into 10.04 LTS?

James Sparenberg (james-linuxrebel) wrote :

Also, if I may ask: since this bug affects me and 7 others, and has a known workaround, how do we get this marked confirmed?

Tim Cutts (timc)
Changed in am-utils (Ubuntu):
status: New → Confirmed

Tim Cutts (timc) wrote :

I've marked it as confirmed, but I'm not an Ubuntu maintainer, only the maintainer of the slightly-upstream Debian package, so I don't know what the procedures are for updating the real distribution. I'm slightly surprised the LTS version (6.1.5-12ubuntu1) isn't working. I think it might be because LTS now uses kernels which are even newer than those that were current in Lenny when I initially patched this back in 6.1.5-10. While the Debian Testing package might well work for you, I suspect that the Ubuntu maintainers might not want to use it, at least in LTS, for two reasons:

1) It's an upstream version change, not just a tweak to 6.1.5
2) The upstream I'm now using is a git checkout; 6.2 still hasn't been formally released, so I'm having to base my packages on the git trunk, which doesn't seem like something that's suitable for a 'Long Term Support' release!

I suppose one possibility is to create an am-utils PPA, and then those people that want to use the bleeding edge Debian version can use that if they want to risk running around with their pants on fire. :-) I'll look into that, although I don't know whether it's something I will do - I am on the verge of orphaning the am-utils package. I don't use it any more at work, and I'm going to become a father some time in the next two weeks, so my time will be limited!

Regards,

Tim

Mark Munro (munro) wrote :

My current fix for this is to install the version from Karmic (the older version works with the current kernel and the newer version breaks?).
So I have added the Karmic repos to sources.list and pinned the am-utils version to 6.1.5-12ubuntu1.
libamu4 needs to be pinned as well (again to 6.1.5-12ubuntu1).
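
A sketch of what such a pin might look like in /etc/apt/preferences, assuming a Karmic "universe" entry has already been added to sources.list. The package names and version are the ones given above; the priority value is an assumption, chosen because a priority above 1000 lets apt downgrade an already installed package.

```
Explanation: Hold am-utils at the Karmic version named in this comment.
Explanation: A priority above 1000 permits a downgrade.
Package: am-utils
Pin: version 6.1.5-12ubuntu1
Pin-Priority: 1001

Explanation: libamu4 has to match the am-utils version.
Package: libamu4
Pin: version 6.1.5-12ubuntu1
Pin-Priority: 1001
```

After an apt-get update, apt-cache policy am-utils libamu4 should show which candidate version apt would install.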

Seems that this worked on my lucid alpha install and was then broken for the LTS release.

Have been living in hope that there would be a 6.1.5-12ubuntu3 that would fix this again.

Cheers
Mark
