HAL stats nfs and autofs mounts, preventing autofs timeouts from working

Bug #55223 reported by Jason McMullan
Affects       Status        Importance  Assigned to  Milestone
hal (Ubuntu)  Fix Released  Low         Martin Pitt

Bug Description

(Cribbed from http://www.redhat.com/archives/fedora-cvs-commits/2006-May/msg00994.html)
Tested and works perfectly.

Update of /cvs/dist/rpms/hal/devel
In directory cvs.devel.redhat.com:/tmp/cvs-serv21556

Modified Files:
 hal.spec
Added Files:
 hal-0.5.7-fix-for-nfs-and-autofs.patch
Log Message:
- Add patch that makes hald not stat nfs and autofs mounts

hal-0.5.7-fix-for-nfs-and-autofs.patch:
 blockdev.c | 15 +++++++++++++++
 1 files changed, 15 insertions(+)

--- NEW FILE hal-0.5.7-fix-for-nfs-and-autofs.patch ---
--- hal-0.5.7/hald/linux2/blockdev.c.fix_for_nfs_and_autofs 2006-05-17 13:50:00.000000000 -0400
+++ hal-0.5.7/hald/linux2/blockdev.c 2006-05-17 14:01:32.000000000 -0400
@@ -205,6 +205,21 @@
  while ((mnte = getmntent_r (f, &mnt, buf, sizeof(buf))) != NULL) {
   struct stat statbuf;

+  /* If this is an nfs or autofs mount
+   * (fstype == 'nfs' || fstype == 'autofs'),
+   * ignore it.  Reasons:
+   * 1. we don't list nfs devices in HAL
+   * 2. more problematic: a stat on a mountpoint with a
+   *    'stale nfs handle' never comes back, which blocks
+   *    all of HAL, so every application that uses HAL
+   *    fails as well
+   * 3. autofs and HAL butt heads, causing drives to never
+   *    be unmounted
+   */
+  if (strcmp(mnt.mnt_type, "nfs") == 0 ||
+      strcmp(mnt.mnt_type, "autofs") == 0)
+   continue;
+
   /* check the underlying device of the mount point */
   if (stat (mnt.mnt_dir, &statbuf) != 0)
    continue;
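
As an aside for readers who want to poke at this outside of hald: the fragment below is a minimal, stand-alone sketch of the same idea, written for this report rather than taken from hal. It walks the kernel mount table with getmntent_r() and skips nfs and autofs entries before stat()ing each mount point, which is the filter the patch adds. The file name skip_mounts.c and the use of /proc/mounts are assumptions made for the example.

/* skip_mounts.c - stand-alone sketch of the mount walk above (not hald code).
 * Build: gcc -Wall -o skip_mounts skip_mounts.c
 */
#define _GNU_SOURCE             /* for getmntent_r() */
#include <mntent.h>
#include <stdio.h>
#include <string.h>
#include <sys/stat.h>

int main (void)
{
	FILE *f;
	struct mntent mnt, *mnte;
	char buf[1024];

	/* /proc/mounts is assumed here; the point is the filter below */
	f = setmntent ("/proc/mounts", "r");
	if (f == NULL) {
		perror ("setmntent");
		return 1;
	}

	while ((mnte = getmntent_r (f, &mnt, buf, sizeof (buf))) != NULL) {
		struct stat statbuf;

		/* same filter as the patch: never stat nfs or autofs mounts,
		 * otherwise a stale nfs handle can hang us and the autofs
		 * expiry timer never fires */
		if (strcmp (mnt.mnt_type, "nfs") == 0 ||
		    strcmp (mnt.mnt_type, "autofs") == 0)
			continue;

		/* stat the mount point of everything else */
		if (stat (mnt.mnt_dir, &statbuf) != 0)
			continue;

		printf ("%s on %s (%s)\n",
			mnt.mnt_fsname, mnt.mnt_dir, mnt.mnt_type);
	}

	endmntent (f);
	return 0;
}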

Revision history for this message
Jan Groenewald (jan-aims) wrote :

I can confirm this on a recently installed edgy.

/home was mounted via nfs/autofs with the async option and didn't have a problem.
/var/mail was mounted via nfs/autofs with the sync option; opening more than one mutt, or waiting for the timeout, or something along those lines made the processes accessing /var/mail (a couple of mutt instances and a cat /var/mail/me) freeze. Even after going to runlevel 1, I still couldn't kill them!

I then edited 00upstream-05-fix-stale_nfs_handle_block.patch (in debian/patches of the hal source), which already handled the nfs part of the patch above, added the autofs part, and rebuilt and installed the new hal.

As far as I can tell it is solved now. I've opened several mutt instances and made different changes to the inbox in each; they are not complaining, and the inbox updates correctly.

Revision history for this message
Jan Groenewald (jan-aims) wrote :

I spoke too soon.

Well, perhaps my problem was unrelated. /var/mail on autofs/nfs (the server is sarge) was running fine on breezy and dapper. Now, after a few minutes, I get this:

Nov 15 13:56:09 dikkop kernel: [17182078.364000] CPU: 0
Nov 15 13:56:09 dikkop kernel: [17182078.364000] EIP: 0060:[kmap_atomic+17/128] Tainted: P VLI
Nov 15 13:56:09 dikkop kernel: [17182078.364000] EFLAGS: 00010202 (2.6.17-10-generic #2)
Nov 15 13:56:09 dikkop kernel: [17182078.364000] EIP is at kmap_atomic+0x11/0x80
Nov 15 13:56:09 dikkop kernel: [17182078.364000] eax: 00000d63 ebx: 00000d63 ecx: e2486000 edx: 00000003
Nov 15 13:56:09 dikkop kernel: [17182078.364000] esi: 00000003 edi: 00000cea ebp: e2f773d0 esp: e2487d7c
Nov 15 13:56:09 dikkop kernel: [17182078.364000] ds: 007b es: 007b ss: 0068
Nov 15 13:56:09 dikkop kernel: [17182078.364000] Process mutt (pid: 5747, threadinfo=e2486000 task=e211d030)
Nov 15 13:56:09 dikkop kernel: [17182078.364000] Stack: 0000018b 00000316 f8f13ca8 00000e75 df8cb100 00001000 e2f773d0 f8f14c42
Nov 15 13:56:09 dikkop kernel: [17182078.364000] 00000004 c036f7ac 00000206 00000001 c17f7a80 00000040 00000001 c036f7a8
Nov 15 13:56:09 dikkop kernel: [17182078.364000] e211d030 000201d2 00000000 00000000 e2f7747c f1d9f640 00000000 e2f77478
Nov 15 13:56:09 dikkop kernel: [17182078.364000] Call Trace:
Nov 15 13:56:09 dikkop kernel: [17182078.364000] <f8f13ca8> nfs_readpage_truncate_uninitialised_page+0xe8/0x130 [nfs] <f8f14c42> nfs_readpage+0x272/0x4f0 [nfs]
Nov 15 13:56:09 dikkop kernel: [17182078.364000] <c014c698> do_generic_mapping_read+0x508/0x590 <c014d118> __generic_file_aio_read+0xf8/0x270
Nov 15 13:56:09 dikkop kernel: [17182078.364000] <c014b8a0> file_read_actor+0x0/0xf0 <f8f0e104> nfs_revalidate_mapping+0x44/0x160 [nfs]
Nov 15 13:56:09 dikkop kernel: [17182078.364000] <c014d2d5> generic_file_aio_read+0x45/0x60 <c016a5a4> do_sync_read+0xc4/0x100
Nov 15 13:56:09 dikkop kernel: [17182078.364000] <c0136180> autoremove_wake_function+0x0/0x50 <c022684b> tty_write+0x1ab/0x1f0
Nov 15 13:56:09 dikkop kernel: [17182078.364000] <c016b05c> vfs_read+0xbc/0x180 <c016a4e0> do_sync_read+0x0/0x100
Nov 15 13:56:09 dikkop kernel: [17182078.364000] <c016b5d1> sys_read+0x41/0x70 <c0102fbb> sysenter_past_esp+0x54/0x79
Nov 15 13:56:09 dikkop kernel: [17182078.364000] Code: e0 05 03 05 00 cc 47 c0 c3 0f 0b 98 00 6b 30 2f c0 eb d4 8d b4 26 00 00 00 00 56 b9 00 e0 ff ff 53 89 c3 21 e1 83 41 14 01 89 d6 <8b> 00 c1 e8 1e 8b 14 85 84 90 3e c0 8b 82 0c 06 00 00 05 80 13
Nov 15 13:56:09 dikkop kernel: [17182078.364000] EIP: [kmap_atomic+17/128] kmap_atomic+0x11/0x80 SS:ESP 0068:e2487d7c
Nov 15 13:56:09 dikkop kernel: [17182078.364000] <6>note: mutt[5747] exited with preempt_count 1

Whatever that means :-P

Revision history for this message
Jason McMullan (jason-mcmullan) wrote :

I do believe your test case is not testing what the patch solves.

The problem I was trying to solve was that, after autofs mounted a location, hald would stat that mount point at regular intervals, which prevented the autofs auto-unmount timeout from ever expiring and the mount from being unmounted.

A proper test would be:

$ df | grep '/net/some.nfs.server/foo'
(expected: nothing)
$ ls -l /net/some.nfs.server/foo
$ df | grep '/net/some.nfs.server/foo'
(expected: mount info for /net/some.nfs.server/foo)
$ sleep 120   # or whatever your autofs timeout is
$ df | grep '/net/some.nfs.server/foo'
(expected: nothing)
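
If it helps to script that test, the small program below mimics the polling behaviour described in this report: it stat()s a path at a fixed interval so you can watch, with df in another terminal, whether the autofs mount ever expires. It is only an illustration; the path /net/some.nfs.server/foo and the 30-second interval are placeholders, not required values.

/* keep_stat.c - stat a path periodically, mimicking the polling behaviour
 * described in this report.  Illustration only.
 * Build: gcc -Wall -o keep_stat keep_stat.c
 * Run:   ./keep_stat /net/some.nfs.server/foo 30
 */
#include <stdio.h>
#include <stdlib.h>
#include <sys/stat.h>
#include <unistd.h>

int main (int argc, char *argv[])
{
	struct stat st;
	unsigned int interval;

	if (argc != 3) {
		fprintf (stderr, "usage: %s <path> <interval-seconds>\n", argv[0]);
		return 1;
	}
	interval = (unsigned int) atoi (argv[2]);

	/* each stat() touches the mount point; compare 'df | grep <path>'
	 * in another terminal against the expectations listed above */
	for (;;) {
		if (stat (argv[1], &st) != 0)
			perror ("stat");
		else
			printf ("stat ok: %s\n", argv[1]);
		sleep (interval);
	}
}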

Revision history for this message
Martin Pitt (pitti) wrote :

Easy fix.

Changed in hal:
assignee: nobody → pitti
importance: Undecided → Low
status: Unconfirmed → In Progress
Revision history for this message
Martin Pitt (pitti) wrote :

Fixed upstream in 0.5.8.1, thus fixed in Feisty.

Changed in hal:
status: In Progress → Fix Released