Comment 30 for bug 974664

Bert (bert-redhat-bugs) wrote:

Description of problem:

Kernel traceback on NFS4 locking attempt

Version-Release number of selected component (if applicable):

kernel 3.3.1-2 (and 3.3.1-3)

How reproducible:

Always

Steps to Reproduce:
1. Install kernel 3.3.1-{2,3}; mount a user's home directory over NFSv4
2. Start e.g. pidgin, which tries to lock some file

Actual results:

*) Pidgin is terminated and the shell prints 'Killed'. strace shows that
   its last activity is:

[...]
open("/users/visics/deknuydt/.config/enchant/en_US.dic", O_RDONLY) = 11
flock(11, LOCK_EX <unfinished ...>
+++ killed by SIGKILL +++
Killed

*) In dmesg, you see:

[ 322.987247] BUG: unable to handle kernel paging request at ffffffffffffffb8
[ 322.987330] IP: [<ffffffffa0e140e9>] nfs_have_delegation+0x9/0x40 [nfs]
[ 322.987418] PGD 1c07067 PUD 1c08067 PMD 0
[ 322.987495] Oops: 0000 [#1] SMP
[ 322.987565] CPU 1
[ 322.987574] Modules linked in: nfs fscache nf_conntrack_ipv4 nf_defrag_ipv4 xt_state nf_conntrack sha256_generic dm_crypt nvidia(PO) nfsd uinput lockd nfs_acl auth_rpcgss sunrpc snd_hda_codec_realtek snd_hda_intel snd_hda_codec snd_hwdep snd_seq snd_seq_device snd_pcm iTCO_wdt i2c_i801 iTCO_vendor_support intel_rng microcode r8169 i2c_core mii snd_timer snd soundcore snd_page_alloc serio_raw firewire_ohci firewire_core crc_itu_t sata_sil24 video [last unloaded: scsi_wait_scan]
[ 322.988230]
[ 322.988230] Pid: 1697, comm: pidgin Tainted: P O 3.3.1-3.fc16.x86_64 #1 transtec AG /DG31PR
[ 322.988230] RIP: 0010:[<ffffffffa0e140e9>] [<ffffffffa0e140e9>] nfs_have_delegation+0x9/0x40 [nfs]
[ 322.988230] RSP: 0018:ffff880112e55dd8 EFLAGS: 00010246
[ 322.988230] RAX: ffff880122411800 RBX: ffff880112e55e68 RCX: 00000000ffffd8ca
[ 322.988230] RDX: 0000000000000000 RSI: 0000000000000001 RDI: 0000000000000000
[ 322.988230] RBP: ffff880112e55dd8 R08: 0000000000016560 R09: ffffea00044b8400
[ 322.988230] R10: ffffffffa0e0457a R11: 0000000000000000 R12: 00000000ffffd8ca
[ 322.988230] R13: ffff8801218be000 R14: ffff88011573cc00 R15: ffff8801224b0480
[ 322.988230] FS: 00007fdda2060980(0000) GS:ffff88012fc80000(0000) knlGS:0000000000000000
[ 322.988230] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 322.988230] CR2: ffffffffffffffb8 CR3: 0000000112c07000 CR4: 00000000000006e0
[ 322.988230] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 322.988230] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
[ 322.988230] Process pidgin (pid: 1697, threadinfo ffff880112e54000, task ffff880112dd2e60)
[ 322.988230] Stack:
[ 322.988230] ffff880112e55e28 ffffffffa0e01fa1 ffff880112e11e60 0000000000000000
[ 322.988230] ffff8801224b0480 ffff8801224b0780 ffff8801224b0480 0000000000000082
[ 322.988230] ffff880112e55e68 00000000fffffff5 ffff880112e55eb8 ffffffffa0e04b9c
[ 322.988230] Call Trace:
[ 322.988230] [<ffffffffa0e01fa1>] nfs4_handle_exception+0x241/0x3a0 [nfs]
[ 322.988230] [<ffffffffa0e04b9c>] nfs4_proc_lock+0xec/0x440 [nfs]
[ 322.988230] [<ffffffffa0de518d>] do_setlk+0xed/0x110 [nfs]
[ 322.988230] [<ffffffffa0de5239>] nfs_flock+0x89/0xe0 [nfs]
[ 322.988230] [<ffffffff811cbd53>] sys_flock+0x113/0x1c0
[ 322.988230] [<ffffffff815fbfe9>] system_call_fastpath+0x16/0x1b
[ 322.988230] Code: fd ff e9 40 fe ff ff 66 66 2e 0f 1f 84 00 00 00 00 00 55 48 89 e5 66 66 66 66 90 f0 80 4f 48 04 5d c3 55 48 89 e5 66 66 66 66 90 <48> 8b 57 b8 31 c0 48 85 d2 74 0c 8b 4a 30 83 e6 03 21 f1 39 f1
[ 322.988230] RIP [<ffffffffa0e140e9>] nfs_have_delegation+0x9/0x40 [nfs]
[ 322.988230] RSP <ffff880112e55dd8>
[ 322.988230] CR2: ffffffffffffffb8
[ 322.988230] ---[ end trace 158525064a4030cd ]---
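
A note on reading the oops: the faulting bytes marked in the Code: line,
<48> 8b 57 b8, decode as "mov -0x48(%rdi),%rdx", and the register dump shows
RDI = 0. The short sketch below only checks that arithmetic: CR2 interpreted
as a signed 64-bit offset from RDI. It suggests, but does not prove, that
nfs_have_delegation() was entered with a NULL pointer in its first argument
register. The helper name as_signed64 is made up for this illustration.

```python
def as_signed64(value):
    """Interpret a 64-bit unsigned value as a two's-complement signed integer."""
    value &= (1 << 64) - 1
    return value - (1 << 64) if value >> 63 else value


# Values copied from the oops above.
CR2 = 0xFFFFFFFFFFFFFFB8  # faulting address
RDI = 0x0000000000000000  # base register used by the faulting instruction

# Displacement the instruction must have applied to RDI to reach CR2.
offset = as_signed64(CR2 - RDI)
print(hex(offset))  # -0x48
```

The result matches the 0xb8 displacement byte in the instruction, so the
fault address is simply "NULL minus 0x48".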

Expected results:

*) The flock() call succeeds and there is no kernel oops in the NFSv4
   delegation code.

Additional info:

1) Koji kernel 3.3.1-3 has exactly the same problem.

2) Kernel 3.3.0-8 does not have the problem.

3) Sorry for the tainted kernel (nvidia). I'll confirm with an untainted
   kernel as soon as possible.

4) It seems that Ubuntu has this problem too.
   https://bugs.launchpad.net/ubuntu/+source/linux/+bug/974664
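
A minimal sketch of a reproducer that does not need pidgin, assuming the
trigger is just the open(O_RDONLY) + flock(LOCK_EX) pair seen in the strace
excerpt (this has not been confirmed on an affected kernel). The function
name try_flock is made up for this illustration; pass it the path of any
existing file on the NFSv4 mount.

```python
import fcntl
import os
import sys
import tempfile


def try_flock(path):
    """Open path read-only and take an exclusive flock(), as in the strace excerpt."""
    fd = os.open(path, os.O_RDONLY)
    try:
        # On an affected kernel the process would presumably be killed here.
        fcntl.flock(fd, fcntl.LOCK_EX)
        fcntl.flock(fd, fcntl.LOCK_UN)
        return True
    finally:
        os.close(fd)


if __name__ == "__main__":
    if len(sys.argv) > 1:
        # Intended use: a file that lives on the NFSv4 mount.
        print("flock ok:", try_flock(sys.argv[1]))
    else:
        # Without an argument, only check that the script itself works,
        # using a local temporary file.
        with tempfile.NamedTemporaryFile() as tmp:
            print("flock ok (local file):", try_flock(tmp.name))
```

If the script prints "flock ok" for a file on the NFSv4 mount, the lock path
is working for that file; if it is killed and dmesg shows the oops above, the
problem is reproducible without pidgin.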