Comment 9 for bug 150469

Joachim Dahl (jdahl) wrote : Re: [Bug 150469] Re: openafs gives segfault on kernel 2.6.22-13

I only have one kernel installed (I removed the previous kernel before I
noticed the AFS problem), so it shouldn't be an issue of loading a tainted
version of the kernel module.

I am attaching the full output from starting the client, as well as the
dmesg output, below.

Thanks
joachim

joachim@jod-nb:~$ sudo sh -x /etc/init.d/openafs-client force-start
+ PATH=/bin:/usr/bin:/sbin:/usr/sbin
+ CACHEINFO=/etc/openafs/cacheinfo
+ uname -r
+ MODULEDIR=/lib/modules/2.6.22-13-generic/fs
+ exec
+ exec
+ [ -f /etc/openafs/afs.conf ]
+ . /etc/openafs/afs.conf
+ test -f /etc/openafs/afs.conf.client
+ . /etc/openafs/afs.conf.client
+ AFS_CLIENT=true
+ AFS_AFSDB=true
+ AFS_CRYPT=true
+ AFS_DYNROOT=false
+ AFS_FAKESTAT=true
+ VERBOSE=
+ OPTIONS=AUTOMATIC
+ AFS_POST_INIT=
+ AFS_PRE_SHUTDOWN=
+ test -x /sbin/afsd
+ echo -n Starting AFS services:
Starting AFS services:+ load_client
+ [ -z ]
+ choose_client
+ uname -v
+ set X #1 SMP Thu Oct 4 17:18:44 GMT 2007
+ shift
+ MP=.mp
+ [ -n .mp -a -f /lib/modules/2.6.22-13-generic/fs/openafs.mp.o ]
+ [ -n .mp -a -f /lib/modules/2.6.22-13-generic/fs/openafs.mp.ko ]
+ [ -f /lib/modules/2.6.22-13-generic/fs/openafs.ko ]
+ MP=
+ LIBAFS=openafs.ko
+ [ ! -f /lib/modules/2.6.22-13-generic/fs/openafs.ko ]
+ /sbin/lsmod
+ fgrep openafs
+ LOADED=
+ [ -z ]
+ modprobe openafs
+ status=0
+ [ 0 = 0 ]
+ echo -n openafs
 openafs+ return 0
+ start_client
+ pidof /sbin/afsd
+ pidof /usr/sbin/afsd
+ choose_afsd_options
+ [ -z AUTOMATIC ]
+ [ AUTOMATIC = AUTOMATIC ]
+ AFSD_OPTIONS=
+ is_on true
+ [ xtrue = xtrue ]
+ return 0
+ AFSD_OPTIONS= -afsdb
+ is_on false
+ [ xfalse = xtrue ]
+ return 1
+ is_on true
+ [ xtrue = xtrue ]
+ return 0
+ AFSD_OPTIONS= -afsdb -fakestat
+ echo afsd.
 afsd.
+ start-stop-daemon --start --quiet --exec /sbin/afsd -- -afsdb -fakestat
afsd: All AFS daemons started.
+ is_on true
+ [ xtrue = xtrue ]
+ return 0
+ fs setcrypt on
fs: Invalid argument.
+ [ -n ]
+

Output from dmesg after klog segfaults:
.
.
.
[10736.308000] openafs: module license 'http://www.openafs.org/dl/license10.html' taints kernel.
[10736.440000] Found system call table at 0xc02fc540 (pattern scan)
[10736.564000] Starting AFS cache scan...found 1776 non-empty cache files (56%).
[11019.800000] BUG: unable to handle kernel NULL pointer dereference at virtual address 00000000
[11019.800000] printing eip:
[11019.800000] f9b33c3c
[11019.800000] *pde = 00000000
[11019.800000] Oops: 0000 [#1]
[11019.800000] SMP
[11019.800000] Modules linked in: openafs(P) tun af_packet binfmt_misc
i915 drm rfcomm hidp hid l2cap ppdev ipv6 sbs bay video battery button
container ac dock cpufreq_stats cpufreq_ondemand freq_table
cpufreq_powersave cpufreq_userspace cpufreq_conservative lp joydev arc4
ecb blkcipher snd_hda_intel snd_pcm_oss snd_mixer_oss snd_pcm
snd_seq_dummy snd_seq_oss snd_seq_midi snd_rawmidi snd_seq_midi_event
snd_seq iwl4965 pcmcia irda iwlwifi_mac80211 hci_usb snd_timer
snd_seq_device bluetooth cfg80211 crc_ccitt parport_pc parport sky2
pcspkr psmouse snd soundcore snd_page_alloc yenta_socket rsrc_nonstatic
pcmcia_core shpchp pci_hotplug sdhci mmc_core serio_raw intel_agp
agpgart evdev sr_mod cdrom ext3 jbd mbcache sg sd_mod ata_generic
ehci_hcd ahci uhci_hcd ata_piix libata scsi_mod usbcore thermal
processor fan fuse apparmor commoncap
[11019.800000] CPU: 1
[11019.800000] EIP: 0060:[<f9b33c3c>] Tainted: P VLI
[11019.800000] EFLAGS: 00210202 (2.6.22-13-generic #1)
[11019.800000] EIP is at PSetTokens+0x2c/0x210 [openafs]
[11019.800000] eax: 00000000 ebx: efe41000 ecx: f9b4d3c0 edx: 00000001
[11019.800000] esi: 00000000 edi: 00000057 ebp: efe40000 esp: efd85d88
[11019.800000] ds: 007b es: 007b fs: 00d8 gs: 0033 ss: 0068
[11019.800000] Process klog (pid: 7066, ti=efd84000 task=f1a7b4c0 task.ti=efd84000)
[11019.800000] Stack: 2d29ad35 f9b4d3c0 efd85e08 ffffffff f025e560
400c5603 efd85e64 c01d1a5f
[11019.800000] 00000000 00000000 00000000 f9ae86c7 c21e7280
f025e560 00000000 00000001
[11019.800000] efe41000 00000000 00000057 efe40000 f9b2e31d
00000000 00000003 efd85e10
[11019.800000] Call Trace:
[11019.800000] [<c01d1a5f>] request_key+0x1f/0x30
[11019.800000] [<f9ae86c7>] PagInCred+0x57/0xc0 [openafs]
[11019.800000] [<f9b2e31d>] afs_HandlePioctl+0x29d/0x4a0 [openafs]
[11019.800000] [<c01fed0e>] copy_from_user+0x2e/0x70
[11019.800000] [<c02f3078>] mutex_lock+0x8/0x20
[11019.800000] [<f9b2e042>] copyin_afs_ioctl+0x92/0xd0 [openafs]
[11019.800000] [<f9b34963>] afs_syscall_pioctl+0x2d3/0x2e0 [openafs]
[11019.800000] [<c02f5bc4>] do_page_fault+0x1b4/0x690
[11019.800000] [<f9b2d55b>] afs_syscall+0x160b/0x1830 [openafs]
[11019.800000] [<c018815e>] permission+0x10e/0x120
[11019.800000] [<c02f5a10>] do_page_fault+0x0/0x690
[11019.800000] [<c02f4292>] error_code+0x72/0x80
[11019.800000] [<c018007b>] rw_copy_check_uvector+0x2b/0xf0
[11019.800000] [<c01817fc>] file_move+0x1c/0x50
[11019.800000] [<c017ec9f>] __dentry_open+0x15f/0x1c0
[11019.800000] [<c017edb5>] nameidata_to_filp+0x35/0x40
[11019.800000] [<f9b238ff>] afs_unlocked_ioctl+0x5f/0x70 [openafs]
[11019.800000] [<f9b238a0>] afs_unlocked_ioctl+0x0/0x70 [openafs]
[11019.800000] [<c018ca1b>] do_ioctl+0x2b/0xc0
[11019.800000] [<c018cb0c>] vfs_ioctl+0x5c/0x290
[11019.800000] [<c018cdb2>] sys_ioctl+0x72/0x90
[11019.800000] [<c01041d2>] sysenter_past_esp+0x6b/0xa9
[11019.800000] [<c02f0000>] clip_ioctl+0x500/0x510
[11019.800000] =======================
[11019.800000] Code: ec 50 8b 15 90 f2 b4 f9 83 05 08 4a b5 f9 01 8b 44
24 54 89 5c 24 40 85 d2 89 74 24 44 89 7c 24 48 89 6c 24 4c 89 4c 24 04
74 44 <8b> 28 81 fd e0 2e 00 00 77 0e 83 c0 04 89 44 24 0c 01 e8 83 38
[11019.800000] EIP: [<f9b33c3c>] PSetTokens+0x2c/0x210 [openafs] SS:ESP
0068:efd85d88
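
For anyone triaging similar reports, here is a minimal sketch of how the
faulting function can be pulled out of dmesg output like the above. It is
plain Python; the names EIP_RE and faulting_location are my own and are not
part of OpenAFS or the kernel. It assumes the 32-bit "EIP is at ..." line
format shown in this oops.

```python
import re

# Matches lines such as "EIP is at PSetTokens+0x2c/0x210 [openafs]".
# The "[module]" suffix is optional because built-in kernel symbols lack it.
EIP_RE = re.compile(
    r"EIP is at (?P<symbol>[\w.]+)\+0x(?P<offset>[0-9a-f]+)/0x(?P<size>[0-9a-f]+)"
    r"(?: \[(?P<module>\w+)\])?"
)


def faulting_location(dmesg_text):
    """Return (symbol, offset, size, module) for the first oops, or None."""
    match = EIP_RE.search(dmesg_text)
    if match is None:
        return None
    return (
        match.group("symbol"),
        int(match.group("offset"), 16),  # byte offset into the function
        int(match.group("size"), 16),    # total size of the function
        match.group("module"),           # None for built-in symbols
    )


sample = "[11019.800000] EIP is at PSetTokens+0x2c/0x210 [openafs]"
print(faulting_location(sample))
```

For the oops above this reports PSetTokens in the openafs module, at offset
0x2c (44) of 0x210 (528) bytes.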

Russ Allbery wrote:
> The error message that you're getting is consistent with the kernel
> module failing to register the AFS system calls. The AFS client
> programs are then trying to make system calls that don't exist, which
> result in various odd errors such as the ones you're seeing.
>
> I'm a little confused by the output you're seeing, given that loading
> the AFS kernel module should result in a message saying that it taints
> the kernel and starting afsd should produce information about the cache,
> neither of which are happening. Are you just not showing all of the
> output that you're getting, or are you missing some output? If you have
> an old kernel module installed that was never fully removed from the
> kernel, things like this can happen. Sometimes AFS doesn't want to
> unload cleanly and the machine has to be rebooted to get back to a
> consistent state, although that's fairly rare these days.
>
> dmesg output may also be useful, in particular the lines about searching
> for the system call table. Again, I don't understand why you're not
> seeing that output when you start the AFS client for the first time
> after boot.