[865g] Kernel Oops - BUG: unable to handle kernel paging request at 80000013; EIP is at free_rb_tree_fname+0x5a/0xc0

Bug #585734 reported by Thomas Tanghus
This bug affects 1 person
Affects: linux (Ubuntu)
Status: Expired
Importance: Undecided
Assigned to: Unassigned
Milestone: none

Bug Description

System freezes apparently at random. kern.log at the time of a freeze is below.

Compiz is disabled. MemTest ran for 6+ hours without an error.

===kern.log.txt (Comment #17)===
May 31 23:56:49 tanghus kernel: [21498.132183] BUG: unable to handle kernel paging request at 80000013
May 31 23:56:49 tanghus kernel: [21498.132199] IP: [<c0263eea>] free_rb_tree_fname+0x5a/0xc0
May 31 23:56:49 tanghus kernel: [21498.132217] *pde = 00000000
May 31 23:56:49 tanghus kernel: [21498.132224] Oops: 0000 [#1] SMP
May 31 23:56:49 tanghus kernel: [21498.132232] last sysfs file: /sys/devices/pci0000:00/0000:00:1e.0/0000:01:0c.0/irq
May 31 23:56:49 tanghus kernel: [21498.132238] Modules linked in: binfmt_misc snd_intel8x0 fbcon tileblit font bitblit softcursor vga16fb snd_ac97_codec vgastate ac97_bus snd_pcm_oss snd_mixer_oss snd_pcm snd_seq_dummy snd_seq_oss snd_seq_midi snd_rawmidi snd_seq_midi_event i915 snd_seq drm_kms_helper dell_wmi snd_timer drm snd_seq_device dcdbas snd i2c_algo_bit ppdev psmouse soundcore intel_agp parport_pc snd_page_alloc serio_raw shpchp video agpgart output lp parport usbhid hid floppy e1000
May 31 23:56:49 tanghus kernel: [21498.132333]
May 31 23:56:49 tanghus kernel: [21498.132340] Pid: 4231, comm: drkonqi Not tainted (2.6.32-22-generic #33-Ubuntu) OptiPlex SX270
May 31 23:56:49 tanghus kernel: [21498.132347] EIP: 0060:[<c0263eea>] EFLAGS: 00210206 CPU: 1
May 31 23:56:49 tanghus kernel: [21498.132354] EIP is at free_rb_tree_fname+0x5a/0xc0
May 31 23:56:49 tanghus kernel: [21498.132360] EAX: 7fffffff EBX: 00000000 ECX: ed21ee98 EDX: 00000000
May 31 23:56:49 tanghus kernel: [21498.132365] ESI: ed21ee98 EDI: 7fffffff EBP: f0bbfe80 ESP: f0bbfe6c
May 31 23:56:49 tanghus kernel: [21498.132371] DS: 007b ES: 007b FS: 00d8 GS: 00e0 SS: 0068
May 31 23:56:49 tanghus kernel: [21498.132377] Process drkonqi (pid: 4231, ti=f0bbe000 task=ee276680 task.ti=f0bbe000)
May 31 23:56:49 tanghus kernel: [21498.132382] Stack:
May 31 23:56:49 tanghus kernel: [21498.132386] ed21ee98 ed21e8c0 ed21e8c0 efa77a00 f6cbbe9c f0bbfeb0 c02640c8 00000020
May 31 23:56:49 tanghus kernel: [21498.132402] <0> 000080d0 00000000 c0216990 f0bbff90 00000000 00000000 f68e5e00 efa77a00
May 31 23:56:49 tanghus kernel: [21498.132420] <0> f6cbbe9c f0bbff64 c02646db f0bbfeec f0bbff40 f0bbfed4 c02070bd 00000000
May 31 23:56:49 tanghus kernel: [21498.132440] Call Trace:
May 31 23:56:49 tanghus kernel: [21498.132450] [<c02640c8>] ? ext3_dx_readdir+0x58/0x230
May 31 23:56:49 tanghus kernel: [21498.132459] [<c0216990>] ? filldir+0x0/0xd0
May 31 23:56:49 tanghus kernel: [21498.132468] [<c02646db>] ? ext3_readdir+0x43b/0x520
May 31 23:56:49 tanghus kernel: [21498.132478] [<c02070bd>] ? vfs_statfs+0x6d/0x80
May 31 23:56:49 tanghus kernel: [21498.132487] [<c020723e>] ? vfs_statfs_native+0x1e/0xf0
May 31 23:56:49 tanghus kernel: [21498.132494] [<c0216990>] ? filldir+0x0/0xd0
May 31 23:56:49 tanghus kernel: [21498.132502] [<c022002f>] ? mntput_no_expire+0x1f/0xe0
May 31 23:56:49 tanghus kernel: [21498.132509] [<c02114b5>] ? path_put+0x25/0x30
May 31 23:56:49 tanghus kernel: [21498.132509] [<c02642a0>] ? ext3_readdir+0x0/0x520
May 31 23:56:49 tanghus kernel: [21498.132509] [<c0216bd6>] ? vfs_readdir+0x96/0xb0
May 31 23:56:49 tanghus kernel: [21498.132509] [<c0216990>] ? filldir+0x0/0xd0
May 31 23:56:49 tanghus kernel: [21498.132509] [<c0216d1d>] ? sys_getdents+0x6d/0xd0
May 31 23:56:49 tanghus kernel: [21498.132509] [<c01033ec>] ? syscall_call+0x7/0xb
May 31 23:56:49 tanghus kernel: [21498.132509] Code: db 74 0e 89 de 8b 5b 08 89 d8 eb ec 90 8d 74 26 00 8b 06 83 e0 fc 89 45 ec 89 f0 83 e8 08 75 0b eb 15 8d b4 26 00 00 00 00 89 f8 <8b> 78 14 e8 be 8a f9 ff 85 ff 75 f2 8b 55 ec 85 d2 75 13 8b 55
May 31 23:56:49 tanghus kernel: [21498.132509] EIP: [<c0263eea>] free_rb_tree_fname+0x5a/0xc0 SS:ESP 0068:f0bbfe6c
May 31 23:56:49 tanghus kernel: [21498.132509] CR2: 0000000080000013
May 31 23:56:49 tanghus kernel: [21498.132719] ---[ end trace 5a83f49a0e27f0d9 ]---
May 31 23:56:50 tanghus kernel: [21499.392393] BUG: unable to handle kernel NULL pointer dereference at 00000004
May 31 23:56:50 tanghus kernel: [21499.392405] IP: [<c01d2921>] put_page+0xf1/0x120
May 31 23:56:50 tanghus kernel: [21499.392420] *pde = 7f3b0067
May 31 23:56:50 tanghus kernel: [21499.392425] Oops: 0002 [#2] SMP
May 31 23:56:50 tanghus kernel: [21499.392431] last sysfs file: /sys/devices/pci0000:00/0000:00:1e.0/0000:01:0c.0/irq
May 31 23:56:50 tanghus kernel: [21499.392436] Modules linked in: binfmt_misc snd_intel8x0 fbcon tileblit font bitblit softcursor vga16fb snd_ac97_codec vgastate ac97_bus snd_pcm_oss snd_mixer_oss snd_pcm snd_seq_dummy snd_seq_oss snd_seq_midi snd_rawmidi snd_seq_midi_event i915 snd_seq drm_kms_helper dell_wmi snd_timer drm snd_seq_device dcdbas snd i2c_algo_bit ppdev psmouse soundcore intel_agp parport_pc snd_page_alloc serio_raw shpchp video agpgart output lp parport usbhid hid floppy e1000
May 31 23:56:50 tanghus kernel: [21499.392507]
May 31 23:56:50 tanghus kernel: [21499.392509] Pid: 1365, comm: Xorg Tainted: G D (2.6.32-22-generic #33-Ubuntu) OptiPlex SX270
May 31 23:56:50 tanghus kernel: [21499.392509] EIP: 0060:[<c01d2921>] EFLAGS: 00213286 CPU: 1
May 31 23:56:50 tanghus kernel: [21499.392509] EIP is at put_page+0xf1/0x120
May 31 23:56:50 tanghus kernel: [21499.392509] EAX: ed21efd8 EBX: 00000000 ECX: 00000001 EDX: 00000000
May 31 23:56:50 tanghus kernel: [21499.392509] ESI: 0000001c EDI: 00000007 EBP: ee511ddc ESP: ee511dc8
May 31 23:56:50 tanghus kernel: [21499.392509] DS: 007b ES: 007b FS: 00d8 GS: 00e0 SS: 0068
May 31 23:56:50 tanghus kernel: [21499.392509] Process Xorg (pid: 1365, ti=ee510000 task=ee0b8000 task.ti=ee510000)
May 31 23:56:50 tanghus kernel: [21499.392509] Stack:
May 31 23:56:50 tanghus kernel: [21499.392509] f8030fc2 efc42ec0 f052b400 0000001c 00000007 ee511df4 f844d5e8 00000008
May 31 23:56:50 tanghus kernel: [21499.392509] <0> f052b400 f0ebb1e0 00000000 ee511e24 f844e65a 00000008 edab84c8 ee511e34
May 31 23:56:50 tanghus kernel: [21499.392509] <0> c02094d7 00000020 ee698000 f0c49960 f0ebb1e0 f052b400 ee698000 ee511e3c
May 31 23:56:50 tanghus kernel: [21499.392509] Call Trace:
May 31 23:56:50 tanghus kernel: [21499.392509] [<f8030fc2>] ? agp_generic_free_by_type+0x32/0x40 [agpgart]
May 31 23:56:50 tanghus kernel: [21499.392509] [<f844d5e8>] ? i915_gem_object_put_pages+0x78/0x130 [i915]
May 31 23:56:50 tanghus kernel: [21499.392509] [<f844e65a>] ? i915_gem_object_unbind+0xfa/0x220 [i915]
May 31 23:56:50 tanghus kernel: [21499.392509] [<c02094d7>] ? __fput+0x177/0x1f0
May 31 23:56:50 tanghus kernel: [21499.392509] [<f844eb3d>] ? i915_gem_free_object+0x4d/0xf0 [i915]
May 31 23:56:50 tanghus kernel: [21499.392509] [<f83c0c9b>] ? drm_gem_object_free+0x2b/0x60 [drm]
May 31 23:56:50 tanghus kernel: [21499.392509] [<f83c0c70>] ? drm_gem_object_free+0x0/0x60 [drm]
May 31 23:56:50 tanghus kernel: [21499.392509] [<c034cb3d>] ? kref_put+0x2d/0x60
May 31 23:56:50 tanghus kernel: [21499.392509] [<f83c0dd9>] ? drm_gem_close_ioctl+0x99/0xc0 [drm]
May 31 23:56:50 tanghus kernel: [21499.392509] [<f83bf7cd>] ? drm_ioctl+0x25d/0x3e0 [drm]
May 31 23:56:50 tanghus kernel: [21499.392509] [<f83c0d40>] ? drm_gem_close_ioctl+0x0/0xc0 [drm]
May 31 23:56:50 tanghus kernel: [21499.392509] [<c01daaef>] ? vma_prio_tree_remove+0x7f/0xf0
May 31 23:56:50 tanghus kernel: [21499.392509] [<c01e8a1d>] ? __remove_shared_vm_struct+0x3d/0x60
May 31 23:56:50 tanghus kernel: [21499.392509] [<c01f1128>] ? free_pages_and_swap_cache+0x18/0xc0
May 31 23:56:50 tanghus kernel: [21499.392509] [<f83bf570>] ? drm_ioctl+0x0/0x3e0 [drm]
May 31 23:56:50 tanghus kernel: [21499.392509] [<c0216231>] ? vfs_ioctl+0x21/0x90
May 31 23:56:50 tanghus kernel: [21499.392509] [<c01e85b7>] ? remove_vma+0x47/0x60
May 31 23:56:50 tanghus kernel: [21499.392509] [<c0216519>] ? do_vfs_ioctl+0x79/0x310
May 31 23:56:50 tanghus kernel: [21499.392509] [<c0216817>] ? sys_ioctl+0x67/0x80
May 31 23:56:50 tanghus kernel: [21499.392509] [<c01033ec>] ? syscall_call+0x7/0xb
May 31 23:56:50 tanghus kernel: [21499.392509] Code: 02 00 8b 55 ec 8b 45 f0 e8 0d 8e 3b 00 e9 5b ff ff ff 0f ba 33 15 ba 05 00 00 00 be 04 00 00 00 eb d0 89 c2 eb cc 66 85 c0 78 19 <f0> ff 4b 04 0f 94 c0 84 c0 0f 84 3a ff ff ff 89 d8 ff 53 38 e9
May 31 23:56:50 tanghus kernel: [21499.392509] EIP: [<c01d2921>] put_page+0xf1/0x120 SS:ESP 0068:ee511dc8
May 31 23:56:50 tanghus kernel: [21499.392509] CR2: 0000000000000004
May 31 23:56:50 tanghus kernel: [21499.393036] ---[ end trace 5a83f49a0e27f0da ]---

===Xorg.0.log.old.txt (Comment #4, different time from above)===
[Backtrace]
0: /usr/bin/X (xorg_backtrace+0x3b) [0x80e937b]
1: /usr/bin/X (0x8048000+0x61c7d) [0x80a9c7d]
2: (vdso) (__kernel_rt_sigreturn+0x0) [0xbe9410]
3: /usr/bin/X (SetClipRects+0xc5) [0x8086655]
4: /usr/bin/X (0x8048000+0x28f11) [0x8070f11]
5: /usr/bin/X (0x8048000+0x2a477) [0x8072477]
6: /usr/bin/X (0x8048000+0x1ed7a) [0x8066d7a]
7: /lib/tls/i686/cmov/libc.so.6 (__libc_start_main+0xe6) [0x14cbd6]
8: /usr/bin/X (0x8048000+0x1e961) [0x8066961]
Segmentation fault at address 0x20001

Caught signal 11 (Segmentation fault). Server aborting

ProblemType: Bug
DistroRelease: Ubuntu 10.04
Package: xserver-xorg-video-intel 2:2.9.1-3ubuntu5
ProcVersionSignature: Ubuntu 2.6.32-22.33-generic 2.6.32.11+drm33.2
Uname: Linux 2.6.32-22-generic i686
Architecture: i386
Date: Wed May 26 09:34:10 2010
DkmsStatus: Error: [Errno 2] No such file or directory
GdmLog:
 Error: command ['kdesudo', '--', 'cat', '/var/log/gdm/:0.log'] failed with exit code 1: QInotifyFileSystemWatcherEngine::addPaths: inotify_add_watch failed: No such file or directory
 QFileSystemWatcher: failed to add paths: /home/tol/.config/ibus/bus
 Bus::open: Can not get ibus-daemon's address.
 IBusInputContext::createInputContext: no connection to ibus-daemon
 cat: /var/log/gdm/:0.log: No such file or directory
GdmLog1: Error: command ['kdesudo', '--', 'cat', '/var/log/gdm/:0.log.1'] failed with exit code 1: cat: /var/log/gdm/:0.log.1: No such file or directory
GdmLog2: Error: command ['kdesudo', '--', 'cat', '/var/log/gdm/:0.log.2'] failed with exit code 1: cat: /var/log/gdm/:0.log.2: No such file or directory
InstallationMedia: Kubuntu 10.04 LTS "Lucid Lynx" - Release i386 (20100427)
MachineType: Dell Computer Corporation OptiPlex SX270
ProcCmdLine: BOOT_IMAGE=/boot/vmlinuz-2.6.32-22-generic root=UUID=bdee719a-c37d-4d87-8997-b20ccc3f6961 ro
ProcEnviron:
 LANGUAGE=
 PATH=(custom, user)
 LANG=en_DK.UTF-8
 SHELL=/bin/bash
SourcePackage: xserver-xorg-video-intel
dmi.bios.date: 09/29/2004
dmi.bios.vendor: Dell Computer Corporation
dmi.bios.version: A06
dmi.board.name: 0U8211
dmi.board.vendor: Dell Computer Corp.
dmi.board.version: A00
dmi.chassis.type: 16
dmi.chassis.vendor: Dell Computer Corporation
dmi.modalias: dmi:bvnDellComputerCorporation:bvrA06:bd09/29/2004:svnDellComputerCorporation:pnOptiPlexSX270:pvr:rvnDellComputerCorp.:rn0U8211:rvrA00:cvnDellComputerCorporation:ct16:cvr:
dmi.product.name: OptiPlex SX270
dmi.sys.vendor: Dell Computer Corporation
system:
 distro: Ubuntu
 codename: lucid
 architecture: i686
 kernel: 2.6.32-22-generic

Thomas Tanghus (tanghus) wrote :
Thomas Tanghus (tanghus) wrote :

Now I just had what seemed to be a kernel crash caused by X. I was taken directly to the console, which showed a lot of text starting with something like "Oops BUG:", but I was unable to save the output because every command I issued just printed the backtrace on the terminal.
I did however find /var/log/Xorg.0.log.old which had the correct date and time and contained a short bt.
I have installed xserver-xorg-core-dbg and xserver-xorg-video-intel-dbg but I don't know how to get the correct backtrace.

I'll attach the log.

Thomas Tanghus (tanghus) wrote :
Bryce Harrington (bryce)
Changed in xserver-xorg-video-intel (Ubuntu):
status: New → Confirmed
Stenten (stenten) wrote :

Excellent, thank you for the Xorg.0.log.old. Next time, do you mind attaching logs as .txt? It makes them easier to open in a browser. (I've re-uploaded it for you as a .txt.)

Also, do you mind attaching a dmesg from /var/log that has a timestamp that coincides with the freeze? You can take a look through it and see if you find anything unusual; if so that's the one to upload.

description: updated
tags: added: 865g
summary: - X locks up solid with HD thrashing
+ [865g] X locks up solid with HD thrashing
Changed in xserver-xorg-video-intel (Ubuntu):
status: Confirmed → Incomplete
Thomas Tanghus (tanghus) wrote : Re: [865g] X locks up solid with HD thrashing

Hi Stenten. Sorry I wasn't aware I didn't upload it as text but I guess it's enough to give it a .txt extension?
I would have liked to attach dmesg output but as mentioned I wasn't able to issue any commands until after reboot.
Just had another solid lockup where even Alt-SysRq didn't help.

Stenten (stenten) wrote :

Do you have more than one dmesg in /var/log? I have dmesg and dmesg.0, the latter of which is an old dmesg kept specifically for situations like this. Old kern.log and syslog files might also have interesting errors.

You could probably also SSH into your frozen computer, but I'm really no expert at this.

You might also want to try disabling Desktop Effects (compiz) through System->Preferences->Appearance->Visual Effects (Tab). Also pay attention to whether you've hibernated or suspended anytime before the freeze.

After those, try the mainline kernel [1]. Download and install the _all headers, the i386 headers, and the i386 image.

(And yes, it's as easy as adding .txt to its filename.)

[1]: http://kernel.ubuntu.com/~kernel-ppa/mainline/v2.6.34-lucid/

Bryce Harrington (bryce)
tags: added: kubuntu
Thomas Tanghus (tanghus) wrote :

So now I ran without Desktop Effects for ~24hrs and got a crash that didn't lock up the PC. Xorg.0.log didn't show anything interesting but kern.log had some good info which I'll attach.
Next step I'll try the mainline kernel as suggested. Should I just download the *.deb files or could I add a line to sources.list - and if the latter what should it look like? (apt sometimes is a bit confusing to me).

Stenten (stenten) wrote :

Thank you. That was very helpful. Can you also tell me the date and time you disabled Compiz? It'd be nice to know where exactly in kern.log you made that switch, since errors are everywhere, dating back to the 25th.

Thomas Tanghus (tanghus) wrote :

It must have been around 14:00 (GMT+1) on the 27th. I'm not sure about the time but the date is correct.

Stenten (stenten) wrote :

There are at least three different kernel oopses, and most of them refer to different processes.

Could you please keep track of the date/time of your next three crashes and then upload the kern.log? This should give a definitive answer as to which error messages are actually related to the crash. Thank you.
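
Part of this bookkeeping can be automated. Below is a minimal sketch, assuming Python 3 and the syslog line format seen in the kern.log excerpt in the description. The function name `summarize_oopses` and the regular expression are illustrative and not part of any Ubuntu tool. It lists each oops with its timestamp, headline, process and faulting symbol, which makes it easier to see whether different oopses really belong to different processes.

```python
import re

# Syslog prefix as seen in the excerpts, e.g.
# "May 31 23:56:49 tanghus kernel: [21498.132183] BUG: ..."
LINE = re.compile(r"^(\w{3}\s+\d+ [\d:]{8}) \S+ kernel: \[\s*[\d.]+\]\s?(.*)$")


def summarize_oopses(lines):
    """Return one dict per oops: timestamp, headline, process name, faulting symbol."""
    oopses = []
    current = None
    for raw in lines:
        match = LINE.match(raw.rstrip("\n"))
        if not match:
            continue
        stamp, msg = match.groups()
        if msg.startswith("BUG:") or msg.startswith("kernel BUG at"):
            # A new oops report starts here.
            current = {"time": stamp, "headline": msg, "comm": None, "eip": None}
            oopses.append(current)
        elif current is not None:
            comm = re.search(r"comm: (\S+)", msg)
            if comm and current["comm"] is None:
                current["comm"] = comm.group(1)
            eip = re.match(r"EIP is at (\S+)", msg)
            if eip and current["eip"] is None:
                current["eip"] = eip.group(1)
            if msg.startswith("---[ end trace"):
                current = None  # this report is complete
    return oopses
```

To use it, pass an open kern.log file (or any iterable of lines) to `summarize_oopses` and print the resulting dicts next to the crash times noted by hand.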

Stenten (stenten) wrote :

On Fri, May 28, 2010 at 9:07 AM, Thomas "Tanghus" Olsen <email address hidden> wrote:

> Next step I'll try the mainline kernel as suggested. Should I just download
> the *.deb files or could I add a line to sources.list - and if the latter
> what should it look like? (apt sometimes is a bit confusing to me).

You just have to download the three files and open them. Just don't do this
until you record the three crashes. Then go ahead and install the mainline
and report if the problem still exists.

Thomas Tanghus (tanghus) wrote : Re: [865g] X locks up solid with HD thrashing

Will do. Thanks for helping.

Stenten (stenten)
affects: xserver-xorg-video-intel (Ubuntu) → linux (Ubuntu)
Stenten (stenten) wrote :

Also please run a MemTest (in the Grub menu, hold down Shift while booting to get there). Let it run overnight; RAM tends to only show symptoms of breakage once it's warmed up or under prolonged stress.

Thomas Tanghus (tanghus) wrote :

Just had two crashes. One at 12:12 (around noon; not sure whether that counts as am or pm), after which I had to power off to reboot, and one at ~12:32 where I could use Alt-SysRq-s and Alt-SysRq-b to sync and reboot. Couldn't find anything in any logs but I'll attach kern.log.

Stenten (stenten) wrote :

Also check your Xorg.0.log.old immediately after a crash for any warnings or errors, and especially backtraces.

Thomas Tanghus (tanghus) wrote :

Ran memtest for 6+ hrs and it passed with flying colours, although the machine was running VERY warm.
I've checked Xorg.0.log every time but it rarely shows anything.

Thomas Tanghus (tanghus) wrote :

Just had another crash at 23:56:49. Xorg.0.log shows nothing but I'll attach kern.log.
Don't know if it was a coincidence or the trigger, but I was trying out the new fancy opendesktop.org integration in Amarok's About dialog.

Stenten (stenten)
description: updated
Stenten (stenten)
description: updated
Stenten (stenten) wrote :

Excellent, thank you. Mentioning the exact timestamp was very helpful.

Xorg.0.log won't have debugging information because it's the current log for your rebooted session. To get the log of the previous session where the freeze occurred, check Xorg.0.log.old. Check the timestamp in the log (first entry) to make sure it coincides with when you first booted the session that froze.

Could you also mention the symptoms of your subsequent freezes when you're posting logs?
1) Does the Capslock light work if you press the Capslock button?
2) Can you switch to a virtual terminal with Ctrl+Alt+F1?
3) Can you log out with Ctrl+Alt+Del?
4) Can you kill X with Alt+Sysrq+k?
5) Can you reboot with REISUB?

Also remember to test with the mainline kernel.

Stenten (stenten)
tags: added: kernel-bug kernel-oops needs-upstream-testing
Thomas Tanghus (tanghus) wrote :

Glad to be able to help :-)
I can see from your changes - and also realized it myself - that it's a kernel bug rather than an X bug. I also checked Xorg.0.log.old but it had last been touched at 18:03, so I figured it wouldn't be of any use. I'll attach it here anyway.

For your other questions:

1) Haven't tested that. Will do.
2+3) Neither does anything.
4) Haven't tried that yet but sometimes I have been able to reboot with Alt+SysRq+s followed by Alt+SysRq+b.
5) Is that some Magic SysRq combo?

I'll download mainline soonish. Is that the kernel that will be used in later updates?

Stenten (stenten) wrote :

REISUB is indeed a Magic SysRq combination. It's like your s+b combo, but
more complete. See the wiki page [1] for more information on what exactly
that combo does.
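
As a rough illustration of what each letter does, here is a minimal Python sketch. The bit values are those documented for the `/proc/sys/kernel/sysrq` bitmask in the kernel's SysRq documentation; the names `REISUB` and `allowed_steps` are made up for this example, and whether a given kernel build honours the bitmask in exactly this way is an assumption.

```python
# Each REISUB letter, the /proc/sys/kernel/sysrq bitmask bit that permits it,
# and what it asks the kernel to do.
REISUB = [
    ("r", 4, "switch the keyboard out of raw mode"),
    ("e", 64, "send SIGTERM to all processes except init"),
    ("i", 64, "send SIGKILL to all processes except init"),
    ("s", 16, "sync all mounted filesystems"),
    ("u", 32, "remount all filesystems read-only"),
    ("b", 128, "reboot immediately, without syncing or unmounting"),
]


def allowed_steps(sysrq_value):
    """Return the REISUB letters permitted by a /proc/sys/kernel/sysrq value."""
    if sysrq_value == 0:
        return []  # SysRq is disabled entirely
    if sysrq_value == 1:
        return [key for key, _, _ in REISUB]  # every function is enabled
    # Any other value is a bitmask of the enabled function groups.
    return [key for key, bit, _ in REISUB if sysrq_value & bit]
```

The value to pass in can be read from `/proc/sys/kernel/sysrq`. If a letter is missing from the result, pressing it will be ignored even when the kernel is otherwise still responsive.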

[1]: http://en.wikipedia.org/wiki/Magic_SysRq_key

Thomas Tanghus (tanghus) wrote : Re: [865g] X locks up solid with HD thrashing

A short follow-up: I've been using the mainline kernel for 3 days now and I've had 3 kernel crashes. No response on Caps-lock or Num-lock. REISUB has no effect. No messages in kern.log at all.

Thomas Tanghus (tanghus) wrote :

New crash. Could REISUB out of it but still nothing in kern.log, Xorg.0.log* or anywhere I can think of. Any place else I should look?
I do get several errors like the one below in kern.log but they don't seem to cause the crash:

Jun 5 07:50:52 tanghus kernel: [25896.712240] [drm:i915_handle_error] *ERROR* EIR stuck: 0x00000010, masking
Jun 5 07:50:52 tanghus kernel: [25896.712259] render error detected, EIR: 0x00000010

I'll attach the log again as there are also several backtraces that might be of interest.

Thomas Tanghus (tanghus) wrote :

Switched back to 2.6.32-22-generic because mainline crashed even more often. Just had another crash but it only took down X. It is reported in kern.log at 22:24:59.

Stenten (stenten) wrote :

Please try the 2.6.35-rc1 kernel [1]. See if you can capture anything in '/sys/kernel/debug/dri/0/i915_error_state' after a crash. If so, please attach that file, along with the output of 'dmesg', /var/log/dmesg, /var/log/dmesg.0, /var/log/kern.log, /var/log/Xorg.0.log, and /var/log/Xorg.0.log.old. Also include a description of what happened.

It would be nice if you could upload the above more than once so we can compare and try to find some consistency. If you can't produce any information in i915_error_state, consistent tracebacks in dmesg/kern.log are just as useful, if they happen to exist.
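
Since the machine may freeze shortly after a crash, it can help to grab all of these files in one step. The following is a minimal sketch, assuming Python 3.7+, root privileges and a mounted debugfs. The function name `collect`, the `crash-<timestamp>` directory layout and the `dmesg-live.txt` file name are illustrative choices, not an existing tool. Files are saved with a .txt suffix so they open in a browser, as asked earlier in this thread.

```python
import subprocess
import time
from pathlib import Path

# The files named in the comment above.
DEFAULT_SOURCES = (
    "/sys/kernel/debug/dri/0/i915_error_state",
    "/var/log/dmesg",
    "/var/log/dmesg.0",
    "/var/log/kern.log",
    "/var/log/Xorg.0.log",
    "/var/log/Xorg.0.log.old",
)


def collect(dest_root, sources=DEFAULT_SOURCES, run_dmesg=True):
    """Copy whichever source files are readable into a timestamped directory."""
    dest = Path(dest_root) / time.strftime("crash-%Y%m%d-%H%M%S")
    dest.mkdir(parents=True, exist_ok=True)
    copied = []
    for src in map(Path, sources):
        try:
            # Read the content directly; debugfs files report a size of 0.
            data = src.read_bytes()
        except OSError:
            continue  # missing or unreadable, skip it
        target = dest / (src.name + ".txt")
        target.write_bytes(data)
        copied.append(target)
    if run_dmesg:
        try:
            out = subprocess.run(["dmesg"], capture_output=True, check=False).stdout
            target = dest / "dmesg-live.txt"
            target.write_bytes(out)
            copied.append(target)
        except OSError:
            pass  # dmesg is not available on this system
    return copied
```

Calling `collect("/root/crashlogs")` right after X goes down, for example over SSH, would leave one directory per crash, which also makes the requested comparison between crashes easier.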

[1]: http://kernel.ubuntu.com/~kernel-ppa/mainline/v2.6.35-rc1-lucid/

Thomas Tanghus (tanghus) wrote :

New crash on June 9th at 01:38. The only log I could get something from was Xorg.0.log.old, but it had a backtrace.
X crashed and I was left at the terminal. I tried to get the state of /sys/kernel/debug/dri/0/i915_error_state but the box totally froze before I had a chance.
I'll attach the logs but I think only Xorg.0.log.old.txt is relevant.

Thomas Tanghus (tanghus) wrote :

BTW: This was with 2.6.34 because I couldn't even boot with 2.6.35-rc1.

Thomas Tanghus (tanghus) wrote :
Thomas Tanghus (tanghus) wrote :
Thomas Tanghus (tanghus) wrote :
Thomas Tanghus (tanghus) wrote :
Thomas Tanghus (tanghus) wrote :

New crash. The only thing I was able to get was from kern.log:

Jun 9 01:53:29 tanghus kernel: [ 552.593095] nepomukservices[2030]: segfault at 7 ip b6221ca8 sp bffd7a30 error 4 in libsoprano.so.4.3.0[b61dc000+ef000]
Jun 9 01:53:49 tanghus kernel: [ 572.402565] nepomukservices[2349]: segfault at 4244483 ip b61b5cb3 sp bfa66ab0 error 6 in libsoprano.so.4.3.0[b6170000+ef000]
Jun 9 01:54:03 tanghus kernel: [ 586.740678] nepomukservices[2384]: segfault at 14 ip b619fcab sp bffaa010 error 4 in libsoprano.so.4.3.0[b615a000+ef000]
Jun 9 01:54:09 tanghus kernel: [ 591.803666] nepomukservices[2400]: segfault at 5 ip b60b4ca8 sp bfe32980 error 4 in libsoprano.so.4.3.0[b606f000+ef000]
Jun 9 01:54:10 tanghus kernel: [ 593.627514] nepomukservices[2413]: segfault at 14 ip b61c7ca8 sp bfb05b10 error 4 in libsoprano.so.4.3.0[b6182000+ef000]
Jun 9 01:54:13 tanghus kernel: [ 595.825997] nepomukservices[2424]: segfault at 62696c33 ip b60faca8 sp bfd2e930 error 4 in libsoprano.so.4.3.0[b60b5000+ef000]
Jun 9 05:56:40 tanghus kernel: imklog 4.2.0, log source = /proc/kmsg started.

I have no idea if it is a coincidence but the errors coincide with this bug:

https://bugs.launchpad.net/ubuntu/+source/soprano/+bug/590088
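
One way to compare these segfault lines with each other, or with another report, is to subtract the library's load address (the value in square brackets before the `+`) from the instruction pointer. This gives an offset inside the library that does not depend on where it was mapped. A minimal sketch, assuming Python 3 and the log format shown above; `parse_segfault` is an illustrative name:

```python
import re

# Matches the kernel's user-space segfault message as it appears in kern.log.
SEGFAULT = re.compile(
    r"(?P<comm>\S+)\[(?P<pid>\d+)\]: segfault at (?P<addr>[0-9a-f]+) "
    r"ip (?P<ip>[0-9a-f]+) sp (?P<sp>[0-9a-f]+) error (?P<err>[0-9a-f]+) "
    r"in (?P<lib>\S+)\[(?P<base>[0-9a-f]+)\+(?P<size>[0-9a-f]+)\]"
)


def parse_segfault(line):
    """Return process, library and the faulting offset inside the library."""
    match = SEGFAULT.search(line)
    if match is None:
        return None
    ip = int(match.group("ip"), 16)
    base = int(match.group("base"), 16)
    return {
        "comm": match.group("comm"),
        "pid": int(match.group("pid")),
        "lib": match.group("lib"),
        "offset": ip - base,  # position of the faulting instruction in the library
        "error": int(match.group("err"), 16),  # the kernel prints this field in hex
    }
```

Applied to the six lines above, the offsets come out as 0x45ca8 (four times), 0x45cab and 0x45cb3, so all six faults sit within a few bytes of each other in libsoprano.so.4.3.0.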

Thomas Tanghus (tanghus) wrote :

New total crash on June 10th at 12:01:48. Only kern.log had anything useful as I had to power off.

Thomas Tanghus (tanghus) wrote :

New crash on June 10 at 22:46:48 with 2.6.34-020634-generic.

Thomas Tanghus (tanghus) wrote :

New crash June 11th at 02:05:44. Attached kern.log and Xorg.0.log.old.

Thomas Tanghus (tanghus) wrote :
Thomas Tanghus (tanghus) wrote :

Crash on June 12th at 13:25.

Thomas Tanghus (tanghus) wrote :

New crash on June 12th at 16:15:41. I was listening to a radio stream with Amarok and the monitor was suspended. I moved the mouse to wake it up and a few seconds after it did, X crashed and I was left with a console, unable to do anything. The funny thing was that the stream kept playing until I REISUB'd.

Thomas Tanghus (tanghus) wrote :

New crash on June 15th at ~07:29. I was listening to a radio stream in Amarok with the monitor turned off. Had to do a power off. Nothing in the logs at that time but I'll attach kern.log anyway as there are several errors in it.

Stenten (stenten)
Changed in linux (Ubuntu):
status: Incomplete → Confirmed
Stenten (stenten) wrote :

Thank you, that should be all the information the developers need.

It seems three things are associated with the freezes:
1) The null pointer dereference kernel oops mentioned in the description, which also shows up in most of the attached kern.log files.
2) The segfault in libsoprano.so (mentioned in Comment #31)
3) The GPU hang, which might be more of a symptom than a cause.
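
For reading the oopses in item 1: on x86, the four hex digits after "Oops:" in a page-fault oops are the page-fault error code. Below is a minimal sketch of decoding it, assuming Python 3 and the standard x86 bit layout; the function names are illustrative. By this reading, the "Oops: 0000" in the description is a kernel-mode read of a page that was not present, and the "Oops: 0002" is a kernel-mode write to one.

```python
import re

OOPS = re.compile(r"Oops: ([0-9a-fA-F]{4}) \[#(\d+)\]")


def decode_oops_code(code):
    """Decode an x86 page-fault error code into its documented bits."""
    return {
        "cause": "protection violation" if code & 1 else "page not present",
        "access": "write" if code & 2 else "read",
        "mode": "user" if code & 4 else "kernel",
        "reserved_bit": bool(code & 8),
        "instruction_fetch": bool(code & 16),
    }


def parse_oops_line(msg):
    """Parse e.g. 'Oops: 0002 [#2] SMP'; returns None if the line does not match."""
    match = OOPS.search(msg)
    if match is None:
        return None
    decoded = decode_oops_code(int(match.group(1), 16))
    decoded["sequence"] = int(match.group(2))  # the Nth oops since boot
    return decoded
```

A sequence number above 1, as in "[#2]", means the kernel had already oopsed once since boot, so later reports may be fallout from the first one rather than independent bugs.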

Thomas Tanghus (tanghus) wrote :

OK, great.
Just out of curiosity I have a few questions from which you can probably see how totally blank I am about this ;-):

- I'm guessing these oopses are handled upstream?
- Are they hardware specific (GPU)? Or driver specific? Or both :-)
- Is libsoprano.so (partly?) responsible for this? Then I'd guess it's a KDE/OpenDesktop problem?

And just to close: I have a very nice traceback in kern.log from this morning at 08:30:28 where - as mentioned before - Amarok kept playing but X was totally frozen. The monitor was turned off and, as can be seen in the log, I turned it on again at 13:07:43, realizing shortly thereafter that I had to REISUB. I then first tried to boot the newly updated standard kernel, but now even that won't boot, so I'm glad you pointed me to 2.6.34-020634-generic :-)
Thanks for the effort. Hoping this will be fixed as my budget is not for buying a new PC...

Thomas Tanghus (tanghus) wrote :

I don't think this one has been there before:

Jun 20 17:18:13 tanghus kernel: [23632.437395] ------------[ cut here ]------------
Jun 20 17:18:13 tanghus kernel: [23632.437458] kernel BUG at /home/kernel-ppa/mainline/build/mm/slub.c:2846!
Jun 20 17:18:13 tanghus kernel: [23632.437526] invalid opcode: 0000 [#1] SMP
Jun 20 17:18:13 tanghus kernel: [23632.437585] last sysfs file: /sys/devices/pci0000:00/0000:00:1e.0/0000:01:0c.0/local_cpus
Jun 20 17:18:13 tanghus kernel: [23632.437663] Modules linked in: binfmt_misc fbcon tileblit font bitblit softcursor snd_intel8x0 snd_ac97_codec ac97_bus snd_pcm_oss snd_mixer_oss snd_pcm i915 drm_kms_helper snd_seq_dummy drm i2c_algo_bit snd_seq_oss snd_seq_midi psmouse intel_agp video snd_rawmidi snd_seq_midi_event snd_seq snd_timer ppdev dcdbas snd_seq_device parport_pc shpchp lp agpgart snd output serio_raw parport soundcore snd_page_alloc usbhid hid e1000 floppy
Jun 20 17:18:13 tanghus kernel: [23632.438523]
Jun 20 17:18:13 tanghus kernel: [23632.438546] Pid: 664, comm: Xorg Not tainted 2.6.34-020634-generic #020634 0U8211/OptiPlex SX270
Jun 20 17:18:13 tanghus kernel: [23632.438641] EIP: 0060:[<c0205341>] EFLAGS: 00213246 CPU: 0
Jun 20 17:18:13 tanghus kernel: [23632.438701] EIP is at kfree+0x101/0x110
Jun 20 17:18:13 tanghus kernel: [23632.438744] EAX: 40000000 EBX: c1723c40 ECX: 00000000 EDX: 00000000
Jun 20 17:18:13 tanghus kernel: [23632.438805] ESI: f0437720 EDI: ed15f800 EBP: e8201df0 ESP: e8201dd8
Jun 20 17:18:13 tanghus kernel: [23632.438865] DS: 007b ES: 007b FS: 00d8 GS: 00e0 SS: 0068
Jun 20 17:18:13 tanghus kernel: [23632.438919] Process Xorg (pid: 664, ti=e8200000 task=ed10a670 task.ti=e8200000)
Jun 20 17:18:13 tanghus kernel: [23632.438990] Stack:
Jun 20 17:18:13 tanghus kernel: [23632.439014] e8201df8 f08e2180 00000000 f08e2180 f0437720 ed15f800 e8201e08 f842d1fd
Jun 20 17:18:13 tanghus kernel: [23632.439139] <0> e8201e08 ed15f814 ed15f800 f0437720 e8201e1c f82cdc27 f0437720 f82cdbc0
Jun 20 17:18:13 tanghus kernel: [23632.439277] <0> e8201e5c e8201e2c c034722d f0437720 f0437720 e8201e38 f82ce248 00000000
Jun 20 17:18:13 tanghus kernel: [23632.439424] Call Trace:
Jun 20 17:18:13 tanghus kernel: [23632.439474] [<f842d1fd>] ? i915_gem_free_object+0x6d/0xa0 [i915]
Jun 20 17:18:13 tanghus kernel: [23632.439551] [<f82cdc27>] ? drm_gem_object_free_unlocked+0x67/0x70 [drm]
Jun 20 17:18:13 tanghus kernel: [23632.439629] [<f82cdbc0>] ? drm_gem_object_free_unlocked+0x0/0x70 [drm]
Jun 20 17:18:13 tanghus kernel: [23632.439699] [<c034722d>] ? kref_put+0x2d/0x70
Jun 20 17:18:13 tanghus kernel: [23632.439756] [<f82ce248>] ? drm_gem_object_release_handle+0x28/0x30 [drm]
Jun 20 17:18:13 tanghus kernel: [23632.439825] [<c0345076>] ? idr_for_each+0xa6/0xd0
Jun 20 17:18:13 tanghus kernel: [23632.439884] [<f82ce220>] ? drm_gem_object_release_handle+0x0/0x30 [drm]
Jun 20 17:18:13 tanghus kernel: [23632.439963] [<f82ccc61>] ? drm_events_release+0xa1/0xe0 [drm]
Jun 20 17:18:13 tanghus kernel: [23632.440013] [<f82cdc8a>] ? drm_gem_release+0x1a/0x30 [drm]
Jun 20 17:18:13 tanghus kernel: [23632.440013] [<f82cd324>] ? drm_release+0x314/0x3d0 [drm]
Jun 20 17:18:13 tanghus kernel: [23632.44001...

Stenten (stenten)
tags: added: kernel-uncat
removed: kernel-bug needs-upstream-testing
Thomas Tanghus (tanghus) wrote :

I don't know if this is the same bug. The first excerpt from kern.log didn't lock up the box but 2-3 minutes later X crashed with the bt below:

Jul 1 21:30:38 tanghus kernel: [ 181.155905] virtuoso-t[2154] general protection ip:8429e5c sp:b78831c0 error:0 in virtuoso-t[8048000+88d000]
Jul 1 21:31:44 tanghus kernel: [ 247.572624] BUG: unable to handle kernel paging request at 00100d68
Jul 1 21:31:44 tanghus kernel: [ 247.572639] IP: [<c017fc2a>] module_put+0x1a/0xa0
Jul 1 21:31:44 tanghus kernel: [ 247.572658] *pde = 7e74b067
Jul 1 21:31:44 tanghus kernel: [ 247.572664] Oops: 0000 [#1] SMP
Jul 1 21:31:44 tanghus kernel: [ 247.572672] last sysfs file: /sys/devices/pci0000:00/0000:00:1e.0/0000:01:0c.0/local_cpus
Jul 1 21:31:44 tanghus kernel: [ 247.572679] Modules linked in: dm_crypt snd_intel8x0 snd_ac97_codec ac97_bus snd_pcm_oss snd_mixer_oss snd_pcm snd_seq_dummy snd_seq_oss snd_seq_midi snd_rawmidi snd_seq_midi_event snd_seq snd_timer snd_seq_device ppdev dell_wmi dcdbas snd parport_pc soundcore shpchp snd_page_alloc lp parport usbhid hid fbcon tileblit font bitblit softcursor vga16fb vgastate i915 drm_kms_helper drm i2c_algo_bit video intel_agp e1000 output floppy agpgart
Jul 1 21:31:44 tanghus kernel: [ 247.572772]
Jul 1 21:31:44 tanghus kernel: [ 247.572780] Pid: 2123, comm: amarok Not tainted (2.6.32-23-generic #37-Ubuntu) OptiPlex SX270
Jul 1 21:31:44 tanghus kernel: [ 247.572786] EIP: 0060:[<c017fc2a>] EFLAGS: 00210206 CPU: 1
Jul 1 21:31:44 tanghus kernel: [ 247.572795] EIP is at module_put+0x1a/0xa0
Jul 1 21:31:44 tanghus kernel: [ 247.572800] EAX: 00100c0c EBX: 00100c0c ECX: f042a620 EDX: efcd4628
Jul 1 21:31:44 tanghus kernel: [ 247.572806] ESI: 00000008 EDI: efd685d8 EBP: eecf3f44 ESP: eecf3f30
Jul 1 21:31:44 tanghus kernel: [ 247.572811] DS: 007b ES: 007b FS: 00d8 GS: 00e0 SS: 0068
Jul 1 21:31:44 tanghus kernel: [ 247.572818] Process amarok (pid: 2123, ti=eecf2000 task=eef88cc0 task.ti=eecf2000)
Jul 1 21:31:44 tanghus kernel: [ 247.572823] Stack:
Jul 1 21:31:44 tanghus kernel: [ 247.572826] eecf3f3c c03216ca eec31a80 00000008 efd685d8 eecf3f74 c0209ddb 00000003
Jul 1 21:31:44 tanghus kernel: [ 247.572844] <0> 00000000 00000000 efcd4628 efd685d8 f7060500 efcd4628 eec31a80 eeaa9400
Jul 1 21:31:44 tanghus kernel: [ 247.572863] <0> ffffffe3 eecf3f7c c0209edd eecf3f94 c02063fc 00200046 eeaa9400 eec31a80
Jul 1 21:31:44 tanghus kernel: [ 247.572883] Call Trace:
Jul 1 21:31:44 tanghus kernel: [ 247.572894] [<c03216ca>] ? apparmor_file_free_security+0x2a/0x30
Jul 1 21:31:44 tanghus kernel: [ 247.572904] [<c0209ddb>] ? __fput+0x10b/0x1f0
Jul 1 21:31:44 tanghus kernel: [ 247.572913] [<c0209edd>] ? fput+0x1d/0x30
Jul 1 21:31:44 tanghus kernel: [ 247.572922] [<c02063fc>] ? filp_close+0x4c/0x80
Jul 1 21:31:44 tanghus kernel: [ 247.572931] [<c02064a5>] ? sys_close+0x75/0xc0
Jul 1 21:31:44 tanghus kernel: [ 247.572940] [<c01033ec>] ? syscall_call+0x7/0xb
Jul 1 21:31:44 tanghus kernel: [ 247.572945] Code: 00 83 c4 04 5b 5e 5f 5d c3 90 8d b4 26 00 00 00 00 55 89 e5 83 ec 14 89 5d f4 89 75 f8 89 7d fc 0f 1f 44 00 00 85 c0 89 c3 74 36 <8b> 80 5c 01 00 00 64 8b 15 40 1a ...

penalvch (penalvch) wrote :

Thomas "Tanghus" Olsen, thank you for reporting this bug and helping make Ubuntu better. This bug was reported a while ago and there hasn't been any activity in it recently. We were wondering if this is still an issue? Can you try with the latest development release of Ubuntu? ISO CD images are available from http://cdimage.ubuntu.com/releases/ .

If it remains an issue, could you run the following command in the development release from a Terminal (Applications->Accessories->Terminal). It will automatically gather and attach updated debug information to this report.

apport-collect -p linux <replace-with-bug-number>

Also, if you could test the latest upstream kernel available that would be great. It will allow additional upstream developers to examine the issue. Refer to https://wiki.ubuntu.com/KernelMainlineBuilds . Once you've tested the upstream kernel, please remove the 'needs-upstream-testing' tag. This can be done by clicking on the yellow pencil icon next to the tag located at the bottom of the bug description and deleting the 'needs-upstream-testing' text. Please let us know your results.

Thanks in advance.

summary: - [865g] X locks up solid with HD thrashing
+ [865g] Kernel Oops - BUG: unable to handle kernel paging request at
+ 80000013; EIP is at free_rb_tree_fname+0x5a/0xc0
Changed in linux (Ubuntu):
status: Confirmed → Incomplete
Thomas Tanghus (tanghus) wrote :

I had totally forgotten about this report. Since then the onboard video chip has been toasted (in a lunchbox form-factor PC), so I don't have an Intel chip to test on anymore. Perhaps the symptoms came from the fact that the chip was already failing?
Anyways, since I'm the only one listed as affected I guess this report can be safely closed as invalid.

Launchpad Janitor (janitor) wrote :

[Expired for linux (Ubuntu) because there has been no activity for 60 days.]

Changed in linux (Ubuntu):
status: Incomplete → Expired