[iGM45] X exits when switching ttys on 2.6.3 unless NoAccel used

Bug #345714 reported by Robbie Williamson
This bug affects 2 people
Affects          Status   Importance  Assigned to       Milestone
libdrm (Ubuntu)  Triaged  High        Bryce Harrington
  Jaunty         Triaged  High        Bryce Harrington

Bug Description

Binary package hint: xorg

I've had MANY X related issues today since a recent package update on Jaunty, and this one is the most reproducible. Every time I press ctrl-alt-f<#> to switch to a tty, the screen goes black, one or two randomly colored ASCII characters are displayed, and I'm kicked out of my X session. I'm running on a Lenovo X301.

[backtrace]
#0 0xb7fdc424 in __kernel_vsyscall ()
No symbol table info available.
#1 0xb7c6ad59 in ioctl () from /lib/tls/i686/cmov/libc.so.6
No locals.
#2 0x080c86f1 in xf86ProcessActionEvent (action=ACTION_SWITCHSCREEN, arg=0xbfff7d40) at ../../../../hw/xfree86/common/xf86Events.c:238
No locals.
#3 0x081c590b in XkbDDXSwitchScreen (dev=0xd4096d0, key=67 'C', act=0x5606) at ../../../../hw/xfree86/dixmods/xkbVT.c:55
 scrnnum = 1
#4 0x081adfae in _XkbFilterSwitchScreen (xkbi=0xd3cfda8, filter=0xdcf5450, keycode=67, pAction=0xbfff7e48) at ../../xkb/xkbActions.c:979
 dev = (DeviceIntPtr) 0xd4096d0
#5 0x081af598 in XkbHandleActions (dev=0xd4096d0, kbd=0xd4096d0, xE=0xd3d9ee8, count=1) at ../../xkb/xkbActions.c:1216
 key = 67
 bit = <value optimized out>
 realMods = <value optimized out>
 xkbi = (XkbSrvInfoPtr) 0xd3cfda8
 keyc = (KeyClassPtr) 0xd40e4d0
 sendEvent = 1
 genStateNotify = 1
 oldState = {group = 0 '\0', locked_group = 0 '\0', base_group = 0, latched_group = 0, mods = 28 '\034', base_mods = 12 '\f', latched_mods = 0 '\0', locked_mods = 16 '\020', compat_state = 28 '\034', grab_mods = 28 '\034', compat_grab_mods = 28 '\034', lookup_mods = 28 '\034', compat_lookup_mods = 28 '\034', ptr_buttons = 0}
 act = {any = {type = 13 '\r', data = "\005\001\000\000\000\000"}, mods = {type = 13 '\r', flags = 5 '\005', mask = 1 '\001', real_mods = 0 '\0', vmods1 = 0 '\0', vmods2 = 0 '\0'}, group = {type = 13 '\r', flags = 5 '\005', group_XXX = 1 '\001'}, iso = {type = 13 '\r', flags = 5 '\005', mask = 1 '\001', real_mods = 0 '\0', group_XXX = 0 '\0', affect = 0 '\0', vmods1 = 0 '\0', vmods2 = 0 '\0'}, ptr = {type = 13 '\r', flags = 5 '\005', high_XXX = 1 '\001', low_XXX = 0 '\0', high_YYY = 0 '\0', low_YYY = 0 '\0'}, btn = {type = 13 '\r', flags = 5 '\005', count = 1 '\001', button = 0 '\0'}, dflt = {type = 13 '\r', flags = 5 '\005', affect = 1 '\001', valueXXX = 0 '\0'}, screen = {type = 13 '\r', flags = 5 '\005', screenXXX = 1 '\001'}, ctrls = {type = 13 '\r', flags = 5 '\005', ctrls3 = 1 '\001', ctrls2 = 0 '\0', ctrls1 = 0 '\0', ctrls0 = 0 '\0'}, msg = {type = 13 '\r', flags = 5 '\005', message = "\001\000\000\000\000"}, redirect = {type = 13 '\r', new_key = 5 '\005', mods_mask = 1 '\001', mods = 0 '\0', vmods_mask0 = 0 '\0', vmods_mask1 = 0 '\0', vmods0 = 0 '\0', vmods1 = 0 '\0'}, devbtn = {type = 13 '\r', flags = 5 '\005', count = 1 '\001', button = 0 '\0', device = 0 '\0'}, devval = {type = 13 '\r', device = 5 '\005', v1_what = 1 '\001', v1_ndx = 0 '\0', v1_value = 0 '\0', v2_what = 0 '\0', v2_ndx = 0 '\0', v2_value = 0 '\0'}, type = 13 '\r'}
 filter = (XkbFilterPtr) 0x0
 keyEvent = 1
 backupproc = <value optimized out>
#6 0x081af953 in XkbProcessKeyboardEvent (xE=0xd3d9ee8, keybd=0xd4096d0, count=1) at ../../xkb/xkbPrKeyEv.c:186
 keyc = (KeyClassPtr) 0xd40e4d0
 xkbi = (XkbSrvInfoPtr) 0x4e
 key = 67
 ndx = <value optimized out>
 xiEvent = 64
#7 0x081a7810 in AccessXFilterPressEvent (xE=0xd3d9ee8, keybd=0xd4096d0, count=1) at ../../xkb/xkbAccessX.c:559
 xkbi = (XkbSrvInfoPtr) 0xd3cfda8
 ctrls = (XkbControlsPtr) 0xd3d4950
 ignoreKeyEvent = 0
 key = 67 'C'
#8 0x081aff1e in ProcessKeyboardEvent (xE=0xd3d9ee8, keybd=0xd4096d0, count=1) at ../../xkb/xkbPrKeyEv.c:222
 keyc = (KeyClassPtr) 0xd40e4d0
 xkbi = <value optimized out>
 backup_proc = <value optimized out>
 is_press = 1
 is_release = <value optimized out>
#9 0x08114792 in mieqProcessInputEvents () at ../../mi/mieq.c:474
 handler = (mieqHandler) 0
 e = (EventRec *) 0x81f1ce0
 type = 78
 nevents = 1
 evlen = <value optimized out>
 i = <value optimized out>
 screen = (ScreenPtr) 0x9bd37a8
 dev = (DeviceIntPtr) 0xd4096d0
 master = (DeviceIntPtr) 0xd36e3e8
 event = (xEvent *) 0xd3d9ee8
 event_size = 68

[lspci -vvv]
00:00.0 Host bridge: Intel Corporation Mobile 4 Series Chipset Memory Controller Hub (rev 07)
 Subsystem: Lenovo Device 20e0
00:02.0 VGA compatible controller: Intel Corporation Mobile 4 Series Chipset Integrated Graphics Controller (rev 07)
 Subsystem: Lenovo Device 20e4

Robbie Williamson (robbiew) wrote :

Attaching Xorg.log.0.old file, and fwiw, this also reproduces with compiz disabled.

Changed in xorg:
importance: Undecided → Critical
Robbie Williamson (robbiew) wrote :
Changed in xorg:
status: New → Confirmed
Robbie Williamson (robbiew) wrote :

I tried switching ttys after I was kicked out to the Ubuntu login screen, and the same black screen with randomly colored ASCII characters occurred. However, it took several iterations of the greeter app restarting, plus a wait for a 2-minute timeout, before I could log back in. The following info was in /var/log/syslog during the time I could not log in:
-------------------
Mar 19 22:41:51 laptop kernel: [ 4471.550198] [drm:gm45_get_vblank_counter] *ERROR* trying to get vblank count for disabled pipe 0
Mar 19 22:41:51 laptop x-session-manager[10605]: WARNING: Detected that screensaver has left the bus
Mar 19 22:41:51 laptop gdm[10560]: WARNING: gdm_slave_xioerror_handler: Fatal X error - Restarting :0
Mar 19 22:41:52 laptop gdm[10560]: Mount of private directory return code [0]
Mar 19 22:41:52 laptop bonobo-activation-server (robbiew-11556): could not associate with desktop session: Failed to connect to socket /tmp/dbus-CVIWqWWvuc: Connection refused
Mar 19 22:41:53 laptop acpid: client connected from 11586[0:0]
Mar 19 22:41:54 laptop kernel: [ 4475.287600] set status page addr 0x04575000
Mar 19 22:41:57 laptop kernel: [ 4478.262718] [drm:gm45_get_vblank_counter] *ERROR* trying to get vblank count for disabled pipe 0
Mar 19 22:41:59 laptop acpid: client connected from 11617[0:0]
Mar 19 22:42:01 laptop kernel: [ 4481.803993] set status page addr 0x04575000
Mar 19 22:42:01 laptop gdm[11637]: Gtk-WARNING: Ignoring the separator setting
Mar 19 22:42:04 laptop kernel: [ 4484.616267] [drm:gm45_get_vblank_counter] *ERROR* trying to get vblank count for disabled pipe 0
Mar 19 22:42:06 laptop acpid: client connected from 11646[0:0]
Mar 19 22:42:07 laptop kernel: [ 4488.119864] set status page addr 0x04575000
Mar 19 22:42:08 laptop gdm[11666]: Gtk-WARNING: Ignoring the separator setting
Mar 19 22:42:08 laptop kernel: [ 4489.104757] [drm:gm45_get_vblank_counter] *ERROR* trying to get vblank count for disabled pipe 0
Mar 19 22:42:12 laptop acpid: client connected from 11675[0:0]
Mar 19 22:42:14 laptop kernel: [ 4494.736541] set status page addr 0x04575000
Mar 19 22:42:14 laptop gdm[11695]: Gtk-WARNING: Ignoring the separator setting
Mar 19 22:42:15 laptop kernel: [ 4495.802798] [drm:gm45_get_vblank_counter] *ERROR* trying to get vblank count for disabled pipe 0
Mar 19 22:42:23 laptop acpid: client connected from 11704[0:0]
Mar 19 22:42:25 laptop kernel: [ 4505.363256] set status page addr 0x04575000
Mar 19 22:42:25 laptop gdm[11724]: Gtk-WARNING: Ignoring the separator setting
Mar 19 22:42:26 laptop kernel: [ 4506.508980] [drm:gm45_get_vblank_counter] *ERROR* trying to get vblank count for disabled pipe 0
Mar 19 22:42:39 laptop acpid: client connected from 11733[0:0]
Mar 19 22:42:40 laptop kernel: [ 4521.102949] set status page addr 0x04575000
Mar 19 22:42:41 laptop gdm[11753]: Gtk-WARNING: Ignoring the separator setting
Mar 19 22:42:42 laptop kernel: [ 4522.382258] [drm:gm45_get_vblank_counter] *ERROR* trying to get vblank count for disabled pipe 0
Mar 19 22:43:02 laptop gdm[4146]: WARNING: The display server has been shut down about 6 times in the last 90 seconds. It is likely that something bad is going on. Waiting for 2 minutes ...


Robbie Williamson (robbiew) wrote :

/var/log/daemon.log had:
-----------------------------------
Mar 19 22:41:51 laptop x-session-manager[10605]: WARNING: Detected that screensaver has left the bus
Mar 19 22:41:51 laptop gdm[10560]: WARNING: gdm_slave_xioerror_handler: Fatal X error - Restarting :0

Robbie Williamson (robbiew) wrote :

ii gdm 2.20.10-0ubuntu1 GNOME Display Manager
ii xserver-xorg 1:7.4~5ubuntu16 the X.Org X server
ii xserver-xorg-core 2:1.6.0-0ubuntu3 Xorg X server - core server
ii xserver-xorg-video-all 1:7.4~5ubuntu16 the X.Org X server -- output driver metapack
ii xserver-xorg-video-i128 1:1.3.1-2ubuntu1 X.Org X server -- i128 display driver
ii xserver-xorg-video-i740 1:1.2.0-2 X.Org X server -- i740 display driver
ii xserver-xorg-video-i810 2:2.4.1-1ubuntu10.3 X.Org X server -- Intel i8xx, i9xx display d
ii xserver-xorg-video-intel 2:2.6.3-0ubuntu2 X.Org X server -- Intel i8xx, i9xx display d
ii xserver-xorg-video-intel-dbg 2:2.6.3-0ubuntu2 X.Org X server -- Intel i8xx, i9xx display d

Bryce Harrington (bryce)
description: updated
summary: - X crashes EVERY time I try to switch ttys
+ [Mobile 4 IGC] X exits when switching ttys on 2.6.3
Bryce Harrington (bryce) wrote : Re: [Mobile 4 IGC] X exits when switching ttys on 2.6.3

> I've had MANY X related issues today since a recent package update on Jaunty

Can you tell which package update led to the problem? Check /var/log/dpkg.log to see exactly what changed (e.g. grep upgrade /var/log/dpkg.log). My guess would be xserver-xorg-video-intel 2:2.6.3-0ubuntu2, but there have been other X updates recently as well.

See if downgrading the suspected package(s) makes the issue go away. Usually the older debs are still available in your /var/cache/apt/archives/ for some time.

"WARNING: gdm_slave_xioerror_handler: Fatal X error" means X has either crashed or terminated. From your Xorg log I do not see a backtrace, which is typically printed out on crashes. Termination error messages typically get printed to a log file in /var/log/gdm/. Can you see if that is the case, and if so attach an example gdm log with the error?

Alternatively, shut down gdm and run X via `startx` from the console, and after getting it to terminate again, see what gets printed out to stderr.

The errors about vblank counts I believe are innocuous warnings.

This message is curious, one I've not seen before, and appears right before the gdm X termination message:

Mar 19 22:41:51 laptop x-session-manager[10605]: WARNING: Detected that screensaver has left the bus

That makes it sound like the screensaver is somehow involved in the VT switching? Odd.

While I think this is a termination, if it is an X crash, we would next need a backtrace. Directions for getting one are at http://wiki.ubuntu.com/X/Backtracing - this won't help if X is terminating though.

Finally, thanks for including the lspci output. What I actually need is 'lspci -vvnn | grep -A1 "VGA "' as that will provide the PCI ID numbers for the video card.

Changed in xorg (Ubuntu Jaunty):
importance: Critical → High
status: Confirmed → Incomplete
Robbie Williamson (robbiew) wrote :

# lspci -vvnn | grep -A1 "VGA "
00:02.0 VGA compatible controller [0300]: Intel Corporation Mobile 4 Series Chipset Integrated Graphics Controller [8086:2a42] (rev 07)
 Subsystem: Lenovo Device [17aa:20e4]

Robbie Williamson (robbiew) wrote :

Had to do some tricks over an ssh connection to even get to a console...since I couldn't switch ttys :). Anyway...I attached a gdb session to the X server and recreated the problem. After typing 'continue', I hit ctrl-alt-f1 and was given the (gdb) prompt on a SIGUSR1. The screen was simply black. I pulled the backtrace, register, and thread info, then ran 'continue'. The screen then went to the weird ascii stuff and I was sent back to the (gdb) prompt on a SIGABRT. I again pulled the backtrace, register, and thread info. The program terminated after I continued.

NOTE: I found that I couldn't get any symbol table info on libdrm_intel.so.1, unless I relinked it to /usr/lib/debug/usr/lib/libdrm_intel.so.1.0.0 before trying to switch ttys. However, I had to correct the link after the crash, or my X would not start properly.

kdawgud (kleber) wrote : Re: [iGM45] X exits when switching ttys on 2.6.3

I am seeing a problem which may be the same. For me, X on Jaunty (32-bit) does not exit every time I switch to a TTY, but it does kill the second graphical session I create when I switch back and forth between users.

If I log in as myself, that session is always fine.

If I then log in as another user (with a separate desktop), that log in works fine too.

However, when I switch back to my original user, the 2nd X session closes.

This happens whether I use the ubuntu user selector, or I jump back to the first session using CTRL-ALT-F7.

Here is what I get in syslog:

Mar 20 21:15:31 laptop gdm[7411]: WARNING: gdm_slave_xioerror_handler: Fatal X error - Restarting :20
Mar 20 21:15:31 laptop acpid: client connected from 6535[0:0]
Mar 20 21:15:33 laptop x-session-manager[6576]: WARNING: Unable to find watch for alarm 12582917
Mar 20 21:15:33 laptop x-session-manager[6576]: WARNING: Unable to find watch for alarm 12582918
Mar 20 21:15:41 laptop kernel: [ 1720.258098] [drm:gm45_get_vblank_counter] *ERROR* trying to get vblank count for disabled pipe 0

lspci:
00:00.0 Host bridge: Intel Corporation Mobile 4 Series Chipset Memory Controller Hub (rev 07)
00:02.0 VGA compatible controller: Intel Corporation Mobile 4 Series Chipset Integrated Graphics Controller (rev 07)
00:02.1 Display controller: Intel Corporation Mobile 4 Series Chipset Integrated Graphics Controller (rev 07)
00:1a.0 USB Controller: Intel Corporation 82801I (ICH9 Family) USB UHCI Controller #4 (rev 03)
00:1a.1 USB Controller: Intel Corporation 82801I (ICH9 Family) USB UHCI Controller #5 (rev 03)
00:1a.2 USB Controller: Intel Corporation 82801I (ICH9 Family) USB UHCI Controller #6 (rev 03)
00:1a.7 USB Controller: Intel Corporation 82801I (ICH9 Family) USB2 EHCI Controller #2 (rev 03)
00:1b.0 Audio device: Intel Corporation 82801I (ICH9 Family) HD Audio Controller (rev 03)
00:1c.0 PCI bridge: Intel Corporation 82801I (ICH9 Family) PCI Express Port 1 (rev 03)
00:1c.1 PCI bridge: Intel Corporation 82801I (ICH9 Family) PCI Express Port 2 (rev 03)
00:1c.2 PCI bridge: Intel Corporation 82801I (ICH9 Family) PCI Express Port 3 (rev 03)
00:1c.4 PCI bridge: Intel Corporation 82801I (ICH9 Family) PCI Express Port 5 (rev 03)
00:1d.0 USB Controller: Intel Corporation 82801I (ICH9 Family) USB UHCI Controller #1 (rev 03)
00:1d.1 USB Controller: Intel Corporation 82801I (ICH9 Family) USB UHCI Controller #2 (rev 03)
00:1d.2 USB Controller: Intel Corporation 82801I (ICH9 Family) USB UHCI Controller #3 (rev 03)
00:1d.7 USB Controller: Intel Corporation 82801I (ICH9 Family) USB2 EHCI Controller #1 (rev 03)
00:1e.0 PCI bridge: Intel Corporation 82801 Mobile PCI Bridge (rev 93)
00:1f.0 ISA bridge: Intel Corporation ICH9M LPC Interface Controller (rev 03)
00:1f.2 SATA controller: Intel Corporation ICH9M/M-E SATA AHCI Controller (rev 03)
00:1f.3 SMBus: Intel Corporation 82801I (ICH9 Family) SMBus Controller (rev 03)
09:00.0 Ethernet controller: Marvell Technology Group Ltd. 88E8040 PCI-E Fast Ethernet Controller (rev 13)
0c:00.0 Network controller: Intel Corporation Wireless WiFi Link 5100

kdawgud (kleber) wrote :

I found I can also get the X server to crash by doing the following:

Log in as primary user
Choose "Lock screen" option
Choose "switch user"
Once at login screen, hit CTRL-ALT-F2.

The x server crashes with the same syslog error:
gdm[20849]: WARNING: gdm_slave_xioerror_handler: Fatal X error - Restarting :20

Changed in xserver-xorg-video-intel:
status: Incomplete → Triaged
Robbie Williamson (robbiew) wrote :

This issue seems similar to bug 128207.

Robbie Williamson (robbiew) wrote :

I can switch ttys without issue if I add "Option NoAccel" to the Device section of xorg.conf and disable compiz.

vidarino (vidarino) wrote :

I'm also experiencing exactly the same problem on my Lenovo N200 3000: when a second user logs in and then switches ttys (with User Switcher or Ctrl+Alt+Fx), the second user's session crashes (SIGABRT).

$ lspci -vvnn | grep -A1 "VGA "
00:02.0 VGA compatible controller [0300]: Intel Corporation Mobile GM965/GL960 Integrated Graphics Controller [8086:2a02] (rev 0c)
        Subsystem: Lenovo Device [17aa:383e]

/var/log/apport.log report:
apport (pid 18709) Tue Mar 24 19:58:07 2009: called for pid 18314, signal 6
apport (pid 18709) Tue Mar 24 19:58:07 2009: executable: /usr/bin/Xorg (command line "/usr/X11R6/bin/X :20 -br -audit 0 -auth /var/lib/gdm/:20.Xauth vt9")
apport (pid 18709) Tue Mar 24 19:58:07 2009: Ignoring SIGABRT

vidarino (vidarino) wrote :

Oh, and adding Option "NoAccel" does indeed work around the problem. Thanks for the tip, Robbie.

Steve Langasek (vorlon) wrote :

Robbie,

Could you check whether you're able to run X using the recipe at https://wiki.ubuntu.com/X/DebuggingWithValgrind, and if so, forward the contents of /var/log/Xorg-valgrind.log after running a VT switch?

Robbie Williamson (robbiew) wrote :

Steve Langasek (vorlon) wrote :

Did X crash in this case on the VT switch? The valgrind output doesn't seem to show any problems. :/

Robbie Williamson (robbiew) wrote :

Yes...it did. :/

Bryce Harrington (bryce)
summary: - [iGM45] X exits when switching ttys on 2.6.3
+ [iGM45] X exits when switching ttys on 2.6.3 unless NoAccel used
description: updated
Bryce Harrington (bryce) wrote :

Thanks for the backtrace, Robbie; that helps pinpoint where things are failing. The failure seems to be happening in the kernel itself. The last point at which it is still in X-land is this stanza of code:

    case ACTION_SWITCHSCREEN_PREV:
        if (VTSwitchEnabled && !xf86Info.dontVTSwitch && xf86Info.vtno > 0) {
            if (ioctl(xf86Info.consoleFd, VT_ACTIVATE, xf86Info.vtno - 1) < 0)
                ErrorF("Failed to switch consoles (%s)\n", strerror(errno));
        }
        break;

The ioctl calls into the kernel from there, but we have no further debug info as to what it's doing there:

#0 0xb7fdc424 in __kernel_vsyscall ()
No symbol table info available.
#1 0xb7c6ad59 in ioctl () from /lib/tls/i686/cmov/libc.so.6
No locals.
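
For reference, the console switch requested by that ioctl can be sketched in isolation. This is a minimal sketch, not the X server's code: `switch_vt` is a hypothetical helper, and it assumes a Linux system with `<linux/vt.h>` and an already open console descriptor. It also adds a `VT_WAITACTIVE` call, which the stanza above does not make, so that the caller blocks until the switch has completed.

```c
#include <assert.h>
#include <errno.h>
#include <linux/vt.h>
#include <stdio.h>
#include <string.h>
#include <sys/ioctl.h>

/* Hypothetical helper: ask the kernel to activate VT number `vt` and wait
 * until it is active. `console_fd` is assumed to be an open console
 * descriptor (for example /dev/tty0, which normally needs root).
 * Returns 0 on success, -1 on failure with errno preserved. */
static int switch_vt(int console_fd, int vt)
{
    assert(vt >= 1);  /* VT numbers start at 1 */

    if (ioctl(console_fd, VT_ACTIVATE, vt) < 0) {
        int saved = errno;  /* fprintf may overwrite errno */
        fprintf(stderr, "VT_ACTIVATE failed: %s\n", strerror(saved));
        errno = saved;
        return -1;
    }
    if (ioctl(console_fd, VT_WAITACTIVE, vt) < 0) {
        int saved = errno;
        fprintf(stderr, "VT_WAITACTIVE failed: %s\n", strerror(saved));
        errno = saved;
        return -1;
    }
    return 0;
}
```

Called as `switch_vt(fd, 1)` with `fd` opened on a console device, this should request the same switch as ctrl-alt-f1. With an invalid descriptor it returns -1 with errno set to EBADF and never reaches the display driver, which makes it easy to check the error path safely.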

Some things I can rule _out_:

1. "*ERROR* trying to get vblank count" - it sounds bad, but is a common warning unrelated to this issue

2. In the second backtrace there is a "*** glibc detected ***" bit - I think this portion of the crash is actually just bug #328035 (which was causing a lot of glibc X failures all over the place). It is now fixed (post-beta) in xorg-server (2:1.6.0-0ubuntu5)

3. The crash on switching user is a different bug not related to this.

affects: xserver-xorg-video-intel (Ubuntu Jaunty) → linux (Ubuntu Jaunty)
Changed in linux (Ubuntu Jaunty):
status: Triaged → New
Leann Ogasawara (leannogasawara) wrote :

Hi Robbie,

Can you tell us which version of the kernel you're running (`cat /proc/version_signature`)? After you are able to reproduce the bug, could you ssh into the machine, capture your dmesg output and attach it here? Just curious if there might be any additional error messages being logged from the kernel that could be helpful. Thanks.

Robbie Williamson (robbiew) wrote :

Hey Leann,

/proc/version_signature: Ubuntu 2.6.28-11.38-server

All I see in dmesg is this:
[drm:gm45_get_vblank_counter] *ERROR* trying to get vblank count for disabled pipe 0

I'm going to enable debug on the drm module to see if I get anything more...be back after my next crash. :/

Robbie Williamson (robbiew) wrote :

With debug enabled on drm module:

Mar 31 10:21:31 laptop kernel: [84734.181641] [drm:drm_ioctl] pid=7268, cmd=0x4004644d, nr=0x4d, dev 0xe200, auth=1
Mar 31 10:21:31 laptop kernel: [84734.181658] [drm:drm_ioctl] pid=7268, cmd=0x40086414, nr=0x14, dev 0xe200, auth=1
Mar 31 10:21:31 laptop kernel: [84734.181664] [drm:gm45_get_vblank_counter] *ERROR* trying to get vblank count for disabled pipe 0
Mar 31 10:21:31 laptop kernel: [84734.181669] [drm:drm_irq_uninstall] irq=2298
Mar 31 10:21:31 laptop kernel: [84734.181687] [drm:drm_ioctl] pid=7268, cmd=0xc0046444, nr=0x44, dev 0xe200, auth=1
Mar 31 10:21:31 laptop kernel: [84734.181692] [drm:i915_emit_irq]
Mar 31 10:21:31 laptop kernel: [84734.181695] [drm:drm_ioctl] pid=7268, cmd=0x40046445, nr=0x45, dev 0xe200, auth=1
Mar 31 10:21:31 laptop kernel: [84734.181700] [drm:i915_wait_irq] irq_nr=27262619 breadcrumb=27262619
Mar 31 10:21:31 laptop kernel: [84734.181869] [drm:drm_ioctl] pid=7268, cmd=0x40186443, nr=0x43, dev 0xe200, auth=1
Mar 31 10:21:31 laptop kernel: [84734.181873] [drm:i915_batchbuffer] i915 batchbuffer, start 787000 used 8 cliprects 0
Mar 31 10:21:31 laptop kernel: [84734.181879] [drm:drm_ioctl] pid=7268, cmd=0xc0046444, nr=0x44, dev 0xe200, auth=1
Mar 31 10:21:31 laptop kernel: [84734.181882] [drm:i915_emit_irq]
Mar 31 10:21:31 laptop kernel: [84734.181960] [drm:drm_ioctl] pid=7268, cmd=0xc0046444, nr=0x44, dev 0xe200, auth=1
Mar 31 10:21:31 laptop kernel: [84734.181964] [drm:i915_emit_irq]
Mar 31 10:21:31 laptop kernel: [84734.181968] [drm:drm_ioctl] pid=7268, cmd=0x40046445, nr=0x45, dev 0xe200, auth=1
Mar 31 10:21:31 laptop kernel: [84734.181971] [drm:i915_wait_irq] irq_nr=27262622 breadcrumb=27262622
Mar 31 10:21:31 laptop kernel: [84734.181978] [drm:drm_ioctl] pid=7268, cmd=0x4004644d, nr=0x4d, dev 0xe200, auth=1
Mar 31 10:21:31 laptop kernel: [84734.519972] [drm:drm_ioctl] pid=7268, cmd=0x4004644d, nr=0x4d, dev 0xe200, auth=1
Mar 31 10:21:31 laptop kernel: [84734.519979] [drm:drm_ioctl] pid=7268, cmd=0x40086408, nr=0x08, dev 0xe200, auth=1
Mar 31 10:21:31 laptop kernel: [84734.519985] [drm:drm_vblank_get] enabling vblank on crtc 1, ret: 0
Mar 31 10:21:31 laptop kernel: [84734.519988] [drm:drm_update_vblank_count] enabling vblank interrupts on crtc 1, missed 20
Mar 31 10:21:31 laptop kernel: [84734.550387] [drm:drm_ioctl] pid=7268, cmd=0x4004644d, nr=0x4d, dev 0xe200, auth=1
Mar 31 10:21:31 laptop kernel: [84734.762731] [drm:drm_ioctl] pid=7268, cmd=0xc0046444, nr=0x44, dev 0xe200, auth=1
Mar 31 10:21:31 laptop kernel: [84734.762736] [drm:i915_emit_irq]
Mar 31 10:21:31 laptop kernel: [84734.762740] [drm:drm_ioctl] pid=7268, cmd=0x40046445, nr=0x45, dev 0xe200, auth=1
Mar 31 10:21:31 laptop kernel: [84734.762744] [drm:i915_wait_irq] irq_nr=27262623 breadcrumb=27262623
Mar 31 10:21:31 laptop kernel: [84734.938756] [drm:drm_vm_close] 0xa278b000,0x009c7000
Mar 31 10:21:31 laptop kernel: [84734.938761] [drm:drm_vm_close] 0xa3152000,0x009c7000
Mar 31 10:21:31 laptop kernel: [84734.938764] [drm:drm_vm_close] 0xa3b19000,0x009c7000
Mar 31 10:21:31 laptop kernel: [84734.938770] [drm:drm_vm_close] 0xa49cb000,0x02000000
Mar 31 10:21:31 laptop kernel: [84734.938773] [drm:drm_vm_close] 0xa6...
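
The `cmd=` values in the drm_ioctl lines can be split into their fields by hand. The sketch below assumes the generic Linux ioctl number layout used on x86 (8 bits nr, 8 bits type, 14 bits size, 2 bits direction); `ioctl_fields`, `decode_ioctl` and `decode_ioctl_selfcheck` are hypothetical names for illustration, not part of libdrm or the kernel.

```c
#include <assert.h>
#include <stdint.h>

/* Fields of an ioctl command number, assuming the generic layout:
 * bits 0-7 nr, bits 8-15 type, bits 16-29 size, bits 30-31 direction. */
typedef struct {
    unsigned dir;   /* 0 none, 1 write (user to kernel), 2 read, 3 both */
    unsigned size;  /* size of the argument structure in bytes */
    unsigned type;  /* driver "magic" character; DRM uses 'd' (0x64) */
    unsigned nr;    /* command index, matching nr= in the log lines */
} ioctl_fields;

static ioctl_fields decode_ioctl(uint32_t cmd)
{
    ioctl_fields f;
    f.nr   = cmd & 0xffu;
    f.type = (cmd >> 8) & 0xffu;
    f.size = (cmd >> 16) & 0x3fffu;
    f.dir  = (cmd >> 30) & 0x3u;
    return f;
}

/* Self-check against one value copied from the log above:
 * cmd=0x40186443 is logged with nr=0x43, just before i915_batchbuffer. */
static void decode_ioctl_selfcheck(void)
{
    ioctl_fields f = decode_ioctl(0x40186443u);
    assert(f.type == 'd');
    assert(f.nr == 0x43);
    assert(f.size == 24);
    assert(f.dir == 1);
}
```

Under that layout, every command in the excerpt carries type 0x64 ('d'), and the decoded nr always matches the `nr=` field the kernel prints next to it.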


Changed in linux (Ubuntu Jaunty):
status: New → Triaged
Robbie Williamson (robbiew) wrote :

Did a lot of debugging yesterday using the latest libdrm git tree, and the problem appears to be in /libdrm/intel/intel_bufmgr_fake.c. Specifically around the drm_intel_fake_bo_unreference_locked function (line 873). The function has a "funky" for loop that calls itself, and I'm pretty sure it's not properly checking all the blocks it attempts to free....thus freeing up memory it shouldn't. I was actually able to switch VTs after adding some memory checks and tweaking the loop, but there's still a memory "stomping" issue in the function...as the X session will simply hang and issue a backtrace after a few minutes of activity. Will try to continue to debug today. I will also check out the git tree history of this file as well.

Steve Langasek (vorlon) wrote :

kernel team's impression is that this is a libdrm bug, so bouncing back over there; please clarify if this is incorrect.

affects: linux (Ubuntu Jaunty) → libdrm (Ubuntu Jaunty)
Martin Pitt (pitti)
Changed in libdrm (Ubuntu Jaunty):
assignee: nobody → canonical-desktop-team
Robbie Williamson (robbiew) wrote :

I "suspect" the bug is in the function below. Note the 'for' loop that calls itself...yuck!
-----------------------------------------------------------------------

static void
drm_intel_fake_bo_unreference_locked(drm_intel_bo *bo)
{
   drm_intel_bufmgr_fake *bufmgr_fake = (drm_intel_bufmgr_fake *)bo->bufmgr;
   drm_intel_bo_fake *bo_fake = (drm_intel_bo_fake *)bo;
   int i;

   if (--bo_fake->refcount == 0) {
      assert(bo_fake->map_count == 0);
      /* No remaining references, so free it */
      if (bo_fake->block){
         free_block(bufmgr_fake, bo_fake->block, 0);
      }
      free_backing_store(bo);

      for (i = 0; i < bo_fake->nr_relocs; i++){
         drm_intel_fake_bo_unreference_locked(bo_fake->relocs[i].target_buf);
      }
      DBG("drm_bo_unreference: free buf %d %s\n", bo_fake->id, bo_fake->name);

      free(bo_fake->relocs);
      free(bo);
   }
}
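
To reason about the recursion separately from libdrm, the same pattern can be reduced to a toy model. The sketch below is not libdrm code and does not reproduce or locate the suspected bug: `buf`, `buf_new`, `buf_set_target`, `buf_unref` and `bufs_freed` are hypothetical names. It assumes each relocation slot owns exactly one reference on its target, and it adds an assertion that trips on an unbalanced unreference instead of silently freeing memory twice.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for a refcounted buffer object that holds one
 * reference on each of its relocation targets. */
typedef struct buf {
    int refcount;
    int nr_relocs;
    struct buf **targets;  /* each non-NULL entry owns one reference */
} buf;

static int bufs_freed = 0;  /* instrumentation for this sketch only */

static buf *buf_new(int nr_relocs)
{
    buf *b = calloc(1, sizeof *b);
    assert(b != NULL);
    b->refcount = 1;  /* the creator's reference */
    b->nr_relocs = nr_relocs;
    if (nr_relocs > 0) {
        b->targets = calloc((size_t)nr_relocs, sizeof *b->targets);
        assert(b->targets != NULL);
    }
    return b;
}

/* Store `target` in relocation slot i of `b`, taking a reference on it. */
static void buf_set_target(buf *b, int i, buf *target)
{
    assert(i >= 0 && i < b->nr_relocs);
    target->refcount++;
    b->targets[i] = target;
}

/* Drop one reference; on the last one, release the targets recursively
 * and then free the object itself. */
static void buf_unref(buf *b)
{
    if (b == NULL)
        return;
    assert(b->refcount > 0);  /* catches an unbalanced unreference */
    if (--b->refcount > 0)
        return;
    for (int i = 0; i < b->nr_relocs; i++)
        buf_unref(b->targets[i]);  /* depth follows the relocation chain */
    free(b->targets);
    free(b);
    bufs_freed++;
}
```

In this model a parent whose two slots point at the same child frees that child exactly once, after both slot references and the creator's reference have been dropped. Any path that takes a target without incrementing its count would hit the assertion on the extra unreference.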

Changed in libdrm (Ubuntu Jaunty):
assignee: canonical-desktop-team → bryceharrington
Bryce Harrington (bryce) wrote :

Hmm, I'm curious about this. This latest info doesn't match up with the backtrace you posted earlier, which seemed to be in XKB rather than drm.

#2 0x080c86f1 in xf86ProcessActionEvent (action=ACTION_SWITCHSCREEN, arg=0xbfff7d40)
#3 0x081c590b in XkbDDXSwitchScreen (dev=0xd4096d0, key=67 'C', act=0x5606)
#4 0x081adfae in _XkbFilterSwitchScreen (xkbi=0xd3cfda8, filter=0xdcf5450, keycode=67,

Could it be possible that you have been experiencing two different crashes? I recently put in a fix for a bad XKB bug that was causing a couple other crashes (which did not seem related to your case, but who knows). Also, we have been tracking a crash in drm_intel_fake_bo_unreference_locked() on another report (bug 348428).

Perhaps the original bug is now resolved, and you are instead seeing bug 348428 now? What do you think?

Robbie Williamson (robbiew) wrote :

I think you're right. I still cannot switch VTs, but after upgrading to the latest development packages, I noticed that I no longer get the weird ASCII characters. Now I see a glimpse of the VT before it goes black, X restarts, and I'm presented with a login. Should we dup this to 348428, or vice versa?

Bryce Harrington (bryce) wrote :

Hi Robbie,

Okay sounds good. Meanwhile, I sussed out 348428. I believe it is attributable to a fix I'd put in for bug 344740, which I was a bit uncertain about at the time, although the patches looked innocuous enough; but looks like my gut was right. I've backed out one of the two patches I suspect may be the culprit. Please re-test to verify this fixed the issue; if not I will back out the other patch too. Hopefully that puts this bug to rest.

Robbie Williamson (robbiew) wrote :

Problem solved! Thanks Bryce.
