M6 LY is corrupt/crashes after suspend/resume, need AGPMode 2

Bug #248438 reported by Andrew Melo
This bug affects 1 person
Affects                          Status        Importance  Assigned to  Milestone
xserver-xorg-driver-ati          Fix Released  Medium
xserver-xorg-video-ati (Ubuntu)  Fix Released  Undecided   Unassigned

Bug Description

Binary package hint: xserver-xorg-video-ati

A recent (last month or two) driver update causes the M6 LY to behave weirdly when suspending/resuming.

On resume, graphical errors pop up. Horizontal regions (I'd say 10-20 pixels long and 1 pixel high) begin to appear when regions are redrawn (text scrolling in consoles, windows resizing). Additionally, regions appear in some windows where (moving left to right and top to bottom) the first 100 or so pixels are scrambled, followed by a black region. Resizing the windows makes the affected regions disappear, but they are replaced by new glitches. It also seems like hardware mouse acceleration is affected. The mouse moves with a 0.5 sec refresh rate after restarting.

Eventually, the glitches get more and more common until the machine locks hard (no ctrl+alt+backspace, caps lock key is unresponsive).

Xorg.log doesn't have any errors, so I'm at a loss to figure out where to look. Looking at lspci -vv, it seems like there might be a problem with the memory detection routines? There is only 16MB of video RAM on this machine, and 128MB is reported (region 0). If it were including AGP memory, it would be in powers of two and 116MB isn't.

Any suggestions? The attached log shows me booting, suspending/resuming and switching to/from a VT. When I switched to the VT, nothing came up. I tried upgrading my ati driver with a version from https://launchpad.net/~tormodvolden/+archive , to no effect.

$ lspci -vv
1:00.0 VGA compatible controller: ATI Technologies Inc Radeon Mobility M6 LY (prog-if 00 [VGA controller])
 Subsystem: Dell Unknown device 00e3
 Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop+ ParErr- Stepping+ SERR+ FastB2B-
 Status: Cap+ 66MHz+ UDF- FastB2B+ ParErr- DEVSEL=medium >TAbort- <TAbort- <MAbort- >SERR- <PERR-
 Latency: 32 (2000ns min), Cache Line Size: 32 bytes
 Interrupt: pin A routed to IRQ 11
 Region 0: Memory at e0000000 (32-bit, prefetchable) [size=128M]
 Region 1: I/O ports at c000 [size=256]
 Region 2: Memory at fcff0000 (32-bit, non-prefetchable) [size=64K]
 [virtual] Expansion ROM at fc000000 [disabled] [size=128K]
 Capabilities: [58] AGP version 2.0
  Status: RQ=48 Iso- ArqSz=0 Cal=0 SBA+ ITACoh- GART64- HTrans- 64bit- FW- AGP3- Rate=x1,x2,x4
  Command: RQ=32 ArqSz=0 Cal=0 SBA+ AGP+ GART64- 64bit- FW- Rate=x4
 Capabilities: [50] Power Management version 2
  Flags: PMEClk- DSI- D1+ D2+ AuxCurrent=0mA PME(D0-,D1-,D2-,D3hot-,D3cold-)
  Status: D0 PME-Enable- DSel=0 DScale=0 PME-

Andrew Melo (andrew-melo) wrote :

Tormod Volden (tormodvolden) wrote :

Can you please try with the device Option "BusType" "PCI"? Otherwise, see if you can adjust the AGP parameters in the BIOS. Or try Option "AGPMode" "1" (or "2"; the default is "4").
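For reference, these options go in the Device section of /etc/X11/xorg.conf. A minimal sketch, assuming the card is driven by the ati/radeon driver; the Identifier string is an arbitrary placeholder, and normally only one of the two options would be enabled at a time:

```
Section "Device"
    # Identifier is a free-form label (placeholder here)
    Identifier "Radeon Mobility M6 LY"
    Driver     "ati"
    # Force a slower AGP transfer rate than the default
    Option     "AGPMode" "2"
    # Alternative test: bypass AGP entirely
    # Option   "BusType" "PCI"
EndSection
```

X has to be restarted for the change to take effect.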

Changed in xserver-xorg-video-ati:
assignee: nobody → tormodvolden
status: New → Incomplete

Andrew Melo (andrew-melo) wrote :

Tormod:

AGPMode 2 makes it work fine. It worked in the past though, so I think this is a regression.

Changed in xserver-xorg-video-ati:
assignee: tormodvolden → nobody
status: Incomplete → Confirmed

In , Bryce Harrington (bryce) wrote :

Forwarding bug report from an Ubuntu user:
https://bugs.edge.launchpad.net/ubuntu/+source/xserver-xorg-video-ati/+bug/248438

On the Mobility M6 LY, after a suspend/resume, graphical errors start accumulating, and the machine eventually locks up.

Specifying Option "AGPMode" "2" makes the problems go away.

Original Report:
"A recent (last month or two) driver update causes the M6 LY to behave weirdly when suspending/resuming.

On resume, graphical errors pop up. Horizontal regions (I'd say 10-20 pixels long and 1 pixel high) begin to appear when regions are redrawn (text scrolling in consoles, windows resizing). Additionally, regions appear in some windows where (moving left to right and top to bottom) the first 100 or so pixels are scrambled, followed by a black region. Resizing the windows makes the affected regions disappear, but they are replaced by new glitches. It also seems like hardware mouse acceleration is affected. The mouse moves with a 0.5 sec refresh rate after restarting.

Eventually, the glitches get more and more common until the machine locks hard (no ctrl+alt+backspace, caps lock key is unresponsive).

Xorg.log doesn't have any errors, so I'm at a loss to figure out where to look. Looking at lspci -vv, it seems like there might be a problem with the memory detection routines? There is only 16MB of video RAM on this machine, and 128MB is reported (region 0). If it were including AGP memory, it would be in powers of two and 116MB isn't.

Any suggestions? The attached log shows me booting, suspending/resuming and switching to/from a VT. When I switched to the VT, nothing came up. I tried upgrading my ati driver with a version from https://launchpad.net/~tormodvolden/+archive , to no effect.

$ lspci -vv
1:00.0 VGA compatible controller: ATI Technologies Inc Radeon Mobility M6 LY (prog-if 00 [VGA controller])
"

In , Bryce Harrington (bryce) wrote :

Created an attachment (id=18585)
Xorg.0.log.old

Bryce Harrington (bryce)
Changed in xserver-xorg-video-ati:
status: Confirmed → Triaged

Bryce Harrington (bryce) wrote :

Hi Andrew,

I've forwarded your bug upstream. Can you please subscribe to this bug, in case upstream needs additional information or wishes for you to test something:
https://bugs.freedesktop.org/show_bug.cgi?id=17360

Meanwhile, can you also run a few more tests?

0. Attach the working /etc/X11/xorg.conf with the AGPMode 2 setting.

1. I assume you've already tested against Intrepid Alpha-4; if not, please do this soon.

2. After the system has locked up, can you ping the box? If so, can you ssh into it? If so, please collect a backtrace (see http://wiki.ubuntu.com/X/Backtracing for directions). If you can't, it's possible this is a kernel bug.

Changed in xserver-xorg-driver-ati:
importance: Undecided → Unknown
status: New → Unknown
Changed in xserver-xorg-driver-ati:
status: Unknown → Confirmed

In , Michel-tungstengraphics (michel-tungstengraphics) wrote :

What happens if Option "AGPMode" isn't specified at all?

In , Bryce Harrington (bryce) wrote :

Specifying no value results in the errors and lockup.

I believe the user also tested AGPMode 1 and 4 (default), and only found 2 to work for this hardware.

In , agd5f (agd5f) wrote :

Unfortunately, this is one of the problems with AGP. Certain chip/bridge combinations only work reliably at certain speeds. Most AGP bridges are busted in one way or another. We've been through this several times in the radeon driver (what to pick for the default AGP mode).

In , Bryce Harrington (bryce) wrote :

Hi Alex,

Could we set up quirks for handling it?

In , Bugzi09-fdo-tormod (bugzi09-fdo-tormod) wrote :

How does fglrx deal with this? Quirks?

In , Bryce Harrington (bryce) wrote :

Dunno how -fglrx handles it, but there are situations sort of akin to this in -intel that are handled by quirks, implemented in xf86-video-intel/src/i830_quirks.c. There is a 'quirk_flag' added to the pI830 structure, and then quirks are applied on a per-PCIID basis in the i830_quirk_list[] structure.

This approach is nice from the distro standpoint because as we run across new hardware requiring that particular tweak, we can add support for that HW with one line of (non-executable) code - which is also easy to justify to folks for backporting as well.

In , Bryce Harrington (bryce) wrote :

Created an attachment (id=18620)
Example quirk system adapted from -intel

Here's a rough sketch of what I'm thinking about (borrowed from the -intel source). A few defines would also need to be added to radeon.h, and of course at the point where the AGPMode default is selected, it'd need to test for the quirk. If this looks like something that might be acceptable, let me know and I can clean it up into a proper patch... Or maybe there's a better approach I've missed?

In , Bryce Harrington (bryce) wrote :

Created an attachment (id=18623)
Add quirk system for AGPMode

Here's a stab at adding a list 'o quirks to radeon_dri.c.

I decided to check against subsys as well as the chip and hostbridge since it sounds like these bugs are extraordinarily hardware specific.

The user can still override whatever we select as a default. So if they want to fiddle with AGP Mode settings in BIOS or whatever, they can still force it to whatever they desire in xorg.conf.

Let me know what you think of this.

Bryce Harrington (bryce) wrote :

Never mind; I have confirmation from upstream that this is a known issue.

In , Brice Goglin (brice-goglin) wrote :

If we're going there, here's what I found in Debian bugs:
* http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=467460
  Mobility 9600 M10 RV350 needs AGPMode 1
* http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=462590
  ATI Technologies Inc M18 JN [Radeon Mobility 9800] needs AGPMode 4
* http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=467235
  ATI Technologies Inc Radeon Mobility M6 LY needs AGPMode 1
* http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=461144
  R200 QM [Radeon 9100, 1002,514d] needs AGPMode 4

In , Bryce Harrington (bryce) wrote :

Here's what we'd add to the driver's quirks table:

    /* Intel 82852/82855 host bridge / Mobility 9600 M10 RV350 Needs AGPMode 1 (deb #467460) */
    { PCI_VENDOR_INTEL,0x3580, PCI_VENDOR_ATI,0x4e50, PCI_VENDOR_ACER,0x0061, 1 },

    /* Intel 82865G/PE/P DRAM Controller/Host-Hub / Radeon Mobility 9800 Needs AGPMode 4 (deb #462590) */
    { PCI_VENDOR_INTEL,0x2570, PCI_VENDOR_ATI,0x4a4e, PCI_VENDOR_DELL,0x5106, 4 },

    /* VIA VT8377 Host Bridge / R200 QM [Radeon 9100] Needs AGPMode 4 (deb #461144) */
    { 0x1106,0x3189, PCI_VENDOR_ATI,0x514d, 0x174b,0x7149, 4 },

Debian #467235 didn't have lspci output or an Xorg.0.log attached, so I dunno what the host bridge is in that case.
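To make the lookup concrete, here is a rough, self-contained sketch of how a table like the one above could be consulted. It is not the attached patch: the type and function names (agp_hw_ids, agp_quirk, agp_mode_from_quirks, agp_mode_select) and the "0 means no user setting" convention are assumptions made for illustration. Only the three table rows come from this thread, and the vendor IDs are the standard PCI assignments.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Standard PCI vendor IDs used by the table rows quoted in this thread. */
#define PCI_VENDOR_INTEL 0x8086
#define PCI_VENDOR_ATI   0x1002
#define PCI_VENDOR_ACER  0x1025
#define PCI_VENDOR_DELL  0x1028

/* Hypothetical: the three ID pairs a quirk is keyed on. */
typedef struct {
    uint16_t bridge_vendor, bridge_device;   /* AGP host bridge */
    uint16_t chip_vendor,   chip_device;     /* graphics chip */
    uint16_t subsys_vendor, subsys_device;   /* board/laptop subsystem */
} agp_hw_ids;

/* Hypothetical: one table row, i.e. the IDs plus the AGP mode to default to. */
typedef struct {
    agp_hw_ids ids;
    int        agp_mode;
} agp_quirk;

/* Rows copied from the comment above; pure data, no executable logic. */
static const agp_quirk agp_quirk_list[] = {
    { { PCI_VENDOR_INTEL, 0x3580, PCI_VENDOR_ATI, 0x4e50, PCI_VENDOR_ACER, 0x0061 }, 1 },
    { { PCI_VENDOR_INTEL, 0x2570, PCI_VENDOR_ATI, 0x4a4e, PCI_VENDOR_DELL, 0x5106 }, 4 },
    { { 0x1106,           0x3189, PCI_VENDOR_ATI, 0x514d, 0x174b,          0x7149 }, 4 },
};

/* All six IDs must agree, since these bugs are very hardware specific. */
static int ids_match(const agp_hw_ids *a, const agp_hw_ids *b)
{
    return a->bridge_vendor == b->bridge_vendor &&
           a->bridge_device == b->bridge_device &&
           a->chip_vendor   == b->chip_vendor   &&
           a->chip_device   == b->chip_device   &&
           a->subsys_vendor == b->subsys_vendor &&
           a->subsys_device == b->subsys_device;
}

/* Return the quirked AGP mode for this hardware, or fallback if none matches. */
int agp_mode_from_quirks(const agp_hw_ids *hw, int fallback)
{
    size_t n = sizeof(agp_quirk_list) / sizeof(agp_quirk_list[0]);

    assert(hw != NULL);
    for (size_t i = 0; i < n; i++) {
        if (ids_match(&agp_quirk_list[i].ids, hw))
            return agp_quirk_list[i].agp_mode;
    }
    return fallback;
}

/* A user-supplied mode (user_mode > 0, e.g. from xorg.conf) always wins;
 * otherwise the quirk table chooses, and failing that the driver default. */
int agp_mode_select(const agp_hw_ids *hw, int user_mode, int fallback)
{
    if (user_mode > 0)
        return user_mode;
    return agp_mode_from_quirks(hw, fallback);
}
```

With a fallback of 4, the first row's hardware would get mode 1 when nothing is set in xorg.conf, and whatever the user sets when something is. Supporting another machine means adding one more row of data.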

In , Brice Goglin (brice-goglin) wrote :

<email address hidden> wrote:
> --- Comment #11 from Bryce Harrington <email address hidden> 2008-09-02 14:43:54 PST ---
>
> Debian #467235 didn't have lspci output or an Xorg.0.log attached, so I dunno
> what the host bridge is in that case.
>

Should be
http://bugs.debian.org/cgi-bin/bugreport.cgi?msg=32;filename=Xorg.0.log;att=1;bug=467235

Brice

In , Bryce Harrington (bryce) wrote :

Created an attachment (id=18681)
Update to include a couple more systems

Launchpad Janitor (janitor) wrote :

This bug was fixed in the package xserver-xorg-video-ati - 1:6.9.0+git20080826.a3cc1d7a-2ubuntu1

---------------
xserver-xorg-video-ati (1:6.9.0+git20080826.a3cc1d7a-2ubuntu1) intrepid; urgency=low

  * debian/control: Reduce xorg-server Build-Depends version to 1.4.99 (FTBS)
  * 100_quirk_system.patch: Adds a quirk system for setting specific
    AGPMode values for particular hardware combinations. See
    https://wiki.ubuntu.com/X/Quirks for details about this.
    (LP: #248438)
  * Modify Maintainer value to match the DebianMaintainerField
    specification.

 -- Bryce Harrington <email address hidden> Fri, 05 Sep 2008 18:47:50 -0700

Changed in xserver-xorg-video-ati:
status: Triaged → Fix Released

Bryce Harrington (bryce) wrote :

I *think* this should fix it. Please test, and reopen the bug if the issue remains.

Andrew Melo (andrew-melo) wrote :

I'm going to upgrade to Intrepid tonight. Do you still need the backtraces if the server works okay? Do you want me to test with AGPMode=4 again?

Bryce Harrington (bryce) wrote : Re: [Bug 248438] Re: M6 LY is corrupt/crashes after suspend/resume, need AGPMode 2

On Sat, Sep 06, 2008 at 04:55:07AM -0000, Andrew Melo wrote:
> I'm going to upgrade to Intrepid tonight. Do you still need the
> backtraces if the server works okay? Do you want me to test with
> AGPMode=4 again?

Nope, no need for backtraces if it all works okay. Testing with
AGPMode=4 would be nice though.

Bryce

In , agd5f (agd5f) wrote :

committed: 937b7ac2a259cf504a19dcf62a58b1db1afb8eb9

Thanks!

Changed in xserver-xorg-driver-ati:
status: Confirmed → Fix Released
Changed in xserver-xorg-driver-ati:
importance: Unknown → Medium
Changed in xserver-xorg-driver-ati:
importance: Medium → Unknown
Changed in xserver-xorg-driver-ati:
importance: Unknown → Medium