Bug on amd64 makes X server unusable in Edgy (CRITICAL)

Bug #66500 reported by jmspeex
Affects | Status | Importance | Assigned to | Milestone
linux-source-2.6.17 (Ubuntu) | Fix Released | High | Unassigned |
xorg-server (Ubuntu) | Invalid | Undecided | Unassigned |

Bug Description

I've recently attempted to install amd64 Edgy on my laptop and I find it to be totally unusable in X (GNOME or KDE). The X server will hang, crash, etc. within minutes (if not seconds) of using it, usually forcing a reboot. The 32-bit version seems to work fine (I haven't tested it much, but if it were *that* broken, I would have noticed). By default, Edgy uses the i810 driver, but I also tried "vesa" and "intel" and neither worked any better (actually, "intel" didn't show anything on the screen).

Hardware:
Dell Latitude D820 laptop
CPU: Core 2 Duo 2 GHz, 2 GB RAM
Intel embedded 950 graphics

Software:
Edgy Beta + most recent updates (2006-10-16).
No binary-only video card driver (or similar crap)

Extra info:
Not sure if it's related (could be), but I've noticed the following things as well:
- Text console seems to work mostly fine, but I still had the ssh server crashing a few times
- /proc/cpuinfo shows both CPU cores in 64-bit mode, but the kernel seems to think they're running at 1 GHz (despite reporting a bogomips of 4000).
- Sometimes when I type, I get repeated keys

jmspeex (jean-marc-valin) wrote :

I have done some experiments with BIOS settings. Basically, the problem seems to disappear when I disable the second core. If any Ubuntu developer is interested in investigating the bug, please let me know as soon as possible because I'll otherwise be reformatting the partition as soon as FC6 is out (i.e. in a few days).

jmspeex (jean-marc-valin) wrote :

Sorry, the problem doesn't disappear when I disable the second core, it just takes 20 minutes for the machine to crash instead of 1-2 minutes. Oh, and BTW the strange info in /proc/cpuinfo was simply due to cpufreq changing the CPU frequency.
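
For anyone comparing their own readings, here is a minimal Python sketch that pulls the "cpu MHz" and "bogomips" fields out of /proc/cpuinfo. The function name `parse_cpuinfo` and the `SAMPLE` text are illustrative assumptions, not output captured from the affected machine; the sample only mirrors the 1 GHz / 4000 bogomips figures mentioned in the report. With cpufreq active, a "cpu MHz" value below the nominal clock is expected and is not a fault in itself.

```python
# Illustrative sample, NOT real output from the affected laptop.
SAMPLE = (
    "processor\t: 0\n"
    "cpu MHz\t\t: 1000.000\n"
    "bogomips\t: 4000.00\n"
    "\n"
    "processor\t: 1\n"
    "cpu MHz\t\t: 1000.000\n"
    "bogomips\t: 4000.00\n"
)


def parse_cpuinfo(text):
    """Return one dict per logical CPU with processor, cpu MHz and bogomips."""
    cpus = []
    current = {}
    for line in text.splitlines():
        if not line.strip():
            # A blank line ends the block for one logical CPU.
            if current:
                cpus.append(current)
                current = {}
            continue
        key, _, value = line.partition(":")
        key = key.strip()
        if key in ("processor", "cpu MHz", "bogomips"):
            current[key] = float(value.strip())
    if current:
        cpus.append(current)
    return cpus


if __name__ == "__main__":
    try:
        with open("/proc/cpuinfo") as f:
            text = f.read()
    except OSError:
        text = SAMPLE  # fall back to the illustrative sample off Linux
    for cpu in parse_cpuinfo(text):
        print(cpu)
```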

jmspeex (jean-marc-valin) wrote :

Unsure which package

jmspeex (jean-marc-valin) wrote :

Just summarising the info I have so far. I no longer think it's an X.Org bug. It *looks* like it could be a kernel problem because it affects many things. The result is that Edgy (beta and RC) is totally unusable on my machine, crashing/hanging after a few minutes. Other than the crashes, symptoms include:
- Sometimes hangs for 1-2 minutes during boot without any activity
- Sometimes keys I press on the keyboard get repeated
- gnome-panel animations are *very* slow.
- Most GUI applications freeze when I click on a button
- Some applications (e.g. ssh server) tend to crash very often
- Ping reports "Warning: time of day goes back, taking countermeasures"
- Similar time problems reported by the window manager (don't have the log anymore)

The above info makes it look like it could be related to gettimeofday() or something like that. Some more observations:
- Disabling one of the CPU cores in the BIOS makes the problem less frequent, but it doesn't go away (machine still crashes)
- I tried both the i810 and the vesa drivers and there's no difference
- The installer/LiveCD behaves in the same way as the installed distro
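
The gettimeofday() suspicion above could be checked from user space with something like the following Python sketch. It assumes the fault shows up as consecutive wall-clock readings that decrease; `count_backward_steps` and `sample_wall_clock` are hypothetical helper names, and the sample count is arbitrary. This is a rough probe, not a diagnosis: a legitimate clock adjustment (for example by NTP) can also step the wall clock backwards once in a while.

```python
import time


def count_backward_steps(samples):
    """Count readings that are lower than the reading just before them."""
    return sum(1 for prev, cur in zip(samples, samples[1:]) if cur < prev)


def sample_wall_clock(n=100000):
    """Take n back-to-back wall-clock readings with time.time()."""
    return [time.time() for _ in range(n)]


if __name__ == "__main__":
    steps = count_backward_steps(sample_wall_clock())
    print(f"wall clock stepped backwards {steps} time(s)")
```

A count well above zero on repeated runs would be consistent with the ping warning quoted in the list above.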

So far, I tested the following distros:
Dapper i386: Runs OK
Edgy i386: Runs OK
Dapper amd64: Runs OK
Edgy beta amd64: BUGGY
Edgy RC amd64: BUGGY
Debian Etch testing AMD64: BUGGY

My setup is:
Dell Latitude D820 laptop
CPU: Core 2 Duo 2 GHz, 2 GB RAM
Intel embedded 950 graphics
No binary-only video card driver (or similar crap)

jmspeex (jean-marc-valin) wrote :

I confirm that the final release of Edgy for AMD64 is still totally unusable on my machine. On the other hand, the 32-bit version doesn't have any of these problems (have been running it for one week).

jmspeex (jean-marc-valin) wrote :

It doesn't seem like it's an X server bug, please stop assigning it that way.

jmspeex (jean-marc-valin) wrote :

Some additional info:
Dapper still crashes after a while. The difference with Edgy is that Dapper is actually behaving normally before it crashes. I tried the Dapper kernel on Edgy, and it works better -- about the same as normal Dapper -- so it's probably a kernel bug. I also tried FC6 (2.6.18 kernel), which behaves about the same (weird X bugs) as Edgy. Still running 32-bit Edgy, which is still mostly fine (no bug making it unusable).

jmspeex (jean-marc-valin) wrote :

Turns out the notsc kernel option solves the strange X problems. Machine still crashes on any ACPI event though.

jmspeex (jean-marc-valin) wrote :

The ACPI problem can be worked around by also using no_timer_check. I have checked newer kernels (2.6.18 and 2.6.19-rc6) and none of those seems to be affected. I'd recommend either back-porting the relevant patch or adding "notsc no_timer_check" to the default boot options (is that possible?). From my understanding of the problem (based on all the posts I found on the net), all Dell laptops with a Core 2 Duo CPU are affected, and it seems likely that this extends beyond Dell. The problem happens only with the AMD64 version, not the i386 version.
Relevant posts (without which I wouldn't be able to run Edgy):
http://ubuntuforums.org/showthread.php?t=285683
http://ubuntuforums.org/showthread.php?s=bc6fa3c48c258f8d29afa7d90d7d8acc&t=296354
http://doc.gwos.org/index.php/Double_Clock_Speed
I think there should be enough info in those posts to confirm that the bug is real and affects lots of people.
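
To confirm that the workaround is actually in effect on a running system, a small Python sketch like this one can compare the kernel command line in /proc/cmdline against the two options named above. The helper name `missing_options` is an assumption made for illustration; how the options get added to the boot loader configuration is not covered here.

```python
WORKAROUND_OPTIONS = ("notsc", "no_timer_check")


def missing_options(cmdline, wanted=WORKAROUND_OPTIONS):
    """Return the wanted boot options that are absent from a kernel command line."""
    present = set(cmdline.split())
    return [opt for opt in wanted if opt not in present]


if __name__ == "__main__":
    try:
        with open("/proc/cmdline") as f:
            cmdline = f.read()
    except OSError:
        cmdline = ""  # not on Linux, or /proc is unavailable
    missing = missing_options(cmdline)
    if missing:
        print("missing boot options:", " ".join(missing))
    else:
        print("both workaround options are active")
```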

Bryce Harrington (bryce) wrote :

De-assigning from X, as per the bug reporter

Changed in xorg-server:
status: Unconfirmed → Rejected
Jérôme Guelfucci (jerome-guelfucci-deactivatedaccount) wrote :

Do you still have this issue with the latest release of Ubuntu?

Changed in linux-source-2.6.17:
importance: Undecided → High
status: Unconfirmed → Needs Info
jmspeex (jean-marc-valin) wrote :

Feisty boots fine on a Core2 machine. It's only Edgy and Dapper (and probably earlier) that are completely broken.

Jérôme Guelfucci (jerome-guelfucci-deactivatedaccount) wrote :

Ok, I'm marking this as fixed. Thanks for answering.

Changed in linux-source-2.6.17:
status: Needs Info → Fix Released