bad irq stuff (irqpoll) happening on asus p4p800se after upgrade to 2.6.20-16

Bug #117447 reported by Simon Oosthoek
Affects: linux-source-2.6.20 (Ubuntu)
Status: Invalid
Importance: High
Assigned to: Unassigned
Milestone: (none listed)

Bug Description

Binary package hint: kernel-image-2.6.20-16-386-di

I'm not sure exactly what happened, but I think I had the following situation happening:
Long uptime of Kubuntu Feisty with regular updates and no rebooting, so I skipped the 2.6.20-15 upgrade and went straight from 2.6.17 to 2.6.20-16. I'm using the nvidia restricted module.

This is my "production" machine at home, it is always on and serves everything, including the music using slimserver (external package from slimdevices).

After the reboot, slimserver had trouble serving up the music, which is on a SATA disk (I have two SATA disks and one PATA; the PATA holds root and home).

Then I had trouble finding the problem, thought it was a disk failing. I was very confused by the fact that my sd* disks were now gone and I only had hd* disks.

Rebooted while reducing the disk count, still errors.

A couple of reboots later, I started to suspect my DVD burner (on the secondary PATA cable), so I removed that (and cleaned the case in the process). Removed all the HDDs as well.

With only the DVD burner and the Kubuntu Edgy install DVD, it boots but can't mount the install "cdrom". So I figure the burner is broken somehow. Remove that, boot with only my PATA disk and the Kubuntu boot system.

2.6.20-16 still gives problems; 2.6.17 is working, but with no graphics (the nvidia driver is no longer there for this kernel). All the disks are working now, no irqpoll messages.

So I try 2.6.20-15, which also works OK, it seems.

I have no idea what happened, but I suspect something changed in the irq handling code in the kernel between 2.6.20-15 and -16.

If more info is needed, I'll try to help, but I can't go around rebooting this machine constantly.

Revision history for this message
Simon Oosthoek (simon-margo) wrote :

Just added the dvdburner again, and it's working fine.
kernel is: 2.6.20-15-386

I did notice that the -16 kernel also had hyperthreading turned on (again).

$ lspci
00:00.0 Host bridge: Intel Corporation 82865G/PE/P DRAM Controller/Host-Hub Interface (rev 02)
00:01.0 PCI bridge: Intel Corporation 82865G/PE/P PCI to AGP Controller (rev 02)
00:1d.0 USB Controller: Intel Corporation 82801EB/ER (ICH5/ICH5R) USB UHCI Controller #1 (rev 02)
00:1d.1 USB Controller: Intel Corporation 82801EB/ER (ICH5/ICH5R) USB UHCI Controller #2 (rev 02)
00:1d.2 USB Controller: Intel Corporation 82801EB/ER (ICH5/ICH5R) USB UHCI Controller #3 (rev 02)
00:1d.3 USB Controller: Intel Corporation 82801EB/ER (ICH5/ICH5R) USB UHCI Controller #4 (rev 02)
00:1d.7 USB Controller: Intel Corporation 82801EB/ER (ICH5/ICH5R) USB2 EHCI Controller (rev 02)
00:1e.0 PCI bridge: Intel Corporation 82801 PCI Bridge (rev c2)
00:1f.0 ISA bridge: Intel Corporation 82801EB/ER (ICH5/ICH5R) LPC Interface Bridge (rev 02)
00:1f.1 IDE interface: Intel Corporation 82801EB/ER (ICH5/ICH5R) IDE Controller (rev 02)
00:1f.2 IDE interface: Intel Corporation 82801EB (ICH5) SATA Controller (rev 02)
00:1f.3 SMBus: Intel Corporation 82801EB/ER (ICH5/ICH5R) SMBus Controller (rev 02)
00:1f.5 Multimedia audio controller: Intel Corporation 82801EB/ER (ICH5/ICH5R) AC'97 Audio Controller (rev 02)
01:00.0 VGA compatible controller: nVidia Corporation NV34 [GeForce FX 5200] (rev a1)
02:05.0 Ethernet controller: Marvell Technology Group Ltd. 88E8001 Gigabit Ethernet Controller (rev 13)

Is my hardware so obscure?

Cheers

Simon

Revision history for this message
Sitsofe Wheeler (sitsofe) wrote :

Thanks for your bug report.

Simon:
Looks like an issue has cropped up with the sata drivers for your motherboard in 2.6.20-16. See Bug #116996 for more details...

Revision history for this message
LarsBjerregaard (lars-rubyglow) wrote :

The unofficial survey going on here: http://ubuntuforums.org/showthread.php?t=456662&page=20 (starting at post #195) might yield some clues, and it would seem there are a lot of folks with Intel ICH4 and ICH5 there.

Revision history for this message
Andrew Waldram (andrew-waldram) wrote :

This poll is worrying me... All it proves is that a lot of people have ICH4 and ICH5 chipset motherboards, nothing else.

for the record
00:1f.1 IDE interface: Intel Corporation 82801FB/FBM/FR/FW/FRW (ICH6 Family) IDE Controller (rev 03)

and yes I do have the rename issue

Revision history for this message
Andrew Waldram (andrew-waldram) wrote :

I think it's more likely that any chipset that was using libata is affected.

I've not seen any sata or libata (ata_piix) users who haven't had issues.

Revision history for this message
Sitsofe Wheeler (sitsofe) wrote :

Andrew:
I have - me! The problem appears to be ICH4/ICH5 users (and if your comment is right ICH6 too). I have an ICH7 and it's still using ata_piix. It would be nice to see a different _ICH6_ owner confirm things over in Bug #116996 so we are clear on exactly which controllers are affected by the ata_piix disabling. While I'm here can you indicate whether you suffer the IRQ timeout/disabled issue too?
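For anyone wanting to report which controller generation they have, the following is a minimal sketch (not an official tool) that pulls the ICH generation out of `lspci` output. It assumes the controllers are listed with the class string "IDE interface:" and an "(ICHn ...)" tag, as in the `lspci` lines quoted in this thread; the function name `storage_controllers` is made up for illustration, and other lspci versions may label SATA controllers differently.

```python
import re

# Matches the ICH generation in lspci descriptions such as
# "(ICH5/ICH5R)", "(ICH5)" or "(ICH6 Family)".
ICH_PATTERN = re.compile(r"\(ICH(\d+)")


def storage_controllers(lspci_output):
    """Return (pci_address, ich_generation) for each "IDE interface" line.

    ich_generation is None when the description carries no ICH tag.
    """
    found = []
    for line in lspci_output.splitlines():
        # lspci prints "<address> <class>: <description>".
        address, _, description = line.strip().partition(" ")
        if not description.startswith("IDE interface:"):
            continue
        match = ICH_PATTERN.search(description)
        found.append((address, int(match.group(1)) if match else None))
    return found


# Lines copied from the lspci output posted in this bug report.
SAMPLE = """\
00:1f.1 IDE interface: Intel Corporation 82801EB/ER (ICH5/ICH5R) IDE Controller (rev 02)
00:1f.2 IDE interface: Intel Corporation 82801EB (ICH5) SATA Controller (rev 02)
00:1f.1 IDE interface: Intel Corporation 82801FB/FBM/FR/FW/FRW (ICH6 Family) IDE Controller (rev 03)
"""

if __name__ == "__main__":
    for address, generation in storage_controllers(SAMPLE):
        print(address, "ICH%d" % generation if generation else "unknown")
```

In practice the sample text would be replaced by the real output of `lspci` on the affected machine.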

Revision history for this message
Sitsofe Wheeler (sitsofe) wrote :

The "broken cdrom" issue has been spun off into Bug #117413 .

Revision history for this message
Andrew Waldram (andrew-waldram) wrote :

Sitsofe:

No, my only issue was the renaming of my drives.

The machine is perfectly stable on 2.6.20-16.28, though I did get the ghost CD-ROM (fstab entry, as the drive moved from hdc to hdb).

What would you like as proof that my ICH6 was affected?

It's difficult to know which bug to post in, as they have been amended to cover ICH4/5, which isn't me.

Revision history for this message
Sitsofe Wheeler (sitsofe) wrote :

Andrew:
You've done all you can - you've followed up and posted in all the right bugs. We just need a second ICH6 to come forward and say "me too" so that Bug #116996 can be amended (and possibly to split into a new bug because some of the folks in there are suffering from IRQ issues in addition to the "great rename").

Revision history for this message
LarsBjerregaard (lars-rubyglow) wrote :

I think klmonz, in the appended bug description at the top of bug https://bugs.launchpad.net/ubuntu/+source/linux-source-2.6.20/+bug/116996, hits the nail on the head.

Reading through the extensive thread in http://ubuntuforums.org/showthread.php?t=456662, it seems obvious that Debian bug http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=419458 describes the core problem with disks, which is the same as *this* bug.

This is FIXED in Debian! I'm sorry, but I have to say that I find it disheartening that bug https://bugs.launchpad.net/ubuntu/+source/linux-source-2.6.20/+bug/116996 is still unconfirmed+undecided, bug https://bugs.launchpad.net/ubuntu/+source/linux-source-2.6.20/+bug/117447 is unconfirmed+undecided, and bug https://bugs.launchpad.net/ubuntu/+source/linux-source-2.6.20/+bug/117314 is unconfirmed, though recognized as importance high.

Please devs! You are probably overburdened, that's understood. BUT... this bug is wrecking the systems of a hell of a lot of users, and is far worse than the X-oops update you released some time ago. This one is GRAVE, as the Debian bug correctly states. Please, please... fix this. Thank you.

Changed in linux-source-2.6.20:
assignee: nobody → phillip-lougher
status: Unconfirmed → In Progress
importance: Undecided → High
Revision history for this message
Simon Oosthoek (simon-margo) wrote :

Nearly a month on, I'm still on 2.6.20-15...

At least the priority is now high :-)

/Simon

Revision history for this message
Sitsofe Wheeler (sitsofe) wrote :

Simon:
A new kernel (2.6.20-16.29) reverting the piix changes was released on 08 June 2007: http://www.ubuntu.com/usn/usn-470-1 . Additionally, a new wiki page describing how references to partitions should use UUIDs/labels has recently appeared: https://wiki.ubuntu.com/UsingUUID . Are things any better in 2.6.20-16.29?

Changed in linux-source-2.6.20:
status: In Progress → Incomplete
Revision history for this message
Simon Oosthoek (simon-margo) wrote :

I did notice and install a kernel-image update. For some reason I can't find the specific version of the currently installed kernel-image package (I don't want to spend time finding out how). Anyway, the update didn't fix my problem, but that may have been because I was not using labels but /dev/sdb1 directly. I'll check later whether this will fix it.

(rantmode: regardless, though I understand the need for independence from disk order, it does make things more complicated when editing the fstab directly. And the GUI tool is unclear in its use of the term "enable", and didn't work for me, which is why I was editing the table in the first place.)
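As a rough illustration of the check being discussed, here is a minimal sketch that flags fstab entries still referring to disks by raw device name (hd* or sd*) instead of UUID= or LABEL=. The helper name `raw_device_entries` and the sample fstab contents are hypothetical and not taken from the reporter's machine.

```python
import re

# Device names that depend on driver and probe order, e.g. /dev/hdc or /dev/sdb1.
RAW_DEVICE = re.compile(r"^/dev/(hd|sd)[a-z]+\d*$")


def raw_device_entries(fstab_text):
    """Return (device, mount_point) pairs that use a raw hd*/sd* name."""
    entries = []
    for line in fstab_text.splitlines():
        line = line.strip()
        # Skip blank lines and comments.
        if not line or line.startswith("#"):
            continue
        fields = line.split()
        if len(fields) >= 2 and RAW_DEVICE.match(fields[0]):
            entries.append((fields[0], fields[1]))
    return entries


# Hypothetical fstab contents, for illustration only.
SAMPLE_FSTAB = """\
# <file system> <mount point> <type> <options> <dump> <pass>
proc            /proc         proc   defaults  0 0
UUID=0a1b2c3d-1111-2222-3333-444455556666 / ext3 defaults 0 1
/dev/sdb1       /media/music  ext3   defaults  0 2
/dev/hdc        /media/cdrom0 udf,iso9660 user,noauto 0 0
"""

if __name__ == "__main__":
    for device, mount_point in raw_device_entries(SAMPLE_FSTAB):
        print("%s mounted at %s is referenced by device name" % (device, mount_point))
```

Entries flagged this way are the ones that can break when a kernel update renames hd* devices to sd* or back. The UUID to use in their place can be looked up with `blkid` or by listing /dev/disk/by-uuid.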

Revision history for this message
Simon Oosthoek (simon-margo) wrote :

After changing the kernel, fixing the nvidia driver problems I got, and finding the right UUIDs, I figure it all works now.

Apologies for the implied accusations of tardiness and for my laziness after the .29 update.

I hope the others who posted here also have their problems fixed now.

/Simon

Revision history for this message
Sitsofe Wheeler (sitsofe) wrote :

Simon:
Lars has already posted a response saying the problem is solved at the bottom of Bug #117314 . Andrew is a stalwart of the "great rename" and has already said his particular hardware was OK in https://bugs.launchpad.net/ubuntu/+source/linux-source-2.6.20/+bug/116996/comments/75 .

Thanks for taking the time to report this bug and helping to make Ubuntu better. This particular bug has already been reported and is a duplicate of bug #117314 and is being marked as such. Please feel free to continue to report any other bugs you may find.

Changed in linux-source-2.6.20:
status: Incomplete → Invalid