6.06LTS: Computer locks up with nvidia proprietary driver on SMP system

Bug #132400 reported by Erik Devriendt
Affects: linux-restricted-modules-2.6.15 (Ubuntu)
Status: Won't Fix
Importance: Undecided
Assigned to: Unassigned
Milestone: (none set)

Bug Description

Binary package hint: nvidia-glx

OS: Ubuntu 6.06LTS up to date with the current repositories
Kernel: 2.6.15-28-686 (SMP)
Computer: Fujitsu-Siemens R540 with Dual core Xeon processor
Video adapter: Nvidia Quadro NVS 440

This video adapter is supported only by the vesa driver (single screen) and the proprietary nvidia driver (all 4 screens).
With the nvidia-glx 1.0.8776 package from 6.06LTS, the GUI locks up after a random amount of time. SSH connections to the
PC still work, but restarting X sometimes makes the PC lock up completely.
The GUI lockup becomes more likely after an OpenGL screen saver has been active for some time.

/var/log/kern.log contains the following lines just before the lockup:
Aug 13 14:02:12 ubuntu kernel: [17180850.340000] irq 185: nobody cared (try booting with the "irqpoll" option)
Aug 13 14:02:12 ubuntu kernel: [17180850.340000] [__report_bad_irq+42/160] __report_bad_irq+0x2a/0xa0
Aug 13 14:02:12 ubuntu kernel: [17180850.340000] [handle_IRQ_event+61/112] handle_IRQ_event+0x3d/0x70
Aug 13 14:02:12 ubuntu kernel: [17180850.340000] [note_interrupt+135/240] note_interrupt+0x87/0xf0
Aug 13 14:02:12 ubuntu kernel: [17180850.340000] [__do_IRQ+253/272] __do_IRQ+0xfd/0x110
Aug 13 14:02:12 ubuntu kernel: [17180850.340000] [do_IRQ+25/48] do_IRQ+0x19/0x30
Aug 13 14:02:12 ubuntu kernel: [17180850.340000] [common_interrupt+26/32] common_interrupt+0x1a/0x20
Aug 13 14:02:12 ubuntu kernel: [17180850.340000] [mwait_idle+42/64] mwait_idle+0x2a/0x40
Aug 13 14:02:12 ubuntu kernel: [17180850.340000] [cpu_idle+111/192] cpu_idle+0x6f/0xc0
Aug 13 14:02:12 ubuntu kernel: [17180850.340000] [start_kernel+415/512] start_kernel+0x19f/0x200
Aug 13 14:02:12 ubuntu kernel: [17180850.340000] [unknown_bootoption+0/496] unknown_bootoption+0x0/0x1f0
Aug 13 14:02:12 ubuntu kernel: [17180850.340000] handlers:
Aug 13 14:02:12 ubuntu kernel: [17180850.340000] [pg0+944546240/1069167616] (usb_hcd_irq+0x0/0x70 [usbcore])
Aug 13 14:02:12 ubuntu kernel: [17180850.340000] [pg0+953752601/1069167616] (nv_kern_isr+0x0/0x64 [nvidia])
Aug 13 14:02:12 ubuntu kernel: [17180850.340000] Disabling IRQ #185
Aug 13 14:02:16 ubuntu kernel: [17180854.276000] NVRM: Xid (000a:00): 8, Channel 00000020
Aug 13 14:02:19 ubuntu kernel: [17180857.276000] NVRM: Xid (000a:00): 16, Head 00000000 Count 000011b2
Aug 13 14:02:19 ubuntu kernel: [17180857.276000] NVRM: Xid (000a:00): 16, Head 00000001 Count 000011ac
Aug 13 14:02:28 ubuntu kernel: [17180866.276000] NVRM: Xid (000a:00): 16, Head 00000000 Count 000011b3
Aug 13 14:02:28 ubuntu kernel: [17180866.276000] NVRM: Xid (000a:00): 16, Head 00000001 Count 000011ad
Aug 13 14:02:36 ubuntu kernel: [17180874.276000] NVRM: Xid (000a:00): 16, Head 00000000 Count 000011b4
Aug 13 14:02:36 ubuntu kernel: [17180874.276000] NVRM: Xid (000a:00): 16, Head 00000001 Count 000011ae
Aug 13 14:02:44 ubuntu kernel: [17180882.276000] NVRM: Xid (000a:00): 16, Head 00000000 Count 000011b5
Aug 13 14:02:44 ubuntu kernel: [17180882.276000] NVRM: Xid (000a:00): 16, Head 00000001 Count 000011af
Aug 13 14:02:52 ubuntu kernel: [17180890.276000] NVRM: Xid (000a:00): 16, Head 00000000 Count 000011b6
Aug 13 14:02:52 ubuntu kernel: [17180890.276000] NVRM: Xid (000a:00): 16, Head 00000001 Count 000011b0
Aug 13 14:03:00 ubuntu kernel: [17180898.276000] NVRM: Xid (000a:00): 16, Head 00000000 Count 000011b7
Aug 13 14:03:00 ubuntu kernel: [17180898.276000] NVRM: Xid (000a:00): 16, Head 00000001 Count 000011b1
Aug 13 14:03:08 ubuntu kernel: [17180906.292000] NVRM: Xid (000a:00): 16, Head 00000000 Count 000011b8
Aug 13 14:03:08 ubuntu kernel: [17180906.296000] NVRM: Xid (000a:00): 16, Head 00000001 Count 000011b2
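
The pattern in the excerpt above (an "irq N: nobody cared" report, a "handlers:" list, then "Disabling IRQ #N") can be picked out of a longer kern.log mechanically. The following is only a rough sketch: the function name `disabled_irqs` and the regular expressions are made up for illustration and assume the log lines look exactly like the ones quoted in this report.

```python
import re

# Patterns modelled on the kern.log lines quoted in this report (assumption:
# other kernel versions may word these messages differently).
NOBODY_CARED = re.compile(r"irq (\d+): nobody cared")
HANDLER = re.compile(r"\((\w+)\+0x[0-9a-f]+/0x[0-9a-f]+ \[(\w+)\]\)")
DISABLED = re.compile(r"Disabling IRQ #(\d+)")


def disabled_irqs(log_text):
    """Map each disabled IRQ to the (function, module) handlers listed for it."""
    result = {}
    current = None        # IRQ number of the report currently being read
    handlers = []
    in_handlers = False   # True once the "handlers:" line has been seen
    for line in log_text.splitlines():
        m = NOBODY_CARED.search(line)
        if m:
            current, handlers, in_handlers = int(m.group(1)), [], False
            continue
        if current is None:
            continue
        if line.rstrip().endswith("handlers:"):
            in_handlers = True
            continue
        m = DISABLED.search(line)
        if m and int(m.group(1)) == current:
            result[current] = handlers
            current = None
            continue
        if in_handlers:
            m = HANDLER.search(line)
            if m:
                handlers.append((m.group(1), m.group(2)))
    return result


# A few lines copied from the log above, used as sample input.
SAMPLE_LOG = """
Aug 13 14:02:12 ubuntu kernel: [17180850.340000] irq 185: nobody cared (try booting with the "irqpoll" option)
Aug 13 14:02:12 ubuntu kernel: [17180850.340000] [__report_bad_irq+42/160] __report_bad_irq+0x2a/0xa0
Aug 13 14:02:12 ubuntu kernel: [17180850.340000] handlers:
Aug 13 14:02:12 ubuntu kernel: [17180850.340000] [pg0+944546240/1069167616] (usb_hcd_irq+0x0/0x70 [usbcore])
Aug 13 14:02:12 ubuntu kernel: [17180850.340000] [pg0+953752601/1069167616] (nv_kern_isr+0x0/0x64 [nvidia])
Aug 13 14:02:12 ubuntu kernel: [17180850.340000] Disabling IRQ #185
"""
```

For the sample lines, `disabled_irqs(SAMPLE_LOG)` reports IRQ 185 with the usbcore and nvidia handlers, which matches the shared interrupt line described in this report.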

The nvidia video adapter shares its IRQs (185 and 193) with the USB host controller drivers.
/proc/interrupts:
           CPU0       CPU1
  0:   18436501         54    IO-APIC-edge   timer
  1:          1          8    IO-APIC-edge   i8042
  8:         65          2    IO-APIC-edge   rtc
  9:          0          0    IO-APIC-level  acpi
 12:     140225          3    IO-APIC-edge   i8042
 14:    1271342          1    IO-APIC-edge   ide0
 90:      75913          1    IO-APIC-level  libata, uhci_hcd:usb2
 98:    1393683          1    IO-APIC-level  eth1
106:      75901          0    PCI-MSI        eth0
177:       3613          1    IO-APIC-level  uhci_hcd:usb4, HDA Intel
185:    4454030          1    IO-APIC-level  uhci_hcd:usb3, nvidia
193:    4509386          1    IO-APIC-level  uhci_hcd:usb1, ehci_hcd:usb5, nvidia
NMI:          0          0
LOC:   18436414   18436413
ERR:          0
MIS:          0
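
To check other machines for the same IRQ sharing, the listing above can be parsed with a few lines of Python. This is a minimal sketch: the helper name `shared_irqs` is hypothetical, and it assumes the /proc/interrupts layout shown in this report (per-CPU counters, one controller-type column, then a comma-separated device list). Newer kernels print additional columns, so the parsing would need adjusting there.

```python
def shared_irqs(text, driver="nvidia"):
    """Return {irq: [other devices]} for IRQ lines that `driver` shares."""
    shared = {}
    for line in text.splitlines():
        label, sep, rest = line.partition(":")
        label = label.strip()
        if not sep or not label.isdigit():
            continue  # CPU header row or NMI/LOC/ERR/MIS summary rows
        tokens = rest.split()
        # Skip the per-CPU interrupt counters (leading integers).
        i = 0
        while i < len(tokens) and tokens[i].isdigit():
            i += 1
        # tokens[i] is the controller type; the remainder is the device list.
        names = " ".join(tokens[i + 1:]).split(",")
        devices = [n.strip() for n in names if n.strip()]
        if driver in devices and len(devices) > 1:
            shared[int(label)] = [d for d in devices if d != driver]
    return shared


# Rows copied from the /proc/interrupts output in this report.
SAMPLE = """\
           CPU0 CPU1
 90: 75913 1 IO-APIC-level libata, uhci_hcd:usb2
106: 75901 0 PCI-MSI eth0
177: 3613 1 IO-APIC-level uhci_hcd:usb4, HDA Intel
185: 4454030 1 IO-APIC-level uhci_hcd:usb3, nvidia
193: 4509386 1 IO-APIC-level uhci_hcd:usb1, ehci_hcd:usb5, nvidia
NMI: 0 0
"""
```

On a live system the input would come from reading /proc/interrupts. For the sample rows, the result lists IRQ 185 (shared with uhci_hcd:usb3) and IRQ 193 (shared with uhci_hcd:usb1 and ehci_hcd:usb5).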

This lockup does not occur with the uniprocessor (UP) 386 kernel (linux-image-2.6.15-28-386), but then, of course, only one
of the two CPU cores is used, wasting 50% of the CPU power.
Installing the latest version (100.14.12) of the nvidia driver from the nvidia website by means of 'envy' also solves the problem.

Since 6.06LTS is supposed to be supported until 2009, I would urge you to include the latest nvidia proprietary driver in the
next update of 6.06LTS.
We are planning to install this version (for its longer support period) on several PCs for our main customer. It would save us
quite some time if we did not have to install the latest nvidia driver manually each time.

See also Bug #121096, which asks for the same update for Feisty.

Erik Devriendt (erik-devriendt) wrote :

Since this bug is related to 6.06LTS (Dapper), the module version should be 2.6.15 instead of 2.6.22.

Erik Devriendt (erik-devriendt) wrote :

Although Sarah Hobbs changed the affected module from linux-restricted-modules-2.6.15 to linux-restricted-modules-2.6.12, I think that was not correct, since 2.6.12 belongs to Breezy while my bug report concerns Dapper. That is why I set the affected module back to linux-restricted-modules-2.6.15.

Erik Devriendt (erik-devriendt) wrote :

Forgot to mention: the problem does NOT occur with Feisty. The generic kernel and the nvidia-glx of Feisty are OK.

Timo Aaltonen (tjaalton) wrote :

New binary releases always break something, so there won't be a new version for dapper anymore.

Changed in linux-restricted-modules-2.6.15:
status: New → Won't Fix