module insertion hangs in apic/irq setup

Bug #20943 reported by Alexander Jurjens
Affects: linux (Ubuntu)
Status: Invalid
Importance: Medium
Assigned to: Unassigned

Bug Description

An i386-based system hangs at "Starting hotplug subsystem..." (Hoary). There is
probably an issue with detecting a GeForce 6600. Without the GeForce 6600, Hoary
boots perfectly with the onboard Intel Extreme Graphics IC. I also tried
disabling the Intel IC in the BIOS (Primary Graphics Adapter --> AGP, no memory
allocated for the onboard IC), but that made no difference. I also tried to
boot Breezy with the GeForce 6600, but I get a kernel panic at "Starting
hotplug subsystem...". Before the kernel panics I often (not always) see
"do_irq" in the output. Some people told me that Ubuntu Hoary/Breezy tries to
detect the Intel IC and the GeForce 6600 "at the same time", which isn't
possible. Neither a bash console nor the X server with GNOME is ever loaded.

Hardware:

Motherboard: ASRock P4i45GV
Processor: Celeron 2.4 GHz
Memory: 256 MB SDRAM
Graphics: Intel Extreme Graphics or XpertVision GeForce 6600 AGP 8X (AGI ASRock
interface)
Sound: onboard

Revision history for this message
Scott James Remnant (Canonical) (canonical-scott) wrote :

I would guess that the nvidia module is hanging on startup.

Revision history for this message
Daniel Stone (daniels) wrote :

Scott, is there any way we can get hotplug disabled or to simply not load a
specific module so we can try to do some debugging?

Revision history for this message
Scott James Remnant (Canonical) (canonical-scott) wrote :

Absolutely!

Add the module name to /etc/hotplug/blacklist
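
A minimal sketch of the step above, assuming the blacklist file holds one module name per line. The helper name blacklist_module is hypothetical and not part of hotplug; it only appends the name if it is not already listed, and it must run as root to write to the real file.

```shell
#!/bin/sh
# Hypothetical helper: append MODULE to a hotplug-style blacklist file
# unless an identical line is already present.
# Usage: blacklist_module MODULE [FILE]
blacklist_module() {
    module=$1
    file=${2:-/etc/hotplug/blacklist}
    # -x matches the whole line, -F treats the module name as a literal string
    if grep -qxF -- "$module" "$file" 2>/dev/null; then
        return 0
    fi
    printf '%s\n' "$module" >> "$file"
}
```

For example, running `blacklist_module nvidia` as root would add the line to the default file; passing a second argument points the helper at a different file for a dry run.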

Revision history for this message
Daniel Stone (daniels) wrote :

Okay, Alexander, could you please add 'nvidia' to /etc/hotplug/blacklist,
restart, and do something like this:
$ sudo tail -f /var/log/dmesg &
$ sudo modprobe nvidia

and tell us if dmesg shows anything interesting while you're loading nvidia.

Revision history for this message
Matt Zimmerman (mdz) wrote :

The nvidia module, like other video drivers, is never loaded automatically by
hotplug.

If it's being loaded here, it's probably because it's in /etc/modules.
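
One way to check this is sketched below, assuming /etc/modules lists one module per line (optionally followed by parameters) and treats lines starting with '#' as comments. The function name module_listed is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: succeed if MODULE is an active (uncommented)
# entry in a modules file.
# Usage: module_listed MODULE [FILE]
module_listed() {
    module=$1
    file=${2:-/etc/modules}
    awk -v m="$module" '
        $1 ~ /^#/ { next }          # skip comment lines
        $1 == m   { found = 1 }     # first field is the module name
        END       { exit (found ? 0 : 1) }
    ' "$file"
}
```

A call such as `module_listed nvidia && echo "nvidia is loaded at boot via /etc/modules"` would then report whether the module is being loaded explicitly.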

Revision history for this message
Alexander Jurjens (alexanderjurjens) wrote :

(In reply to comment #4)
> Okay, Alexander, could you please add 'nvidia' to /etc/hotplug/blacklist,
> restart, and do something like this:
> $ sudo tail -f /var/log/dmesg &
> $ sudo modprobe nvidia
>
> and tell us if dmesg shows anything interesting while you're loading nvidia.

I've added nvidia to /etc/hotplug/blacklist, but it didn't solve the problem. I
don't think that the nvidia module is the cause of my problem.

Revision history for this message
Fabio Massimo Di Nitto (fabbione) wrote :

Is this problem still reproducible in breezy?

Revision history for this message
Alexander Jurjens (alexanderjurjens) wrote :

(In reply to comment #7)
> Is this problem still reproducible in breezy?

Yes, this bug is still reproducible in Breezy Preview, even if I add 'nvidia' to
/etc/hotplug/blacklist.

I have a question: Is there a way to install nVIDIA drivers via a LiveCD? I've a
Morphix LiveCD and it came with nVIDIA drivers. By the way, the Morphix LiveCD
is (!) 2 years old and it boots perfectly.

Revision history for this message
Matt Zimmerman (mdz) wrote :

(In reply to comment #8)
> (In reply to comment #7)
> > Is this problem still reproducible in breezy?
>
> Yes, this bug is still reproducible in Breezy Preview, even if I add 'nvidia' to
> /etc/hotplug/blacklist.

That isn't expected to have any effect; hotplug will not load that module
regardless of whether it is listed in that file. It must be loaded explicitly
via /etc/modules if you want to use it.

> I have a question: Is there a way to install nVIDIA drivers via a LiveCD? I've a
> Morphix LiveCD and it came with nVIDIA drivers. By the way, the Morphix LiveCD
> is (!) 2 years old and it boots perfectly.

You can install the nvidia-glx module using the Ubuntu live CD if you wish.

So far, I see no indication whatsoever that your problem has anything to do with
the nvidia driver. Boot the system in recovery mode, which should verbosely
show which modules are being loaded and help narrow the possibilities.

Revision history for this message
Juan J. Martínez (jjmartinez) wrote :

# modprobe nvidia
WARNING: Error inserting agpgart
(/lib/modules/2.6.10-5-686/kernel/drivers/char/agp/agpgart.ko): Invalid module
format
FATAL: Error inserting nvidia
(/lib/modules/2.6.10-5-686/kernel/drivers/video/nvidia.ko): Unknown symbol in
module, or unknown parameter (see dmesg)

I see several "Module len 40259 truncated" in the dmesg.

The nvidia.ko is broken.

here:

linux-restricted-modules-686 2.6.10.5-1
linux-image-686 2.6.10-34.6

(yesterday all worked)

Revision history for this message
Juan J. Martínez (jjmartinez) wrote :

(In reply to comment #10)

The problem was (is) related to agp. Seems intel_agp or agpgart was corrupted :?

Using nvidia agp works, reinstalling linux-image-2.6.10-5-686 fixed it here.

But I don't know what broke it. Since USN-187-1 I've updated nothing related to
kernel.

Revision history for this message
Alexander Jurjens (alexanderjurjens) wrote :

(In reply to comment #9)
> (In reply to comment #8)
> > (In reply to comment #7)
> > > Is this problem still reproducible in breezy?
> >
> > Yes, this bug is still reproducible in Breezy Preview, even if I add 'nvidia' to
> > /etc/hotplug/blacklist.
>
> That isn't expected to have any effect; hotplug will not load that module
> regardless of whether it is listed in that file. It must be loaded explicitly
> via /etc/modules if you want to use it.
>
> > I have a question: Is there a way to install nVIDIA drivers via a LiveCD? I've a
> > Morphix LiveCD and it came with nVIDIA drivers. By the way, the Morphix LiveCD
> > is (!) 2 years old and it boots perfectly.
>
> You can install the nvidia-glx module using the Ubuntu live CD if you wish.
>
> So far, I see no indication whatsoever that your problem has anything to do with
> the nvidia driver. Boot the system in recovery mode, which should verbosely
> show which modules are being loaded and help narrow the possibilities.

I've run recovery mode a few times, but every time I get a different kind of
kernel panic with a different stack trace. I've written down three different
panics. I don't know what it all means, but I hope that it can help you.

Kernel Panic #1
---------------

apic_timer_interrupt+0x1c/0x24
do_page_fault+0x0/0x484
add_pin_to_irq+0x4f/0x58
do_page_fault+0x67/0x484
sys_select+0x399/0x3a5
do_page_fault+0x0/0x484
error_code+0x4f/0x54

Kernel panic = not synching: Attempted to kill init!

---------------

Kernel Panic #2
---------------

__do_irq
do_IRQ
common_interrupt
__mod_page_state
page_remove_rmap
zap_pte_range
unmap_page_range
unmap_vmas
exit_mmap
mmput
do_exit
sys_exit_group
get_signal_to_deliver
do_signal
sys_select
sys_sysinfo
default_wake_function
do_page_fault
do_notify_resume
work_notifysig

Some hexadecimal code

<0>Kernel panic - not synching: Fatal exception in interrupt

---------------

Kernel panic #3
---------------

Modules linked in: intel_agp agpgart dm_mod evdev psmouse cdrom --- and lot of
other modules

CPU: 0
EIP: 0060:[<c010fa46>] Not tainted VLI
EFLAGS: 00010046 (2.6.12-8-386)

A stacktrace

A calltrace

apic_timer_interrupt

Some hexadecimal code

Kernel panic - not synching: Attempted to kill init!

----------------------

regards,

Alexander

Revision history for this message
Matt Zimmerman (mdz) wrote :

Run a memory test. A very thorough one is available from the GRUB menu (press
ESC at the countdown near the start of the boot process)

Revision history for this message
Adam Conrad (adconrad) wrote :

If your BIOS allows you to disable APIC, try that. If not, try booting the
kernel with the "noapic" option on the command line. Either way, this looks
like a generic kernel bug (or BIOS/firmware/hardware bug being tickled by the
kernel), not a restricted-modules bug. nvidia isn't hotplugged, and the fact
that it's an nVidia video card that appears to tickle this is a red herring, I'm
sure. Probably anything eating AGP memory would do the same, at a guess.
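
After rebooting with the option, it can be worth confirming that the kernel actually received it. The sketch below assumes the running kernel exposes its command line in /proc/cmdline; the function name has_boot_param is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: succeed if PARAM appears as a whole word on the
# kernel command line stored in FILE.
# Usage: has_boot_param PARAM [FILE]
has_boot_param() {
    param=$1
    file=${2:-/proc/cmdline}
    # Put each parameter on its own line, then match whole lines only,
    # so "noapic" does not match the separate "nolapic" parameter.
    tr -s ' \t' '\n\n' < "$file" | grep -qxF -- "$param"
}
```

For example, `has_boot_param noapic && echo "noapic is active"` prints the message only when the option reached the kernel.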

Revision history for this message
Alexander Jurjens (alexanderjurjens) wrote :

(In reply to comment #14)
> If your BIOS allows you to disable APIC, try that. If not, try booting the
> kernel with the "noapic" option on the command line. Either way, this looks
> like a generic kernel bug (or BIOS/firmware/hardware bug being tickled by the
> kernel), not a restricted-modules bug. nvidia isn't hotplugged, and the fact
> that it's an nVidia video card that appears to tickle this is a red herring, I'm
> sure. Probably anything eating AGP memory would do the same, at a guess.

I've searched for an option to disable APIC, but I didn't find an option like
that in the BIOS. I've also run memtest, but it seems that there is nothing
wrong with my memory. I've also tried to boot the kernel with "noapic", but it
didn't help. I'm still getting the same kernel panics. Two things come up in
every panic: IRQ and APIC. Are they closely related to each other? :S

I've also looked at the "devices" screen that appears just before GRUB. It seems
that some of the onboard devices have the same IRQ. Five or six of the onboard
devices have IRQ 10. Other onboard devices have IRQ 11. Has this something to do
with the bug? :S I thought that every device had a different IRQ with respect to
another device.

I also have a question about that "not synching" thingy. Does this mean that
there is some kind of IRQ conflict? Does the kernel assign a wrong IRQ to AGP? I
know what "synchronization" is. I program a lot in Java, and in Java you
also have to "synchronize" a method if you only allow one thread at a time to
use that method. Does that principle apply to this? Does this mean that the
kernel tries to access devices that have the same IRQ? (Or the other way around?)

regards,

Alexander
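
On the shared-IRQ question above: on PCI systems it is normal for several devices to share an interrupt line, so a shared IRQ is not by itself evidence of a conflict. The sketch below lists which IRQs are shared on a running system. It assumes /proc/interrupts puts one numeric IRQ per line and separates multiple handlers with ", "; the exact layout varies between kernel versions, and the function name shared_irqs is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: print the number of every IRQ that has more than
# one handler registered, based on a /proc/interrupts-style file.
# Usage: shared_irqs [FILE]
shared_irqs() {
    file=${1:-/proc/interrupts}
    # Numeric IRQ lines start with "N:"; a ", " in the line means that
    # several drivers are attached to the same interrupt.
    awk '$1 ~ /^[0-9]+:$/ && /, / { sub(/:$/, "", $1); print $1 }' "$file"
}
```

Running `shared_irqs` with no argument reads the live table; the full line for a given IRQ in /proc/interrupts then shows which drivers are attached to it.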

Revision history for this message
Ben Collins (ben-collins) wrote :

If possible, please upgrade to Dapper's 2.6.15-7 kernel. If you do not want to
upgrade to Dapper, then you can also wait for the Dapper Flight 2 CDs, which
are due out within the next few days.

Let me know if this bug still exists with this kernel.

Revision history for this message
Szabolcs Csermák (csszabolcs) wrote :

I've got the same problem, but I have an ATI Radeon 9800. I solved the problem by blacklisting the intel_agp module.
Although that works, I cannot use the fglrx driver for Xorg :(

Revision history for this message
Szabolcs Csermák (csszabolcs) wrote :

The bug still exists in Dapper Drake Flight 5

Revision history for this message
Lionel Dricot (ploum-deactivatedaccount) wrote :

Szabolcs, we are not using hotplug anymore. Where does it hang?

Revision history for this message
Ben Collins (ben-collins) wrote :

This should be fixed with recent (probably at least -20) kernels.

Changed in linux-source-2.6.15:
status: Needs Info → Fix Committed
Revision history for this message
Alexander Jurjens (alexanderjurjens) wrote :

Hello,

I've tried Flight 6 and the bug still exists. My computer now hangs at "Loading hardware drivers...". I also have a laptop with Breezy on it and it runs fine! :)

Revision history for this message
Alexander Jurjens (alexanderjurjens) wrote :

Hello,

I've also tried Dapper Beta (with -20 kernel) and the kernel still panics at "Loading hardware drivers...".

Changed in linux-source-2.6.15:
status: Fix Committed → Fix Released
Revision history for this message
Alexander Jurjens (alexanderjurjens) wrote :

Hello Mr. Collins,

The fix you've released doesn't work on my computer. I'm getting the following message:

"<0>Kernel panic - not synching: Fatal exception in interrupt"

I could install Breezy on my computer, but it didn't boot. Now I can't even install Dapper :(

Changed in linux-source-2.6.15:
status: Fix Released → Confirmed
Revision history for this message
Alexandre Otto Strube (surak) wrote :

Perhaps only people with Intel chipsets noticed it? See bug #55104

Revision history for this message
Launchpad Janitor (janitor) wrote : This bug is now reported against the 'linux' package

Beginning with the Hardy Heron 8.04 development cycle, all open Ubuntu kernel bugs need to be reported against the "linux" kernel package. We are automatically migrating this linux-source-2.6.15 kernel bug to the new "linux" package. We appreciate your patience and understanding as we make this transition. Also, if you would be interested in testing the upcoming Intrepid Ibex 8.10 release, it is available at http://www.ubuntu.com/testing . Please let us know your results. Thanks!

Revision history for this message
Leann Ogasawara (leannogasawara) wrote :

The Ubuntu Kernel Team is planning to move to the 2.6.27 kernel for the upcoming Intrepid Ibex 8.10 release. As a result, the kernel team would appreciate it if you could please test this newer 2.6.27 Ubuntu kernel. There are two ways you should be able to test:

1) If you are comfortable installing packages on your own, the linux-image-2.6.27-* package is currently available for you to install and test.

--or--

2) The upcoming Alpha5 for Intrepid Ibex 8.10 will contain this newer 2.6.27 Ubuntu kernel. Alpha5 is set to be released Thursday Sept 4. Please watch http://www.ubuntu.com/testing for Alpha5 to be announced. You should then be able to test via a LiveCD.

Please let us know immediately if this newer 2.6.27 kernel resolves the bug reported here or if the issue remains. More importantly, please open a new bug report for each new bug/regression introduced by the 2.6.27 kernel and tag the bug report with 'linux-2.6.27'. Also, please specifically note if the issue does or does not appear in the 2.6.26 kernel. Thanks again, we really appreciate your help and feedback.

Revision history for this message
Alexander Jurjens (alexanderjurjens) wrote :

Hello Developers,

I've downloaded and tried the Ubuntu 8.10 Alpha 5 LiveCD and the bug hasn't been resolved in Linux kernel 2.6.27. I still get a lot of errors at boot time. The system does not boot at all.

Regards,

Alexander

Revision history for this message
Lionel Dricot (ploum-deactivatedaccount) wrote :

Can any reporter try to reproduce this bug with Jaunty Alpha 5 or later? The kernel used is 2.6.28.

David Wynn (wynn-david)
tags: added: dapper intrepid
Revision history for this message
Jeremy Foshee (jeremyfoshee) wrote :

Unassigned from Ben Collins. Marked Invalid. If this is still being experienced in Karmic or Lucid, please open a new bug with the relevant apport data.

-JFo

Changed in linux (Ubuntu):
assignee: Ben Collins (ben-collins) → nobody
status: Confirmed → Invalid