random soft lockup with external usb harddisk

Bug #78433 reported by Morten Minke
Affects: linux-source-2.6.17 (Ubuntu)
Status: Invalid
Importance: Undecided
Assigned to: Unassigned
Milestone: (none listed)

Bug Description

I have a q-tec external USB hard disk enclosure with a 250 GB hard disk.
When I use this disk to render films, or use it intensively in some other way, the system quite often hangs. If I then pull the power cord out of the external disk, without restarting Linux, Ubuntu starts running again and (of course) displays a message about unsafe removal of the disk.
I have looked at other soft-lockup bugs, but when my lockups occur there is no wireless card in the computer (which is part of the problem in many other reports). The lockup also only happens under heavy disk use: playing music stored on the disk with Rhythmbox gives NO problem, but rendering videos with Cinelerra does. It also happens when I, for example, make a simple tar backup of my home directory to the external drive (which does not involve anything exotic).

My computer is an IBM A31 laptop with 1 GB of memory and a 20 GB hard disk.
The external disk is connected through a PCMCIA FireWire/USB 2.0 combo card (which might also be part of the problem; I cannot tell).

The following is the uname -a output, followed by an excerpt from kern.log:

Linux amagi 2.6.17-10-generic #2 SMP Tue Dec 5 22:28:26 UTC 2006 i686 GNU/Linux

Dec 15 14:55:30 localhost kernel: [17179683.892000] pccard: CardBus card inserted into slot 0
Dec 15 14:55:30 localhost kernel: [17179683.892000] PCI: Enabling device 0000:03:00.0 (0000 -> 0003)
Dec 15 14:55:30 localhost kernel: [17179683.892000] ACPI: PCI Interrupt 0000:03:00.0[A] -> Link [LNKA] -> GSI 11 (level, low) -> IRQ 11
Dec 15 14:55:30 localhost kernel: [17179683.892000] uhci_hcd 0000:03:00.0: UHCI Host Controller
Dec 15 14:55:30 localhost kernel: [17179683.892000] uhci_hcd 0000:03:00.0: new USB bus registered, assigned bus number 4
Dec 15 14:55:30 localhost kernel: [17179683.892000] uhci_hcd 0000:03:00.0: irq 11, io base 0x00004080
Dec 15 14:55:30 localhost kernel: [17179683.892000] usb usb4: configuration #1 chosen from 1 choice
Dec 15 14:55:30 localhost kernel: [17179683.892000] hub 4-0:1.0: USB hub found
Dec 15 14:55:30 localhost kernel: [17179683.892000] hub 4-0:1.0: 2 ports detected
Dec 15 14:55:30 localhost kernel: [17179683.996000] PCI: Enabling device 0000:03:00.1 (0000 -> 0003)
Dec 15 14:55:30 localhost kernel: [17179683.996000] ACPI: PCI Interrupt 0000:03:00.1[A] -> Link [LNKA] -> GSI 11 (level, low) -> IRQ 11
Dec 15 14:55:30 localhost kernel: [17179683.996000] uhci_hcd 0000:03:00.1: UHCI Host Controller
Dec 15 14:55:30 localhost kernel: [17179683.996000] uhci_hcd 0000:03:00.1: new USB bus registered, assigned bus number 5
Dec 15 14:55:30 localhost kernel: [17179683.996000] uhci_hcd 0000:03:00.1: irq 11, io base 0x000040a0
Dec 15 14:55:30 localhost kernel: [17179683.996000] usb usb5: configuration #1 chosen from 1 choice
Dec 15 14:55:30 localhost kernel: [17179683.996000] hub 5-0:1.0: USB hub found
Dec 15 14:55:30 localhost kernel: [17179683.996000] hub 5-0:1.0: 2 ports detected
Dec 15 14:55:31 localhost kernel: [17179684.352000] PCI: Enabling device 0000:03:00.2 (0000 -> 0002)
Dec 15 14:55:31 localhost kernel: [17179684.352000] ACPI: PCI Interrupt 0000:03:00.2[A] -> Link [LNKA] -> GSI 11 (level, low) -> IRQ 11
Dec 15 14:55:31 localhost kernel: [17179684.352000] ehci_hcd 0000:03:00.2: EHCI Host Controller
Dec 15 14:55:31 localhost kernel: [17179684.352000] ehci_hcd 0000:03:00.2: new USB bus registered, assigned bus number 6
Dec 15 14:55:31 localhost kernel: [17179684.352000] ehci_hcd 0000:03:00.2: irq 11, io mem 0xd2000a00
Dec 15 14:55:31 localhost kernel: [17179684.352000] ehci_hcd 0000:03:00.2: USB 2.0 started, EHCI 1.00, driver 10 Dec 2004
Dec 15 14:55:31 localhost kernel: [17179684.352000] usb usb6: configuration #1 chosen from 1 choice
Dec 15 14:55:31 localhost kernel: [17179684.352000] hub 6-0:1.0: USB hub found
Dec 15 14:55:31 localhost kernel: [17179684.352000] hub 6-0:1.0: 4 ports detected
Dec 15 14:55:31 localhost kernel: [17179684.364000] usb 4-2: new full speed USB device using uhci_hcd and address 2
Dec 15 14:55:31 localhost kernel: [17179684.924000] usb 6-2: new high speed USB device using ehci_hcd and address 2
Dec 15 14:55:31 localhost kernel: [17179684.992000] PCI: Enabling device 0000:03:00.3 (0080 -> 0083)
Dec 15 14:55:31 localhost kernel: [17179684.992000] ACPI: PCI Interrupt 0000:03:00.3[A] -> Link [LNKA] -> GSI 11 (level, low) -> IRQ 11
Dec 15 14:55:31 localhost kernel: [17179685.044000] ohci1394: fw-host0: OHCI-1394 1.1 (PCI): IRQ=[11] MMIO=[d2000000-d20007ff] Max Packet=[1024] IR/IT contexts=[4/8]
Dec 15 14:55:32 localhost kernel: [17179685.064000] usb 6-2: configuration #1 chosen from 1 choice
Dec 15 14:55:33 localhost kernel: [17179687.052000] usbcore: registered new driver libusual
Dec 15 14:55:34 localhost kernel: [17179687.200000] SCSI subsystem initialized
Dec 15 14:55:34 localhost kernel: [17179687.220000] Initializing USB Mass Storage driver...
Dec 15 14:55:34 localhost kernel: [17179687.220000] scsi0 : SCSI emulation for USB Mass Storage devices
Dec 15 14:55:34 localhost kernel: [17179687.220000] usbcore: registered new driver usb-storage
Dec 15 14:55:34 localhost kernel: [17179687.220000] USB Mass Storage support registered.
Dec 15 14:55:39 localhost kernel: [17179692.224000] Vendor: Maxtor 7 Model: L250R0 Rev: 0811
Dec 15 14:55:39 localhost kernel: [17179692.224000] Type: Direct-Access ANSI SCSI revision: 00
Dec 15 14:55:39 localhost kernel: [17179692.316000] SCSI device sda: 490234752 512-byte hdwr sectors (251000 MB)
Dec 15 14:55:39 localhost kernel: [17179692.316000] sda: test WP failed, assume Write Enabled
Dec 15 14:55:39 localhost kernel: [17179692.320000] SCSI device sda: 490234752 512-byte hdwr sectors (251000 MB)
Dec 15 14:55:39 localhost kernel: [17179692.324000] sda: test WP failed, assume Write Enabled
Dec 15 14:55:39 localhost kernel: [17179692.324000] sda: sda1
Dec 15 14:55:39 localhost kernel: [17179692.344000] sd 0:0:0:0: Attached scsi disk sda
Dec 15 14:55:39 localhost kernel: [17179692.364000] sd 0:0:0:0: Attached scsi generic sg0 type 0
Dec 15 14:55:40 localhost kernel: [17179693.168000] kjournald starting. Commit interval 5 seconds
Dec 15 14:55:40 localhost kernel: [17179693.168000] EXT3-fs warning: maximal mount count reached, running e2fsck is recommended
Dec 15 14:55:40 localhost kernel: [17179693.180000] EXT3 FS on sda1, internal journal
Dec 15 14:55:40 localhost kernel: [17179693.180000] EXT3-fs: mounted filesystem with ordered data mode.
Dec 15 15:14:17 localhost -- MARK --
Dec 15 15:24:04 localhost kernel: [17181388.796000] <c01491cf> softlockup_tick+0x9f/0xf0 <c012bee1> update_process_times+0x31/0x80
Dec 15 15:24:04 localhost kernel: [17181388.796000] <c01070ac> timer_interrupt+0x7c/0xb0 <c0149323> handle_IRQ_event+0x33/0x60
Dec 15 15:24:04 localhost kernel: [17181388.796000] <c01493ed> __do_IRQ+0x9d/0x110 <c0105c89> do_IRQ+0x19/0x30
Dec 15 15:24:04 localhost kernel: [17181388.796000] <c010408a> common_interrupt+0x1a/0x20 <f8a4f85e> yenta_interrupt+0x1e/0xe0 [yenta_socket]
Dec 15 15:24:04 localhost kernel: [17181388.796000] <c0149323> handle_IRQ_event+0x33/0x60 <c01493ed> __do_IRQ+0x9d/0x110
Dec 15 15:24:04 localhost kernel: [17181388.796000] <c0105c89> do_IRQ+0x19/0x30 <c010408a> common_interrupt+0x1a/0x20
Dec 15 15:24:04 localhost kernel: [17181388.796000] <c014007b> disk_store+0x10b/0x12e <c012782f> __do_softirq+0x5f/0xe0
Dec 15 15:24:04 localhost kernel: [17181388.796000] <c01278e5> do_softirq+0x35/0x40 <c0105c8e> do_IRQ+0x1e/0x30
Dec 15 15:24:04 localhost kernel: [17181388.796000] <c010408a> common_interrupt+0x1a/0x20 <f8bb007b> drm_vm_shm_close+0x11b/0x1e0 [drm]
Dec 15 15:24:04 localhost kernel: [17181388.796000] <f8bbc902> nvram_read+0x72/0xd0 [nvram] <c0138da3> enqueue_hrtimer+0x53/0x80
Dec 15 15:24:04 localhost kernel: [17181388.796000] <c0138f47> lock_hrtimer_base+0x27/0x60 <c013901d> hrtimer_try_to_cancel+0x2d/0x50
Dec 15 15:24:04 localhost kernel: [17181388.796000] <c013904e> hrtimer_cancel+0xe/0x20 <c02da363> do_nanosleep+0x53/0x70
Dec 15 15:24:04 localhost kernel: [17181388.796000] <c0139199> hrtimer_nanosleep+0x49/0x120 <c02dac80> lock_kernel+0x20/0x40
Dec 15 15:24:04 localhost kernel: [17181388.796000] <c016b05c> vfs_read+0xbc/0x180 <f8bbc890> nvram_read+0x0/0xd0 [nvram]
Dec 15 15:24:04 localhost kernel: [17181388.796000] <c016b5d1> sys_read+0x41/0x70 <c0102fbb> sysenter_past_esp+0x54/0x79
Dec 15 15:24:04 localhost kernel: [17181397.012000] usb 6-2: USB disconnect, address 2
Dec 15 15:24:04 localhost kernel: [17181397.020000] sd 0:0:0:0: SCSI error: return code = 0x10000
Dec 15 15:24:04 localhost kernel: [17181397.020000] end_request: I/O error, dev sda, sector 372159463
Dec 15 15:24:04 localhost kernel: [17181397.020000] lost page write due to I/O error on sda1
Dec 15 15:24:04 localhost last message repeated 9 times
Dec 15 15:24:04 localhost kernel: [17181397.020000] sd 0:0:0:0: SCSI error: return code = 0x10000
Dec 15 15:24:04 localhost kernel: [17181397.020000] end_request: I/O error, dev sda, sector 372159527
Dec 15 15:24:17 localhost kernel: [17181410.104000] usb 6-2: new high speed USB device using ehci_hcd and address 3
Dec 15 15:24:17 localhost kernel: [17181410.236000] usb 6-2: configuration #1 chosen from 1 choice
Dec 15 15:24:17 localhost kernel: [17181410.236000] scsi1 : SCSI emulation for USB Mass Storage devices
Dec 15 15:24:22 localhost kernel: [17181415.240000] Vendor: Maxtor 7 Model: L250R0 Rev: 0811
Dec 15 15:24:22 localhost kernel: [17181415.240000] Type: Direct-Access ANSI SCSI revision: 00
Dec 15 15:24:22 localhost kernel: [17181415.240000] SCSI device sdb: 490234752 512-byte hdwr sectors (251000 MB)
Dec 15 15:24:22 localhost kernel: [17181415.244000] sdb: test WP failed, assume Write Enabled
Dec 15 15:24:22 localhost kernel: [17181415.244000] SCSI device sdb: 490234752 512-byte hdwr sectors (251000 MB)
Dec 15 15:24:22 localhost kernel: [17181415.248000] sdb: test WP failed, assume Write Enabled
Dec 15 15:24:22 localhost kernel: [17181415.248000] sdb: sdb1
Dec 15 15:24:22 localhost kernel: [17181415.268000] sd 1:0:0:0: Attached scsi disk sdb
Dec 15 15:24:22 localhost kernel: [17181415.268000] sd 1:0:0:0: Attached scsi generic sg0 type 0
Dec 15 15:24:26 localhost kernel: [17181419.140000] kjournald starting. Commit interval 5 seconds
Dec 15 15:24:26 localhost kernel: [17181419.144000] EXT3-fs warning: maximal mount count reached, running e2fsck is recommended
Dec 15 15:24:26 localhost kernel: [17181419.144000] EXT3 FS on sdb1, internal journal
Dec 15 15:24:26 localhost kernel: [17181419.144000] EXT3-fs: recovery complete.
Dec 15 15:24:26 localhost kernel: [17181419.148000] EXT3-fs: mounted filesystem with ordered data mode.
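
For anyone reading the trace above, a small sketch for picking out the frames that come from loadable modules (the bracketed names such as [yenta_socket]) might help narrow down which drivers are involved. This is only an illustration: the regular expression is an assumption about the frame layout seen in this log, and the function names are made up for the example.

```python
import re

# One frame looks like "<c01491cf> softlockup_tick+0x9f/0xf0",
# optionally followed by a module name such as "[yenta_socket]".
# The pattern is an assumption based on the lines in this report.
FRAME_RE = re.compile(
    r"<(?P<addr>[0-9a-f]{8})>\s+"
    r"(?P<symbol>[A-Za-z_][A-Za-z0-9_.]*)"
    r"\+0x(?P<offset>[0-9a-f]+)/0x(?P<size>[0-9a-f]+)"
    r"(?:\s+\[(?P<module>[^\]]+)\])?"
)


def parse_frames(line):
    """Return (symbol, module) pairs for every frame on one log line."""
    return [(m.group("symbol"), m.group("module")) for m in FRAME_RE.finditer(line)]


def module_frames(lines):
    """Return only the frames that belong to a loadable module, in trace order."""
    return [frame for line in lines for frame in parse_frames(line) if frame[1] is not None]


# Two lines copied from the trace above.
sample = [
    "Dec 15 15:24:04 localhost kernel: [17181388.796000] <c010408a> common_interrupt+0x1a/0x20 <f8a4f85e> yenta_interrupt+0x1e/0xe0 [yenta_socket]",
    "Dec 15 15:24:04 localhost kernel: [17181388.796000] <c0149323> handle_IRQ_event+0x33/0x60 <c01493ed> __do_IRQ+0x9d/0x110",
]
print(module_frames(sample))
```

Run against the whole trace, this would list the yenta_socket, drm and nvram frames; whether any of them is actually at fault is not something the script can tell.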

Kyle McMartin (kyle) wrote :

Can you please try booting with "noreplacement" on the kernel command line?
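
To confirm that the option was really picked up after a reboot, the running kernel's command line can be read from /proc/cmdline. The following is a minimal sketch; the helper names and the example command line are hypothetical, not taken from the reporter's machine.

```python
def has_boot_option(cmdline, option):
    """Return True if `option` appears as a whole token on the kernel command line."""
    return option in cmdline.split()


def read_cmdline(path="/proc/cmdline"):
    """Read the command line the running kernel was booted with (Linux only)."""
    with open(path) as f:
        return f.read()


# Hypothetical command line, for illustration only.
example = "root=/dev/hda1 ro quiet splash noreplacement"
print(has_boot_option(example, "noreplacement"))
```

On the affected machine, has_boot_option(read_cmdline(), "noreplacement") would show whether the test boot used the option.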

Kyle McMartin (kyle)
Changed in linux-source-2.6.17:
status: Unconfirmed → Needs Info
Morten Minke (morten-amagi) wrote :

I tried the noreplacement option on the kernel command line (as a boot option), but with that option my X server won't start. Is there another option I should or could try, or do you know what the problem with the X server is? (I know it is not an issue for this bug, but if I can get the X server running I can tell you whether or not the option makes a difference.)

Morten Minke (morten-amagi) wrote :

I have some additional info which might help.

First of all, the problem has persisted through all the kernel updates up until now. However, it appears that it is not specific to the external USB disk: it also occurs every now and then when I use a 12-in-1 card reader to read photos from a flash card.

Another thing I found out by accident is the following. The PCMCIA card has two USB 2.0 ports. If I use one for the external disk and the other for the 12-in-1 card reader, I get the soft lockup. If I then pull out the card reader, Linux suddenly continues. This works 100% of the time the problem occurs. So for now my workaround is to use the external disk and, if a lockup happens, take out the card reader and plug it back in once everything is up and running again. Below is another listing, taken after the system recovered from a soft lockup when I pulled out the card reader.

[17180389.132000] EXT3-fs warning: maximal mount count reached, running e2fsck is recommended
[17180389.264000] EXT3 FS on sde1, internal journal
[17180389.264000] EXT3-fs: mounted filesystem with ordered data mode.
[17180435.052000] BUG: soft lockup detected on CPU#0!
[17180435.052000] <c01491cf> softlockup_tick+0x9f/0xf0 <c012bee1> update_process_times+0x31/0x80
[17180435.052000] <c01070ac> timer_interrupt+0x7c/0xb0 <c0149323> handle_IRQ_event+0x33/0x60
[17180435.052000] <c01493ed> __do_IRQ+0x9d/0x110 <c0105c89> do_IRQ+0x19/0x30
[17180435.052000] <c010408a> common_interrupt+0x1a/0x20 <c0149307> handle_IRQ_event+0x17/0x60
[17180435.052000] <c01493ed> __do_IRQ+0x9d/0x110 <c0105c89> do_IRQ+0x19/0x30
[17180435.052000] <c010408a> common_interrupt+0x1a/0x20 <c014007b> disk_store+0x10b/0x12e
[17180435.052000] <c012782f> __do_softirq+0x5f/0xe0 <c01278e5> do_softirq+0x35/0x40
[17180435.052000] <c0105c8e> do_IRQ+0x1e/0x30 <c010408a> common_interrupt+0x1a/0x20
[17180435.052000] <f8cd6902> nvram_read+0x72/0xd0 [nvram] <c0138f47> lock_hrtimer_base+0x27/0x60
[17180435.052000] <c013901d> hrtimer_try_to_cancel+0x2d/0x50 <c013904e> hrtimer_cancel+0xe/0x20
[17180435.052000] <c02da593> do_nanosleep+0x53/0x70 <c0139199> hrtimer_nanosleep+0x49/0x120
[17180435.052000] <c02daeb0> lock_kernel+0x20/0x40 <c016af6c> vfs_read+0xbc/0x180
[17180435.052000] <f8cd6890> nvram_read+0x0/0xd0 [nvram] <c016b4e1> sys_read+0x41/0x70
[17180435.052000] <c0102fbb> sysenter_past_esp+0x54/0x79
[17180472.272000] usb 6-2: USB disconnect, address 2

Hope this helps. By the way, I first thought of installing the debug version of the kernel, but then I read its description, which says that booting from it is of no use. If I can help in any other way, for example by providing more information or activating extended logging, please explain what I need to do.
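
As a rough measure of how long the system stayed stuck in the listing above, the bracketed kernel timestamps (in seconds) of the lockup message and the USB disconnect can be subtracted. This is only a sketch: the regular expression and the function name are assumptions for illustration, and the result only bounds the time between the watchdog message and the moment the card reader was pulled.

```python
import re

# Kernel timestamp in square brackets, e.g. "[17180435.052000]".
TS_RE = re.compile(r"\[(\d+\.\d+)\]")


def timestamp(line):
    """Return the bracketed kernel timestamp of a log line as a float, or None."""
    m = TS_RE.search(line)
    return float(m.group(1)) if m else None


# Two lines copied from the listing above.
lockup = "[17180435.052000] BUG: soft lockup detected on CPU#0!"
disconnect = "[17180472.272000] usb 6-2: USB disconnect, address 2"

elapsed = timestamp(disconnect) - timestamp(lockup)
print(round(elapsed, 2))  # about 37 seconds between the two messages
```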

Launchpad Janitor (janitor) wrote :

[Expired for linux-source-2.6.17 (Ubuntu) because there has been no activity for 60 days.]
