IOCStatus(0x004b): SCSI IOC Terminated, mptscsih task abort

Bug #140032 reported by Kent Tong
This bug affects 2 people
Affects: linux (Ubuntu)
Status: Invalid
Importance: Undecided
Assigned to: Unassigned

Bug Description

Binary package hint: linux-image-2.6.15-29-server

I am running Ubuntu 6.06 LTS (Dapper Drake) on a server, both as a VM host and as a
VM guest. Recently, after upgrading both from linux-image-2.6.15-28-server to
linux-image-2.6.15-29-server, I am seeing frequent errors in the guest (see the
kern.log excerpt below). The behavior is probably the same as Bug #137585, but we aren't
using an LSI Logic controller. The server is an HP ProLiant DL server and the driver is
named "cciss". There are no such errors on the host.

Sep 10 05:09:32 cladms003 kernel: [43422775.870000] mptbase: ioc0: IOCStatus(0x004b): SCSI IOC Terminated
Sep 10 05:09:32 cladms003 kernel: [43422775.880000] mptscsih: ioc0: task abort: SUCCESS (sc=c6815540)
Sep 10 05:15:11 cladms003 kernel: [43423115.100000] mptscsih: ioc0: attempting task abort! (sc=c5c4ab00)
Sep 10 05:15:11 cladms003 kernel: [43423115.100000] sd 0:0:1:0:
Sep 10 05:15:11 cladms003 kernel: [43423115.100000] command: Read (10): 28 00 04 15 0d a0 00 00 08 00
Sep 10 05:15:11 cladms003 kernel: [43423115.100000] mptbase: ioc0: IOCStatus(0x004b): SCSI IOC Terminated
Sep 10 05:15:11 cladms003 kernel: [43423115.110000] mptscsih: ioc0: task abort: SUCCESS (sc=c5c4ab00)
Sep 10 05:19:52 cladms003 kernel: [43423395.750000] mptscsih: ioc0: attempting task abort! (sc=c5c4a200)
Sep 10 05:19:52 cladms003 kernel: [43423395.750000] sd 0:0:1:0:
Sep 10 05:19:52 cladms003 kernel: [43423395.750000] command: Read (10): 28 00 04 15 0e 30 00 00 08 00
Sep 10 05:19:52 cladms003 kernel: [43423395.750000] mptbase: ioc0: IOCStatus(0x004b): SCSI IOC Terminated
Sep 10 05:19:52 cladms003 kernel: [43423395.760000] mptscsih: ioc0: task abort:SUCCESS (sc=c5c4a200)

seisen1 (seisen-deactivatedaccount-deactivatedaccount) wrote :

Can you please test this in latest version of Ubuntu, Hardy Heron, to see if this is still a problem?

Kent Tong (kent-tong) wrote :

Thanks for the follow up. Unfortunately we are unable to do that as it is a production server.

Launchpad Janitor (janitor) wrote : This bug is now reported against the 'linux' package

Beginning with the Hardy Heron 8.04 development cycle, all open Ubuntu kernel bugs need to be reported against the "linux" kernel package. We are automatically migrating this linux-source-2.6.15 kernel bug to the new "linux" package. We appreciate your patience and understanding as we make this transition. Also, if you would be interested in testing the upcoming Intrepid Ibex 8.10 release, it is available at http://www.ubuntu.com/testing . Please let us know your results. Thanks!

Leann Ogasawara (leannogasawara) wrote : Re: IOCStatus(0x004b): SCSI IOC Terminated

The Ubuntu Kernel Team is planning to move to the 2.6.27 kernel for the upcoming Intrepid Ibex 8.10 release. As a result, the kernel team would appreciate it if you could test this newer 2.6.27 Ubuntu kernel. There are two ways you should be able to test:

1) If you are comfortable installing packages on your own, the linux-image-2.6.27-* package is currently available for you to install and test.

--or--

2) The upcoming Alpha5 for Intrepid Ibex 8.10 will contain this newer 2.6.27 Ubuntu kernel. Alpha5 is set to be released Thursday Sept 4. Please watch http://www.ubuntu.com/testing for Alpha5 to be announced. You should then be able to test via a LiveCD.

Please let us know immediately if this newer 2.6.27 kernel resolves the bug reported here or if the issue remains. More importantly, please open a new bug report for each new bug/regression introduced by the 2.6.27 kernel and tag the bug report with 'linux-2.6.27'. Also, please specifically note if the issue does or does not appear in the 2.6.26 kernel. Thanks again, we really appreciate your help and feedback.

Leann Ogasawara (leannogasawara) wrote :

Hi Kent,

Just wanted to check in on this bug. It was reported a while ago and hasn't had any recent activity. Is the machine in question still running Dapper or has it possibly been updated to the newer Hardy LTS release or possibly even Intrepid? Thanks.

Changed in linux:
status: New → Incomplete
Kent Tong (kent-tong) wrote : Re: [Bug 140032] Re: IOCStatus(0x004b): SCSI IOC Terminated

We're still running Dapper. We can't upgrade to a newer release yet.

--
Kent Tong
Useful news for network admins at
http://www2.cpttm.org.mo/cyberlab/netadmin-news

Pavel Zheltouhov (pwlnw) wrote : Re: IOCStatus(0x004b): SCSI IOC Terminated

VMware Server 2.0 on an Intrepid 8.10 host (2.6.27-9-server, x86_64), with the same release in the guest.
I installed the linux-image-virtual and linux-virtual packages.
I have reproduced this bug many times under heavy disk load.
It seems we need an option in the mptscsih driver or in the filesystems that turns off timeouts and verification.
Patching the kernel for the linux-image-virtual package would help too.

Here is my syslog.

Sometimes all goes well and the driver recovers successfully:

Jan 24 19:37:22 uadb kernel: [17989.370267] mptscsih: ioc0: attempting task abort! (sc=ffff88006ac123c0)
Jan 24 19:37:22 uadb kernel: [17989.370268] sd 2:0:2:0: [sdc] CDB: Write(10): 2a 00 01 43 d0 37 00 04 00 00
Jan 24 19:37:22 uadb kernel: [17989.370273] mptscsih: ioc0: task abort: SUCCESS (sc=ffff88006ac123c0)
Jan 24 19:37:22 uadb kernel: [17989.370292] mptscsih: ioc0: attempting task abort! (sc=ffff88006ac12640)
Jan 24 19:37:22 uadb kernel: [17989.370294] sd 2:0:2:0: [sdc] CDB: Write(10): 2a 00 01 43 d4 37 00 00 08 00
Jan 24 19:37:22 uadb kernel: [17989.370298] mptscsih: ioc0: task abort: SUCCESS (sc=ffff88006ac12640)
Jan 24 19:37:22 uadb kernel: [17989.370318] mptscsih: ioc0: attempting task abort! (sc=ffff88006ac12c80)
Jan 24 19:37:22 uadb kernel: [17989.370319] sd 2:0:2:0: [sdc] CDB: Write(10): 2a 00 01 43 d4 3f 00 00 90 00
Jan 24 19:37:22 uadb kernel: [17989.370324] mptscsih: ioc0: task abort: SUCCESS (sc=ffff88006ac12c80)
Jan 24 19:37:22 uadb kernel: [17989.370344] mptscsih: ioc0: attempting task abort! (sc=ffff88003788d140)
Jan 24 19:37:22 uadb kernel: [17989.370346] sd 2:0:1:0: [sdb] CDB: Read(10): 28 00 00 be b5 17 00 00 08 00
Jan 24 19:37:22 uadb kernel: [17991.772561] mptbase: ioc0: Initiating recovery
Jan 24 19:37:22 uadb kernel: [17993.342132] mptscsih: ioc0: Issue of TaskMgmt failed!
Jan 24 19:37:22 uadb kernel: [17993.342224] mptscsih: ioc0: task abort: FAILED (sc=ffff88003788d140)
Jan 24 19:37:22 uadb kernel: [17993.342226] mptscsih: ioc0: attempting task abort! (sc=ffff88003788da00)
Jan 24 19:37:22 uadb kernel: [17993.342229] sd 2:0:1:0: [sdb] CDB: Read(10): 28 00 00 be b5 2f 00 00 08 00
Jan 24 19:37:22 uadb kernel: [17993.342235] mptscsih: ioc0: task abort: SUCCESS (sc=ffff88003788da00)
Jan 24 19:37:22 uadb kernel: [17993.342362] mptscsih: ioc0: attempting target reset! (sc=ffff88003788d140)
Jan 24 19:37:22 uadb kernel: [17993.342364] sd 2:0:1:0: [sdb] CDB: Read(10): 28 00 00 be b5 17 00 00 08 00
Jan 24 19:37:22 uadb kernel: [17993.342828] scsi target2:0:0: Beginning Domain Validation
Jan 24 19:37:22 uadb kernel: [17993.620049] mptscsih: ioc0: target reset: SUCCESS (sc=ffff88003788d140)

but under really heavy load:

Jan 24 20:51:49 uadb kernel: [22458.461947] mptscsih: ioc0: bus reset: FAILED (sc=ffff880068b81280)
Jan 24 20:51:49 uadb kernel: [22458.461958] mptscsih: ioc0: attempting host reset! (sc=ffff880068b81280)
Jan 24 20:51:49 uadb kernel: [22458.461974] mptbase: ioc0: Initiating recovery
Jan 24 20:51:49 uadb kernel: [22459.601222] sd 2:0:1:0: mptscsih: ioc0: completing cmds: fw_channel 0, fw_id 1, sc=ffff880068b818c0, mf = ffff880174583e80, idx=3c
Jan 24 20:51:49 uadb kernel: [22459.601236] sd 2:0:1:0: mptscsih: ioc0: completing cmds: fw_channel 0, fw_id 1, sc=ffff880068b81dc0, mf = ffff...
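
The excerpts above show recovery escalating from task abort to target reset, bus reset and host reset. To see how often each step succeeds or fails across a whole log, the result lines can be tallied. The following is only a rough sketch that assumes the message format stays exactly as quoted in this report; the names RESULT_RE and tally_recovery are hypothetical.

```python
import re
from collections import Counter

# Matches result lines such as
#   "mptscsih: ioc0: task abort: SUCCESS (sc=...)"
# The whitespace after the colon is optional because one line quoted in
# this report reads "task abort:SUCCESS".
RESULT_RE = re.compile(
    r"mptscsih: \w+: (task abort|target reset|bus reset|host reset):\s*(SUCCESS|FAILED)"
)


def tally_recovery(lines):
    """Count (recovery step, outcome) pairs found in an iterable of log lines."""
    counts = Counter()
    for line in lines:
        match = RESULT_RE.search(line)
        if match:
            counts[(match.group(1), match.group(2))] += 1
    return counts


if __name__ == "__main__":
    import sys

    # Usage: python tally.py < /var/log/kern.log
    for (step, outcome), n in sorted(tally_recovery(sys.stdin).items()):
        print(f"{step:13} {outcome:8} {n}")
```

"attempting ..." lines are deliberately ignored, so each recovery attempt is counted once, by its outcome.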


Pavel Zheltouhov (pwlnw) wrote :

Switching the virtual disks from SCSI to IDE helps.
A complete description of the workaround can be found here: http://lenrek.wordpress.com/2008/02/22/linux-software-raid-vmware/

Pavel Zheltouhov (pwlnw) wrote :

Reproduced in Intrepid too.

Changed in linux:
status: Incomplete → New
xteejx (xteejx-deactivatedaccount) wrote :

Setting medium importance.

pwlnw: Is there any chance you could test this in Jaunty and see if this is still an issue in the 2.6.28 kernel?

Thank you.

Changed in linux (Ubuntu):
importance: Undecided → Medium
status: New → Incomplete
kernel-janitor (kernel-janitor) wrote :

This bug report was marked as Incomplete and has not had any updated comments for quite some time. As a result this bug is being closed. Please reopen if this is still an issue in the current Ubuntu release http://www.ubuntu.com/getubuntu/download . Also, please be sure to provide any requested information that may have been missing. To reopen the bug, click on the current status under the Status column and change the status back to "New". Thanks.

[This is an automated message. Apologies if it has reached you inappropriately; please just reply to this message indicating so.]

tags: added: kj-expired
Changed in linux (Ubuntu):
status: Incomplete → Invalid
Pavel Zheltouhov (pwlnw) wrote :

Reproduced in Jaunty.

Changed in linux (Ubuntu):
status: Invalid → New
xteejx (xteejx-deactivatedaccount) wrote :

Can you try with Karmic please, as the latest version may have this fixed. Thank you.

Changed in linux (Ubuntu):
status: New → Incomplete
xteejx (xteejx-deactivatedaccount) wrote :

We are closing this bug report because it lacks the information we need to investigate the problem, as described in the previous comments. Please reopen it if you can give us the missing information, and don't hesitate to submit bug reports in the future. To reopen the bug report you can click on the current status, under the Status column, and change the Status back to "New". Thanks again!

Changed in linux (Ubuntu):
importance: Medium → Undecided
status: Incomplete → Invalid
Zrin Ziborski (zrin+launchpad) wrote :

Reproduced in Lucid.

Possible workaround: convert the VMware disks to IDE.

It could be a kernel driver (mptscsih) problem that affects some real hardware as well
(I searched for "mptscsih: ioc0: attempting task abort!").
Perhaps the driver times out too soon for specific operations, or something like that?
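
On the timeout question: Linux exposes a per-device SCSI command timeout, in seconds, in the sysfs file /sys/block/<disk>/device/timeout. Below is a minimal sketch for inspecting or raising it. The helper names, the sysfs_root parameter (there only so the code can be tried against a scratch directory) and the value 180 are illustrative assumptions. Nothing in this report confirms that a longer timeout avoids the task aborts.

```python
from pathlib import Path


def timeout_path(disk: str, sysfs_root: str = "/sys") -> Path:
    """Path of the per-device SCSI command timeout file, e.g. for 'sdb'."""
    return Path(sysfs_root) / "block" / disk / "device" / "timeout"


def read_timeout(disk: str, sysfs_root: str = "/sys") -> int:
    """Return the current command timeout in seconds."""
    return int(timeout_path(disk, sysfs_root).read_text().strip())


def write_timeout(disk: str, seconds: int, sysfs_root: str = "/sys") -> None:
    """Set a new command timeout in seconds (needs root on a real system)."""
    if seconds <= 0:
        raise ValueError("timeout must be a positive number of seconds")
    timeout_path(disk, sysfs_root).write_text(f"{seconds}\n")


if __name__ == "__main__":
    disk = "sdb"  # hypothetical device name; substitute the affected disk
    print(f"{disk}: current timeout {read_timeout(disk)} s")
    # write_timeout(disk, 180)  # illustrative value only, not a recommendation
```

A value written this way does not survive a reboot, so it is only suitable for a quick experiment.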

summary: - IOCStatus(0x004b): SCSI IOC Terminated
+ IOCStatus(0x004b): SCSI IOC Terminated, mptscsih task abort
KSB (ksb-inbox) wrote :

Pretty sure it's the same on my Lucid with 2.6.32-24 and also on a custom-built 2.6.35.4 kernel. Seen only on heavy disk writes (not reads or deletes). Sooner or later it kicks one of the HDDs out of the software RAID if the HDD is part of one.
It looks more like an upstream kernel bug than an Ubuntu kernel package bug...
