Intel 440BX: 'hda: I/O error'

Bug #49265 reported by Pascal de Bruijn
Affects: linux-source-2.6.15 (Ubuntu)
Status: Invalid
Importance: Undecided
Assigned to: Unassigned

Bug Description

After installing Ubuntu Dapper (server edition) on my old server, I noticed an I/O error on my hard drive (hda).

At first I suspected that the hard drive was dying, so I replaced it. The I/O error remained.

Consequently I replaced the UATA/IDE flat cables, which also didn't solve the issue.

Wondering whether I might have blown up my UATA/IDE ports, I tried installing another Linux distribution: CERN Linux 4.2 (a RHEL4U2 derivative). CERN Linux ran fine, without any I/O errors, so my UATA/IDE ports are probably fine.

I'm having this issue on a Tyan Tsunami ATX motherboard with an Intel 440BX chipset, in concert with an Intel Pentium II ECC 300. I'm also using ECC SDRAM.

pmjdebruijn@tsunami:~$ uname -a
Linux tsunami 2.6.15-23-server #1 SMP Tue May 23 15:10:35 UTC 2006 i686 GNU/Linux

pmjdebruijn@tsunami:~$ lspci
0000:00:00.0 Host bridge: Intel Corporation 440BX/ZX/DX - 82443BX/ZX/DX Host bridge (rev 03)
0000:00:01.0 PCI bridge: Intel Corporation 440BX/ZX/DX - 82443BX/ZX/DX AGP bridge (rev 03)
0000:00:07.0 ISA bridge: Intel Corporation 82371AB/EB/MB PIIX4 ISA (rev 02)
0000:00:07.1 IDE interface: Intel Corporation 82371AB/EB/MB PIIX4 IDE (rev 01)
0000:00:07.2 USB Controller: Intel Corporation 82371AB/EB/MB PIIX4 USB (rev 01)
0000:00:07.3 Bridge: Intel Corporation 82371AB/EB/MB PIIX4 ACPI (rev 02)
0000:00:12.0 Ethernet controller: Intel Corporation 82557/8/9 [Ethernet Pro 100] (rev 05)
0000:00:13.0 Ethernet controller: Intel Corporation 82557/8/9 [Ethernet Pro 100] (rev 05)
0000:01:00.0 VGA compatible controller: Intel Corporation 82740 (i740) AGP Graphics Accelerator (rev 21)

pmjdebruijn@tsunami:~$ dmesg
...
[ 60.582008] hda: task_in_intr: status=0x59 { DriveReady SeekComplete DataRequest Error }
[ 60.582041] hda: task_in_intr: error=0x10 { SectorIdNotFound }, LBAsect=78230639, sector=78165360
[ 60.582078] ide: failed opcode was: unknown
[ 60.754348] hda: task_in_intr: status=0x59 { DriveReady SeekComplete DataRequest Error }
[ 60.754380] hda: task_in_intr: error=0x10 { SectorIdNotFound }, LBAsect=78230639, sector=78165360
[ 60.754418] ide: failed opcode was: unknown
[ 60.926728] hda: task_in_intr: status=0x59 { DriveReady SeekComplete DataRequest Error }
[ 60.926760] hda: task_in_intr: error=0x10 { SectorIdNotFound }, LBAsect=78230639, sector=78165360
[ 60.926798] ide: failed opcode was: unknown
[ 61.098983] hda: task_in_intr: status=0x59 { DriveReady SeekComplete DataRequest Error }
[ 61.099016] hda: task_in_intr: error=0x10 { SectorIdNotFound }, LBAsect=78230639, sector=78165360
[ 61.099053] ide: failed opcode was: unknown
[ 61.139543] ide0: reset: success
[ 61.311921] hda: task_in_intr: status=0x59 { DriveReady SeekComplete DataRequest Error }
[ 61.311953] hda: task_in_intr: error=0x10 { SectorIdNotFound }, LBAsect=78230639, sector=78165360
[ 61.311991] ide: failed opcode was: unknown
[ 61.312141] end_request: I/O error, dev hda, sector 78165360
[ 61.312167] Buffer I/O error on device hda, logical block 78165360
...
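
For readers unfamiliar with the register values in the log above, here is a minimal sketch that decodes the status and error bytes into the flag names the kernel prints. The bit tables only cover the labels I am confident the 2.6-era IDE driver uses; any other bit is reported as raw hex. The names STATUS_BITS, ERROR_BITS and decode are made up for this illustration.

```python
# Sketch only: maps ATA status/error register bits to the labels seen in dmesg.
# Bits without a known label are returned as hex strings instead of guessed names.

STATUS_BITS = [
    (0x80, "Busy"),
    (0x40, "DriveReady"),
    (0x20, "DeviceFault"),
    (0x10, "SeekComplete"),
    (0x08, "DataRequest"),
    (0x04, "CorrectedError"),
    (0x02, "Index"),
    (0x01, "Error"),
]

ERROR_BITS = [
    (0x40, "UncorrectableError"),
    (0x10, "SectorIdNotFound"),
    (0x04, "DriveStatusError"),
    (0x02, "TrackZeroNotFound"),
    (0x01, "AddrMarkNotFound"),
]


def decode(value, table):
    """Return the flag names set in value, highest bit first."""
    known = dict(table)
    names = []
    for shift in range(7, -1, -1):
        mask = 1 << shift
        if value & mask:
            # Fall back to the raw mask when the table has no label for it.
            names.append(known.get(mask, hex(mask)))
    return names


if __name__ == "__main__":
    print("status=0x59:", " ".join(decode(0x59, STATUS_BITS)))
    print("error=0x10: ", " ".join(decode(0x10, ERROR_BITS)))
```

The decoded names match the text in braces on each log line, so the kernel is reporting the drive's own "sector ID not found" answer rather than a bus-level CRC failure.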

Is it safe to ignore these errors?

Anyway, considering that CERN Linux ran fine (kernel 2.6.9 with patches), it seems a buggy patch might have crept into the Dapper kernel.

Jeff Balderson (jbalders) wrote :

It looks to me like you've been bitten by this bug:

https://launchpad.net/distros/ubuntu/+source/linux-source-2.6.15/+bug/26119/+index

If you run 'hdparm /dev/hda' you will likely also see "using_dma = 0 (off)". That seems to be the other symptom of this problem.

Pascal de Bruijn (pmjdebruijn) wrote :

That could very well be the case:

pmjdebruijn@tsunami:~$ sudo hdparm /dev/hda

/dev/hda:
 multcount = 0 (off)
 IO_support = 0 (default 16-bit)
 unmaskirq = 0 (off)
 using_dma = 0 (off)
 keepsettings = 0 (off)
 readonly = 0 (off)
 readahead = 256 (on)
 geometry = 65535/16/63, sectors = 78165361, start = 0
pmjdebruijn@tsunami:~$

My healthy system differs quite a bit:
 IO_support = 1 (32-bit)
 unmaskirq = 1 (on)
 using_dma = 1 (on)

So yes, I could possibly have been bitten by that bug.
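
A minimal sketch for checking this symptom automatically: it parses the text of an 'hdparm /dev/hda' report (pasted in as a string, taken from the output above) and flags a drive that is running without DMA. The function names parse_hdparm and dma_is_off are hypothetical, and the regular expression assumes the "name = number (description)" layout shown in this report.

```python
import re

# Sample copied from the hdparm output posted in this report.
SAMPLE = """\
/dev/hda:
 multcount = 0 (off)
 IO_support = 0 (default 16-bit)
 unmaskirq = 0 (off)
 using_dma = 0 (off)
 keepsettings = 0 (off)
 readonly = 0 (off)
 readahead = 256 (on)
 geometry = 65535/16/63, sectors = 78165361, start = 0
"""

# Matches lines such as " using_dma = 0 (off)"; the geometry line is skipped
# because its value is not a plain integer followed by a parenthesis.
SETTING_RE = re.compile(r"^\s*(\w+)\s*=\s*(\d+)\s*\(")


def parse_hdparm(text):
    """Return a dict of the integer settings found in hdparm's report."""
    settings = {}
    for line in text.splitlines():
        match = SETTING_RE.match(line)
        if match:
            settings[match.group(1)] = int(match.group(2))
    return settings


def dma_is_off(settings):
    """True when the report explicitly says using_dma = 0."""
    return settings.get("using_dma") == 0


if __name__ == "__main__":
    parsed = parse_hdparm(SAMPLE)
    print("DMA off:", dma_is_off(parsed))
```

This only reads the report; it says nothing about whether DMA was turned off before or after the errors started.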

But couldn't the disabled DMA be the result of those IDE errors, instead of the other way around? It seems logical that when multiple IDE errors occur, DMA gets disabled to rule out a faulty DMA controller.

Rouben (rouben) wrote :

A couple of suggestions:
1. Enable DMA using hdparm and see if that gets rid of the errors.
2. Try getting the latest BIOS revision for your motherboard.
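
For suggestion 1, a small sketch of how the DMA toggle could be scripted. It relies on hdparm's -d flag (-d1 turns DMA on, -d0 turns it off); the wrapper names dma_command and set_dma are made up for this example, and actually running the command needs root and a real /dev/hda.

```python
import subprocess


def dma_command(device, enable=True):
    """Build the hdparm argument list that switches DMA on or off."""
    flag = "-d1" if enable else "-d0"
    return ["hdparm", flag, device]


def set_dma(device, enable=True):
    """Run hdparm and return its exit code (requires root privileges)."""
    return subprocess.run(dma_command(device, enable), check=False).returncode


if __name__ == "__main__":
    # Print the command instead of running it, so the sketch is safe to try.
    print(" ".join(dma_command("/dev/hda", enable=True)))
```

If the kernel disables DMA again after the next batch of errors, the setting will not stick, which would itself be useful information for this report.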

Pascal de Bruijn (pmjdebruijn) wrote :

I already have the latest Tyan BIOS installed on the machine.

And DMA got disabled because of those errors, not despite them.

Pascal de Bruijn (pmjdebruijn) wrote :

I need to test whether this problem persists with Edgy Eft.

Rouben (rouben) wrote :

Can someone confirm whether this bug is indeed a duplicate of bug #26119? If so, I'd like to mark it as such. Thanks!

Rouben (rouben)
Changed in linux-source-2.6.15:
status: Unconfirmed → Needs Info
Paolo Sammicheli (xdatap1) wrote :

Could you confirm whether this problem still persists on Edgy?

Changed in linux-source-2.6.15:
assignee: nobody → xdatap1
Jeff Balderson (jbalders) wrote :

Rouben:

I can't confirm whether this is actually a duplicate of 26119, but it sounds virtually identical. The problem only occurs with particular drives. The Seagate ST320413A referenced below triggers it, but neither a Seagate ST320420A nor a Western Digital "WDC WD400EB-00CPF0" does.

Paolo:

Assuming it's the same bug, yes, it still persists. This is on a Sun Ultra 5 that's had the problem since it was first reported -- I can't remember whether it started in Breezy or Dapper.

root@ull:/etc# cat lsb-release
DISTRIB_ID=Ubuntu
DISTRIB_RELEASE=6.10
DISTRIB_CODENAME=edgy
DISTRIB_DESCRIPTION="Ubuntu 6.10"

root@ull:/etc# cat /proc/ide/hda/model
ST320413A

---[partial dmesg]---
[ 36.653285] hda: dma_intr: status=0x51 { DriveReady SeekComplete Error }
[ 36.732586] hda: dma_intr: error=0x10 { SectorIdNotFound }, LBAsect=39102336, sector=39102336
[ 36.834668] ide: failed opcode was: unknown
[ 37.025125] hda: dma_intr: status=0x51 { DriveReady SeekComplete Error }
[ 37.104357] hda: dma_intr: error=0x10 { SectorIdNotFound }, LBAsect=39102336, sector=39102336
[ 37.206434] ide: failed opcode was: unknown
[ 37.397036] hda: dma_intr: status=0x51 { DriveReady SeekComplete Error }
[ 37.476233] hda: dma_intr: error=0x10 { SectorIdNotFound }, LBAsect=39102336, sector=39102336
[ 37.578311] ide: failed opcode was: unknown
[ 37.768906] hda: dma_intr: status=0x51 { DriveReady SeekComplete Error }
[ 37.848109] hda: dma_intr: error=0x10 { SectorIdNotFound }, LBAsect=39102336, sector=39102336
[ 37.950187] ide: failed opcode was: unknown
[ 38.000199] hda: DMA disabled
[ 38.094499] ide0: reset: success

Since my Sparc shows the exact same symptoms, it's likely a chipset problem or related to particular drives.

Jeff Balderson (jbalders) wrote :

I probably should have included a bit more of the dmesg output. Here is where it's the same. The above section immediately preceded this:

[ 38.303408] hda: task_in_intr: status=0x59 { DriveReady SeekComplete DataRequest Error }
[ 38.399361] hda: task_in_intr: error=0x10 { SectorIdNotFound }, LBAsect=39167615, sector=39102336
[ 38.505605] ide: failed opcode was: unknown
[ 38.726907] hda: task_in_intr: status=0x59 { DriveReady SeekComplete DataRequest Error }
[ 38.822800] hda: task_in_intr: error=0x10 { SectorIdNotFound }, LBAsect=39167615, sector=39102336
[ 38.929044] ide: failed opcode was: unknown
[ 39.150264] hda: task_in_intr: status=0x59 { DriveReady SeekComplete DataRequest Error }
[ 39.246131] hda: task_in_intr: error=0x10 { SectorIdNotFound }, LBAsect=39167615, sector=39102336
[ 39.352381] ide: failed opcode was: unknown
[ 39.573516] hda: task_in_intr: status=0x59 { DriveReady SeekComplete DataRequest Error }
[ 39.669469] hda: task_in_intr: error=0x10 { SectorIdNotFound }, LBAsect=39167615, sector=39102336
[ 39.775716] ide: failed opcode was: unknown
[ 39.870498] ide0: reset: success
[ 40.079414] hda: task_in_intr: status=0x59 { DriveReady SeekComplete DataRequest Error }
[ 40.175304] hda: task_in_intr: error=0x10 { SectorIdNotFound }, LBAsect=39167615, sector=39102336
[ 40.281546] ide: failed opcode was: unknown
[ 40.331693] end_request: I/O error, dev hda, sector 39102336
[ 40.399265] Buffer I/O error on device hda, logical block 39102336
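
One way to compare the two reports is to look at the numbers in the task_in_intr lines. The sketch below extracts LBAsect and sector from one line of each report and computes the difference. The helper name lba_offsets is hypothetical, and this is plain arithmetic on the posted values; it does not by itself establish the cause.

```python
import re

# One task_in_intr error line from each reporter, copied from this bug report.
TSUNAMI_LINE = (
    "[ 60.582041] hda: task_in_intr: error=0x10 { SectorIdNotFound }, "
    "LBAsect=78230639, sector=78165360"
)
ULL_LINE = (
    "[ 38.399361] hda: task_in_intr: error=0x10 { SectorIdNotFound }, "
    "LBAsect=39167615, sector=39102336"
)

PAIR_RE = re.compile(r"LBAsect=(\d+), sector=(\d+)")


def lba_offsets(log_text):
    """Return the set of (LBAsect - sector) differences found in the log text."""
    return {int(lba) - int(sector) for lba, sector in PAIR_RE.findall(log_text)}


if __name__ == "__main__":
    for name, line in (("tsunami", TSUNAMI_LINE), ("ull", ULL_LINE)):
        for offset in sorted(lba_offsets(line)):
            print(f"{name}: LBAsect - sector = {offset} ({offset:#x})")
```

Both machines give the same difference, 65279 (0xfeff), even though the drives differ in size. On the Intel machine the failing sector, 78165360, is also exactly one less than the "sectors = 78165361" that hdparm reported, so the failing request is at the very end of the disk.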

Changed in linux-source-2.6.15:
assignee: xdatap1 → nobody
Launchpad Janitor (janitor) wrote :

[Expired for linux-source-2.6.15 (Ubuntu) because there has been no activity for 60 days.]
