sata-via: read errors, slowdowns with VIA VT6420

Bug #676644 reported by albatros
This bug affects 1 person
Affects          Status        Importance  Assigned to     Milestone
linux (Ubuntu)   Fix Released  High        Andy Whitcroft
Maverick         Fix Released  High        Andy Whitcroft

Bug Description

I have noticed slowdowns when reading from a Western Digital Green Power disk (WD10EARS) attached to the SATA controller of my mainboard (Jetway J7F4). The controller is a VT6420 in a VT8237R+ southbridge.

The kernel errors are repetitions of the following:

ata1.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x6
[ 476.700552] ata1.00: BMDMA stat 0x25
[ 476.700614] ata1.00: failed command: READ DMA EXT
[ 476.700691] ata1.00: cmd 25/00:00:00:36:6e/00:02:22:00:00/e0 tag 0 dma 262144 in
[ 476.700697] res 51/84:bf:00:36:6e/84:01:22:00:00/e0 Emask 0x10 (ATA bus error)
[ 476.700824] ata1.00: status: { DRDY ERR }
[ 476.700884] ata1.00: error: { ICRC ABRT }
[ 476.700952] ata1: soft resetting link
[ 476.888749] ata1.00: configured for UDMA/133
[ 476.888823] ata1: EH complete

If the controller is stressed by a heavier load, the errors repeat often enough that the interface is throttled back to UDMA/33 or lower.
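To read the "res" line in a log like the one above, the first two hex bytes (status/error) can be decoded into the same flag names the kernel prints on the "status:" and "error:" lines. The following is only a minimal sketch: the function names are made up for illustration, and the bit tables cover just the flags that libata prints in these messages, not the full ATA register definitions.

```python
import re

# Bits of the ATA status and error registers, limited to the flags that the
# kernel prints in its "status: { ... }" and "error: { ... }" lines.
# (Assumption: other bits are ignored in this sketch.)
STATUS_BITS = {0x80: "BUSY", 0x40: "DRDY", 0x20: "DF", 0x08: "DRQ", 0x01: "ERR"}
ERROR_BITS = {0x80: "ICRC", 0x40: "UNC", 0x10: "IDNF", 0x04: "ABRT"}

# Matches the leading "res SS/EE:" pair of a libata result line.
RES_RE = re.compile(r"res ([0-9a-f]{2})/([0-9a-f]{2}):")


def decode_bits(value, names):
    """Return the names of all bits set in value, highest bit first."""
    return [name for bit, name in sorted(names.items(), reverse=True) if value & bit]


def decode_res(line):
    """Decode status and error flags from a 'res ..' log line, or return None."""
    m = RES_RE.search(line)
    if m is None:
        return None
    status = int(m.group(1), 16)
    error = int(m.group(2), 16)
    return decode_bits(status, STATUS_BITS), decode_bits(error, ERROR_BITS)


line = "[ 476.700697] res 51/84:bf:00:36:6e/84:01:22:00:00/e0 Emask 0x10 (ATA bus error)"
print(decode_res(line))  # (['DRDY', 'ERR'], ['ICRC', 'ABRT'])
```

For the line above this gives DRDY ERR and ICRC ABRT, matching what the kernel printed. ICRC is an interface CRC error, meaning the data was corrupted on the link rather than on the platters.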

This bug seems to be related to an issue that has been reported with VT6421 controllers combined with WD green power disks. As far as I know it has something to do with a buffer mismatch between disk and controller.

A patch has recently been proposed ([PATCH] sata-via: enable magic transmission fix for vt6420); details can be found at https://patchwork.kernel.org/patch/323272/

albatros (jda) wrote :

Apparently the driver throttles the disk back to PIO0 eventually... quite unusable.

I have also spotted some different errors:

[13873.934590] ata3.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x6 frozen
[13873.934684] ata3.00: failed command: READ MULTIPLE
[13873.934789] ata3.00: cmd c4/00:08:90:d3:eb/00:00:00:00:00/e6 tag 0 pio 4096 in
[13873.934804] res 58/00:07:91:d3:eb/84:01:06:00:00/e6 Emask 0x2 (HSM violation)
[13873.934957] ata3.00: status: { DRDY DRQ }
[13873.935027] ata3: soft resetting link
[13874.105263] ata3.00: configured for PIO0
[13874.105335] ata3: EH complete
[13874.114626] ata3: drained 65536 bytes to clear DRQ.
[13874.174559] ata3.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x6 frozen
[13874.174652] ata3.00: failed command: READ MULTIPLE
[13874.174756] ata3.00: cmd c4/00:08:90:d3:eb/00:00:00:00:00/e6 tag 0 pio 4096 in
[13874.174772] res 58/00:07:91:d3:eb/84:01:06:00:00/e6 Emask 0x2 (HSM violation)
[13874.174924] ata3.00: status: { DRDY DRQ }
[13874.174995] ata3: soft resetting link
[13874.345499] ata3.00: configured for PIO0
[13874.345571] ata3: EH complete
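To see how far a device has been throttled back, the "configured for ..." lines can be collected per device. The following is a rough sketch, not a tool from this report: the function name is hypothetical, and it assumes the log format shown in the excerpts above.

```python
import re

# Matches e.g. "ata3.00: configured for PIO0" and captures device and mode.
CONFIGURED_RE = re.compile(r"(ata\d+\.\d+): configured for (\S+)")


def transfer_mode_history(log_lines):
    """Map each ATA device to the sequence of transfer modes it was set to.

    Consecutive repeats of the same mode are collapsed, so a change in the
    list means the error handler picked a different (usually slower) mode.
    """
    history = {}
    for line in log_lines:
        m = CONFIGURED_RE.search(line)
        if m:
            dev, mode = m.groups()
            modes = history.setdefault(dev, [])
            if not modes or modes[-1] != mode:
                modes.append(mode)
    return history


# Sample lines taken from the excerpts in this report.
sample = [
    "[ 476.888749] ata1.00: configured for UDMA/133",
    "[13874.105263] ata3.00: configured for PIO0",
    "[13874.345499] ata3.00: configured for PIO0",
]
print(transfer_mode_history(sample))
```

Fed with the full kernel log (for example the lines of saved dmesg output), a device whose list ends in PIO0 has been stepped all the way down.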

albatros (jda) wrote :

Output of lspci -nnvv

Andy Whitcroft (apw)
Changed in linux (Ubuntu):
assignee: nobody → Andy Whitcroft (apw)
importance: Undecided → High
status: New → In Progress
Andy Whitcroft (apw) wrote :

It seems that this change has hit Linus' tree as the commit below:

  commit b1353e4f40f6179ab26a3bb1b2e1fe29ffe534f5
  Author: Tejun Heo <email address hidden>
  Date: Fri Nov 19 15:29:19 2010 +0100

    sata_via: apply magic FIFO fix to vt6420 too

I have pulled this fix back to Maverick and built some test kernels. Could those of you affected please test these kernels and report back here? The kernels are at the URL below:

    http://people.canonical.com/~apw/lp676644-maverick/

Thanks.

albatros (jda) wrote :

I have done some fairly heavy testing, reading multiple gigabytes at high speed, and saw no more errors.

It looks like the issue has been fixed in the test kernel provided at the link above (uname: Linux Hive 2.6.35-24-generic #41~lp676644v201011291520 SMP Mon Nov 29 15:22:32 UTC 2010 i686 GNU/Linux).

Andy Whitcroft (apw)
Changed in linux (Ubuntu):
status: In Progress → Fix Released
Changed in linux (Ubuntu Maverick):
status: New → In Progress
importance: Undecided → High
assignee: nobody → Andy Whitcroft (apw)
albatros (jda) wrote :

I have been using the test kernel for almost two months now, and it has been working without problems. However, I have not spotted this bug in the two kernel updates that have been distributed to Maverick in the meantime. Has this been fixed in Maverick yet?

albatros (jda) wrote :

I meant 'I have not spotted this fix' instead of 'I have not spotted this bug', unfortunately. I think the fix has been tested enough; I am confident it will not cause any adverse effects.

mabawsa (mabawsa) wrote :

I am having the same issues so I marked my bug as a duplicate.

Julian Wiedmann (jwiedmann) wrote :

albatros:
could you try a current Maverick kernel? The patch was included in 2.6.35-25.44 (which was released a month ago).

Zsolt Horváth (zsolt-horvath-gmail) wrote :

After upgrading to 2.6.35-25.44 the error messages seem to have disappeared, and the system is still stable (WD20EARS-00M attached to a mobo with VT6420).

Andy Whitcroft (apw) wrote :

Based on the testing results, I am also closing this as fixed for Maverick.

Changed in linux (Ubuntu Maverick):
status: In Progress → Fix Released
albatros (jda) wrote :

The issue has been fixed in 2.6.35-25.44. I had not found the bug in the changelog of the regular Maverick kernels, so I had not dared to try these kernels. Thanks
