sky2 hardware hung? flushing (System crash (reproducible))

Bug #81713 reported by Anders
This bug report is a duplicate of: Bug #68338: sky2 driver stalls.
Affects                        Status      Importance  Assigned to  Milestone
linux (Ubuntu)                 Incomplete  Undecided   Unassigned
    Nominated for Feisty by Anders
linux-source-2.6.17 (Ubuntu)   Won't Fix   Undecided   Unassigned
    Nominated for Feisty by Anders

Bug Description

Binary package hint: linux-source-2.6.17

I've had 2 occurrences of a complete overnight computer hang (no keyboard/mouse action brings it out of the black screen when I come back to it).

Both times the computer had been left on running KTorrent, and both times it logged the following to /var/log/syslog: [17230758.016000] NETDEV WATCHDOG: eth0: transmit timed out

Snippets:

Jan 23 08:17:01 plasken /USR/SBIN/CRON[29035]: (root) CMD ( run-parts --report /etc/cron.hourly)
Jan 23 08:40:10 plasken -- MARK --
Jan 23 09:00:10 plasken -- MARK --
Jan 23 09:17:01 plasken /USR/SBIN/CRON[30626]: (root) CMD ( run-parts --report /etc/cron.hourly)
Jan 23 09:24:08 plasken kernel: [17226576.348000] NETDEV WATCHDOG: eth0: transmit timed out
Jan 23 09:24:08 plasken kernel: [17226576.348000] sky2 eth0: tx timeout
Jan 23 09:24:08 plasken kernel: [17226576.348000] sky2 eth0: transmit ring 152 .. 129 report=152 done=152
Jan 23 09:24:08 plasken kernel: [17226576.348000] sky2 hardware hung? flushing
Jan 23 09:31:58 plasken kernel: [17227046.348000] NETDEV WATCHDOG: eth0: transmit timed out
Jan 23 09:31:58 plasken kernel: [17227046.348000] sky2 eth0: tx timeout
Jan 23 09:31:58 plasken kernel: [17227046.348000] sky2 eth0: transmit ring 129 .. 106 report=152 done=152
Jan 23 09:31:58 plasken kernel: [17227046.348000] sky2 status report lost?
Jan 23 09:32:18 plasken kernel: [17227066.348000] NETDEV WATCHDOG: eth0: transmit timed out
Jan 23 09:32:18 plasken kernel: [17227066.348000] sky2 eth0: tx timeout
Jan 23 09:32:18 plasken kernel: [17227066.348000] sky2 eth0: transmit ring 152 .. 129 report=152 done=152
Jan 23 09:32:18 plasken kernel: [17227066.348000] sky2 hardware hung? flushing
Jan 23 09:32:33 plasken kernel: [17227081.764000] sky2 eth0: rx error, status 0x7ffc0001 length 564
Jan 23 09:32:33 plasken kernel: [17227081.764000] BUG: warning at include/net/dst.h:153/dst_release()
Jan 23 09:32:33 plasken kernel: [17227081.764000] <c026cade> __kfree_skb+0xfe/0x110 <f8a03fc4> sky2_tx_complete+0xa4/0x140 [sky2]
Jan 23 09:32:33 plasken kernel: [17227081.764000] <f8a05f1c> sky2_poll+0x75c/0x960 [sky2] <c012bb70> lock_timer_base+0x20/0x50
Jan 23 09:32:33 plasken kernel: [17227081.764000] <c0272dfb> net_rx_action+0xbb/0x190 <c0127842> __do_softirq+0x72/0xe0
Jan 23 09:32:33 plasken kernel: [17227081.764000] <c01278e5> do_softirq+0x35/0x40 <c010413c> apic_timer_interrupt+0x1c/0x30
Jan 23 09:32:33 plasken kernel: [17227081.764000] BUG: warning at include/net/dst.h:153/dst_release()
Jan 23 09:32:33 plasken kernel: [17227081.764000] <c026cade> __kfree_skb+0xfe/0x110 <f8a03fc4> sky2_tx_complete+0xa4/0x140 [sky2]
Jan 23 09:32:33 plasken kernel: [17227081.764000] <f8a05f1c> sky2_poll+0x75c/0x960 [sky2] <c012bb70> lock_timer_base+0x20/0x50
Jan 23 09:32:33 plasken kernel: [17227081.764000] <c0272dfb> net_rx_action+0xbb/0x190 <c0127842> __do_softirq+0x72/0xe0
Jan 23 09:32:33 plasken kernel: [17227081.764000] <c01278e5> do_softirq+0x35/0x40 <c010413c> apic_timer_interrupt+0x1c/0x30
Jan 23 09:32:33 plasken kernel: [17227081.764000] Attempt to release alive inet socket f7981080
Jan 23 09:32:33 plasken kernel: [17227081.764000] BUG: warning at include/net/dst.h:153/dst_release()
Jan 23 09:32:33 plasken kernel: [17227081.764000] <c026cade> __kfree_skb+0xfe/0x110 <f8a03fc4> sky2_tx_complete+0xa4/0x140 [sky2]
Jan 23 09:32:33 plasken kernel: [17227081.764000] <f8a05f1c> sky2_poll+0x75c/0x960 [sky2] <c012bb70> lock_timer_base+0x20/0x50
Jan 23 09:32:33 plasken kernel: [17227081.764000] <c0272dfb> net_rx_action+0xbb/0x190 <c0127842> __do_softirq+0x72/0xe0
Jan 23 09:32:33 plasken kernel: [17227081.764000] <c01278e5> do_softirq+0x35/0x40 <c010413c> apic_timer_interrupt+0x1c/0x30
Jan 23 09:32:33 plasken kernel: [17227081.764000] BUG: warning at include/net/dst.h:153/dst_release()
Jan 23 09:32:33 plasken kernel: [17227081.764000] <c026cade> __kfree_skb+0xfe/0x110 <f8a03fc4> sky2_tx_complete+0xa4/0x140 [sky2]
Jan 23 09:32:33 plasken kernel: [17227081.764000] <f8a05f1c> sky2_poll+0x75c/0x960 [sky2] <c012bb70> lock_timer_base+0x20/0x50
Jan 23 09:32:33 plasken kernel: [17227081.764000] <c0272dfb> net_rx_action+0xbb/0x190 <c0127842> __do_softirq+0x72/0xe0
Jan 23 09:32:33 plasken kernel: [17227081.764000] <c01278e5> do_softirq+0x35/0x40 <c010413c> apic_timer_interrupt+0x1c/0x30
Jan 23 09:32:33 plasken kernel: [17227081.772000] BUG: warning at include/net/dst.h:153/dst_release()
Jan 23 09:32:33 plasken kernel: [17227081.772000] <c026cade> __kfree_skb+0xfe/0x110 <f8a03fc4> sky2_tx_complete+0xa4/0x140 [sky2]
Jan 23 09:32:33 plasken kernel: [17227081.772000] <f8a05f1c> sky2_poll+0x75c/0x960 [sky2] <c0272dfb> net_rx_action+0xbb/0x190
Jan 23 09:32:33 plasken kernel: [17227081.772000] <c0127842> __do_softirq+0x72/0xe0 <c01278e5> do_softirq+0x35/0x40
Jan 23 09:32:33 plasken kernel: [17227081.772000] <c0105c8e> do_IRQ+0x1e/0x30 <c010408a> common_interrupt+0x1a/0x20
Jan 23 09:32:33 plasken kernel: [17227081.772000] <c01021b9> mwait_idle+0x29/0x40 <c0102122> cpu_idle+0x42/0xb0
Jan 23 09:32:33 plasken kernel: [17227081.772000] <c03f07a1> start_kernel+0x321/0x3a0 <c03f0210> unknown_bootoption+0x0/0x270
Jan 23 09:32:35 plasken kernel: [17227082.812000] BUG: warning at include/net/dst.h:153/dst_release()
Jan 23 09:32:35 plasken kernel: [17227082.812000] <c026cade> __kfree_skb+0xfe/0x110 <f8a03fc4> sky2_tx_complete+0xa4/0x140 [sky2]
Jan 23 09:32:35 plasken kernel: [17227082.812000] <f8a05f1c> sky2_poll+0x75c/0x960 [sky2] <c0272dfb> net_rx_action+0xbb/0x190
Jan 23 09:32:35 plasken kernel: [17227082.812000] <c0127842> __do_softirq+0x72/0xe0 <c01278e5> do_softirq+0x35/0x40
Jan 23 09:32:35 plasken kernel: [17227082.812000] <c0105c8e> do_IRQ+0x1e/0x30 <c010408a> common_interrupt+0x1a/0x20
Jan 23 09:32:35 plasken kernel: [17227082.812000] <c01021b9> mwait_idle+0x29/0x40 <c0102122> cpu_idle+0x42/0xb0
Jan 23 09:32:35 plasken kernel: [17227082.812000] <c03f07a1> start_kernel+0x321/0x3a0 <c03f0210> unknown_bootoption+0x0/0x270
Jan 23 19:52:56 plasken syslogd 1.4.1#18ubuntu6: restart.
Jan 23 19:52:56 plasken kernel: Inspecting /boot/System.map-2.6.17-10-generic

2nd occurrence:

Jan 24 09:16:17 plasken ntpd[5263]: synchronized to 130.88.200.98, stratum 2
Jan 24 09:17:01 plasken /USR/SBIN/CRON[20448]: (root) CMD ( run-parts --report /etc/cron.hourly)
Jan 24 09:26:02 plasken ntpd[5263]: synchronized to 130.159.196.118, stratum 2
Jan 24 09:38:31 plasken -- MARK --
Jan 24 09:58:31 plasken -- MARK --
Jan 24 10:05:21 plasken kernel: [17230758.016000] NETDEV WATCHDOG: eth0: transmit timed out
Jan 24 10:05:21 plasken kernel: [17230758.016000] sky2 eth0: tx timeout
Jan 24 10:05:21 plasken kernel: [17230758.016000] sky2 eth0: transmit ring 284 .. 262 report=284 done=284
Jan 24 10:05:21 plasken kernel: [17230758.016000] sky2 hardware hung? flushing
Jan 25 09:01:00 plasken syslogd 1.4.1#18ubuntu6: restart.
Jan 25 09:01:00 plasken kernel: Inspecting /boot/System.map-2.6.17-10-generic

Possibly a duplicate of Bug #79198, but there is not enough info to tell.
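
For anyone comparing their own logs against this report, the recurring signature in the syslog excerpts above (watchdog timeout, sky2 tx timeout, "hardware hung? flushing", "status report lost?") can be counted with a short script. This is only a sketch: the function name `count_sky2_events` and the pattern labels are made up for illustration, and the patterns are taken from the exact message wording quoted in this report, so other kernel versions may word them differently.

```python
import re

# Patterns copied from the kernel messages quoted in this report, e.g.
# "... kernel: [17226576.348000] sky2 hardware hung? flushing"
# The dictionary keys are illustrative labels, not kernel terminology.
EVENT_PATTERNS = {
    "watchdog_timeout": re.compile(r"NETDEV WATCHDOG: (\S+): transmit timed out"),
    "tx_timeout": re.compile(r"sky2 (\S+): tx timeout"),
    "hardware_hung": re.compile(r"sky2 hardware hung\? flushing"),
    "status_lost": re.compile(r"sky2 status report lost\?"),
}


def count_sky2_events(lines):
    """Count how many of the given log lines match each pattern."""
    counts = {name: 0 for name in EVENT_PATTERNS}
    for line in lines:
        for name, pattern in EVENT_PATTERNS.items():
            if pattern.search(line):
                counts[name] += 1
    return counts
```

It accepts any iterable of lines, so an open file works, for example `count_sky2_events(open("/var/log/syslog", errors="replace"))`.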

Ubuntu Hardware Database reference: 4e589d05afce3c4457db0164c0e7f207

Anders (andersja+launchpad-net) wrote :

Additional hardware information: Output of sudo lshw

Anders (andersja+launchpad-net) wrote :

Another occurrence happened today. Again I returned to a completely black screen with the computer completely hung, and had to use the power button to restart. Syslog extract from before/after the incident:

Jan 28 15:41:10 plasken -- MARK --
Jan 28 16:01:11 plasken -- MARK --
Jan 28 16:11:37 plasken ntpd[7635]: synchronized to 134.214.100.6, stratum 2
Jan 28 16:13:51 plasken ntpd[7635]: synchronized to 130.88.200.98, stratum 2
Jan 28 16:14:37 plasken kernel: [17228786.172000] NETDEV WATCHDOG: eth0: transmit timed out
Jan 28 16:14:37 plasken kernel: [17228786.172000] sky2 eth0: tx timeout
Jan 28 16:14:37 plasken kernel: [17228786.172000] sky2 eth0: transmit ring 77 .. 54 report=77 done=77
Jan 28 16:14:37 plasken kernel: [17228786.172000] sky2 hardware hung? flushing
Jan 28 18:01:51 plasken syslogd 1.4.1#18ubuntu6: restart.
Jan 28 18:01:51 plasken kernel: Inspecting /boot/System.map-2.6.17-10-generic

jcfp (jcfp) wrote :

This bug has been reported before; for more information see bug #68338, of which this report is a duplicate. It seems that a fix has been uploaded and is in edgy-proposed with a request for testing.

As a workaround for this bug you could use the sk98lin driver instead of sky2, by adding a blacklist entry for the latter to /etc/modprobe.d/blacklist and rebooting.
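
A blacklist entry is a line of the form `blacklist sky2` in that file. The sketch below checks whether such an entry is present and not commented out. The helper name `is_blacklisted` and the sample text are hypothetical, and the check covers only this simple one-line form.

```python
def is_blacklisted(config_text, module):
    """Return True if config_text has an active 'blacklist <module>' line."""
    for raw in config_text.splitlines():
        # Drop trailing comments and surrounding whitespace.
        line = raw.split("#", 1)[0].strip()
        fields = line.split()
        if len(fields) == 2 and fields[0] == "blacklist" and fields[1] == module:
            return True
    return False


# Hypothetical file contents, for illustration only.
example = """\
# excerpt of /etc/modprobe.d/blacklist
blacklist sky2
"""
print(is_blacklisted(example, "sky2"))  # True
```

On a real system the text would come from reading /etc/modprobe.d/blacklist; the entry only takes effect after the reboot mentioned above.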

Kyle McMartin (kyle) wrote :

Please try the kernel in -proposed, which has an updated sky2 driver.

Sheer Pullen (sheer-panic) wrote :

I have a similar problem on a Dapper system. It is *very* reproducible: whenever I copy files from a gigabit host, it takes about 5 minutes to hang. The Ethernet interface is then unusable, but the machine stays stable.

Problem still exists in 2.6.17-50, the latest kernel I've been able to find .deb files for.

Trying the blacklist solution...

Well, the system still finds the card with that module blacklisted. Time will tell whether it locks up or not. I've just started another attempt to copy all my MP3s from a server that is being retired to the machine in question (both have gigabit Ethernet cards; this has been a sure-fire way to get the machine incommunicado so far).
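
The card still being detected is not conclusive by itself, since another driver (for example sk98lin, the alternative suggested earlier in this report) may have claimed it. One way to see which driver is actually handling the interface is to read the driver symlink in sysfs. This is a sketch under assumptions: the helper name `bound_driver` is made up, and the path layout `/sys/class/net/<iface>/device/driver` is assumed to be present on the kernel in question.

```python
import os


def bound_driver(iface, sysfs_root="/sys"):
    """Return the name of the driver bound to iface, or None if unknown.

    Assumes the layout <sysfs_root>/class/net/<iface>/device/driver,
    where 'driver' is a symlink whose last path component is the driver name.
    """
    link = os.path.join(sysfs_root, "class", "net", iface, "device", "driver")
    try:
        return os.path.basename(os.readlink(link))
    except OSError:
        # Interface missing, no driver link, or not a symlink.
        return None
```

Calling `bound_driver("eth0")` after the reboot would be expected to return something other than `sky2` if the blacklist entry worked; a result of `None` only means the link could not be read.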

Launchpad Janitor (janitor) wrote : This bug is now reported against the 'linux' package

The 18 month support period for Edgy Eft 6.10 has reached its end of life. As a result, we are closing the linux-source-2.6.17 Edgy Eft kernel task. However, development has already begun for the upcoming Intrepid Ibex 8.10 release. It would be helpful if you could test the upcoming release and verify if this is still an issue - http://www.ubuntu.com/testing . If the issue still exists, please update this report by changing the Status of the "linux" task from "Incomplete" to "New". We appreciate your patience and understanding as we make this transition. Thanks!
