Bug #60764 “Large file transfer gives error: Corrupted MAC on input” : Bugs : linux-source-2.6.17 package : Ubuntu

Mika Fischer (zoop) wrote on 2006-09-18:

#1

Well, I can't seem to reproduce this after the latest kernel update of dapper.

So you can consider this fixed.

Mika Fischer (zoop) wrote on 2006-09-19:

#2

Correction: I can reproduce it, it just takes longer until the error occurs. So it's not fixed after all...

Brian Murray (brian-murray) wrote on 2007-02-07:

#3

Thanks for your bug report. I was wondering if this is still an issue for you. If it is, approximately how large a file are you copying? Furthermore, do you notice anything in 'dmesg' when the connection is lost? Thanks in advance.

Changed in openssh:
assignee:	nobody → brian-murray
status:	Unconfirmed → Needs Info

Mika Fischer (zoop) wrote on 2007-02-07:

#4

Unfortunately it's still an issue.

I first suspected the WLAN router but now the machines are directly connected by a switch and the problem still occurs...

I initially noticed the problem with a video file taken with a digicam. I don't know exactly how big it was, probably around 50 MB. But the problem occurs rather randomly. The longer the file the higher the probability that the error occurs. As it is, I cannot even transmit a 20 MB file this way.

Well. If you know a way to debug exactly where the MAC gets corrupted, I could try this. Other than that I don't know how this can be resolved.

I'll also see if I can get my hands on another network adapter and see if that changes anything.

In the meantime thanks for your time!

Brian Murray (brian-murray) wrote on 2007-02-07:

#5

Perhaps using scp in verbose mode would be more informative. The switch for verbose is '-v', so could you try 'scp -v'? Also, is there anything in your kernel log around the time when these errors occur?

Mika Fischer (zoop) wrote on 2007-02-07:

#6

Ah, sorry. I forgot to say that there's nothing in the kernel log, neither on the client nor on the server.

I've tried running the server and the client at LogLevel DEBUG3 and will attach the logs.

I also inserted some debug output which gives this additional information (as an example, the actual MACs differ of course each time) from the client.

Expected MAC: fe 89 0d f5 b8 a4 32 2e ff 50 c3 32 38 62 4c 84
Received MAC: 9d df fb 36 7b d3 7d 43 51 e9 92 9f 74 1e 20 c2

I can't really find a pattern here...

So maybe something really fishy is going on hardware-wise. I'll try replacing the NIC and checking the RAM of that machine...

Mika Fischer (zoop) wrote on 2007-02-07:

#7

Logfile from the client (18.3 KiB, text/plain)

Mika Fischer (zoop) wrote on 2007-02-07:

#8

Logfile from the server (19.1 KiB, text/plain)

Brian Murray (brian-murray) wrote on 2007-02-07:

#9

Could you please add information regarding the type of network adapter on both machines? ('lspci -vvn') Additionally if you could add 'sudo ethtool eth0' where eth0 is the network interface being used in the file transfer that would help. Thanks again.

Mika Fischer (zoop) wrote on 2007-02-07:

#10

lspci output from the server (6.7 KiB, text/plain)

On the server the NIC has PCI id 00:0a.0.

On the client I actually have no idea which one of the PCI devices corresponds to the NIC because it's an onboard one...

Mika Fischer (zoop) wrote on 2007-02-07:

#11

lspci output from the client (14.5 KiB, text/plain)

Mika Fischer (zoop) wrote on 2007-02-07:

#12

ethtool output from the server (482 bytes, text/plain)

Mika Fischer (zoop) wrote on 2007-02-07:

#13

ethtool output from the client (472 bytes, text/plain)

Brian Murray (brian-murray) wrote on 2007-02-07:

#14

Thanks for updating the bug report. It turns out that the output of 'ethtool -k' would be more informative. Could you add that too? I apologize for the mistake.

Mika Fischer (zoop) wrote on 2007-02-08:

#15

ethtool -k output from the server (319 bytes, text/plain)

No problem :)

Mika Fischer (zoop) wrote on 2007-02-08:

#16

ethtool -k output from the client (261 bytes, text/plain)

Kyle McMartin (kyle) wrote on 2007-02-08:

#17

This is extremely strange. Could you attach the dmesg from both the client and server, and the output of ifconfig from both? (Feel free to edit out any private IPs or anything like that)

You say you saw this when the client was on wireless and also tried plugging the client into the switch (so it's probably not a driver problem on the client then?)

Cheers,
Kyle

Mika Fischer (zoop) wrote on 2007-02-08:

#18

Come to think about it, I can pretty much rule out the client in this. This is a completely different machine than the one I used when I first reported this bug...

It also can't be something in the network infrastructure because it was the same when I wasn't using the switch but the server was directly connected to the WLAN router...

So the problem has to lie on the server-side. My guess is still some hardware issue...

I'll attach the info you asked for.

Regards,
Mika

Mika Fischer (zoop) wrote on 2007-02-08:

#19

dmesg output from the server (16.4 KiB, text/plain)

Mika Fischer (zoop) wrote on 2007-02-08:

#20

dmesg output from the client (23.8 KiB, text/plain)

Mika Fischer (zoop) wrote on 2007-02-08:

#21

ifconfig output from the server (1022 bytes, text/plain)

Mika Fischer (zoop) wrote on 2007-02-08:

#22

ifconfig output from the client (1.0 KiB, text/plain)

Brian Murray (brian-murray) wrote on 2007-02-16:

#23

To eliminate ssh from the equation I was wondering if you could test doing a file transfer with netcat. Here is an example of that:

At the server console:

$ nc -v -w 30 -p 5600 -l > filename.back

and on the client side:

$ nc -v -w 2 10.0.1.1 5600 < filename

The file named filename is being sent from the client to the server on port 5600 and the server is writing it to disk as filename.back. You could read more about using netcat in this little article:

http://www.oreillynet.com/pub/h/1058

Please let us know what you find out. Thanks in advance.

Mika Fischer (zoop) wrote on 2007-02-17:

#24

Very good idea!

As it turns out you're right. The same thing happens with netcat. Also only when the broken computer acts as server. The other way round works fine...

05fab97be7fd5e7c9229187c24c89ea0 test.bin.orig
05fab97be7fd5e7c9229187c24c89ea0 test.bin.m2s
7dcb7bef6d1af049bd63fcf6d180685e test.bin.s2m

I guess I'll just get a new NIC and see if this helps...

Any idea what else could be the cause of this?

And thanks a lot for helping me debug this!
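
The checksum comparison above can be scripted so that each transferred copy is checked against the original in one step. This is only a minimal sketch: the function name same_md5 is made up for illustration, and it assumes md5sum from coreutils is installed.

```shell
#!/bin/sh
# Compare a transferred copy against the original by MD5 checksum.
# Usage: same_md5 ORIGINAL COPY
# Exit status: 0 if the checksums match, 1 if they differ, 2 if a file is unreadable.
# (same_md5 is a hypothetical helper name, not part of any package.)
same_md5() {
    [ -r "$1" ] && [ -r "$2" ] || { echo "cannot read input files" >&2; return 2; }
    # md5sum prints "<hash>  <name>"; keep only the hash field.
    orig_sum=$(md5sum < "$1" | cut -d' ' -f1)
    copy_sum=$(md5sum < "$2" | cut -d' ' -f1)
    if [ "$orig_sum" = "$copy_sum" ]; then
        echo "OK       $2"
    else
        echo "CORRUPT  $2"
        return 1
    fi
}
```

With the files from this comment, `same_md5 test.bin.orig test.bin.s2m` would flag the copy as corrupt, because its checksum differs from the original's.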

Mika Fischer (zoop) wrote on 2007-02-20:

#25

I switched the NIC with another one of the same type and put it into another PCI slot. Didn't change anything...

Then I let memtest86+ run and it also didn't detect anything.

I then lowered the bus clock frequency from 133 to 100 MHz. Also no effect.

So I'm quite stuck here. The only thing I can say is that it's not openssh related...

If you have any ideas what else I could try, please let me know...

Brian Murray (brian-murray) on 2007-02-20

Changed in openssh:
assignee:	brian-murray → nobody
status:	Needs Info → Confirmed

Mika Fischer (zoop) wrote on 2007-02-20:

#26

OK. I also tried only using one memory module at a time. Still no luck.

I then tried to rule out the obvious by connecting the computers using a crossover-cable. No change...

I then tried transmitting a file consisting only of zero bytes. This surprisingly worked.

I then discovered vbindiff and used that to check what exactly had changed in the corrupted file.

The corruptions occur at different places each time. But it's always complete words (I'm using 32 bit Ubuntu) that get corrupted. In total it's about 4-6 corruptions in a 50MB file.

Is there another kernel that I could try without messing the system up too much?

Or anything else I could try?
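
As a command-line alternative to vbindiff, the damaged words can be listed with cmp and awk. The sketch below is illustrative only: diff_words is a hypothetical helper name, it assumes both files have the same length, and it groups differing bytes into 4-byte-aligned words to match the 32-bit observation above.

```shell
#!/bin/sh
# Print the offset (zero-based, hexadecimal) of every 4-byte-aligned word
# in which two files differ. diff_words is a made-up name for illustration.
# Usage: diff_words ORIGINAL COPY
diff_words() {
    # cmp -l prints one line per differing byte:
    # the 1-based byte number, then both byte values in octal.
    cmp -l "$1" "$2" 2>/dev/null |
        awk '{
            w = int(($1 - 1) / 4) * 4      # start offset of the enclosing word
            if (!(w in seen)) {            # report each word only once
                seen[w] = 1
                printf "0x%08x\n", w
            }
        }'
}
```

Running `diff_words test.bin.orig test.bin.s2m` after several transfers would show whether the damaged offsets follow any pattern, for example a fixed alignment or spacing.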

Mika Fischer (zoop) wrote on 2007-02-20:

#27

The kernel used is actually 2.6.17-10-generic version 2.6.17.1-10.34 from edgy

David Sedeño Fernandez (david-alderia) wrote on 2007-03-05:

#28

Hi,

I have the same issue with a dapper server. In my case it happens not only with scp to another machine, but sometimes also just after logging in via ssh.

On the same switch there are other Ubuntu dapper servers without the problem, so I think it is a driver issue. The machine has two identical NICs and the same issue occurs on both.

The NICs are Intel 82546EB Gigabit Ethernet in a Dell PowerEdge 650 and the driver is e1000.

Brian Murray (brian-murray) wrote on 2007-03-05:

#29

Mika - I don't easily see what network driver you are using. Is it the e1000 also? Thanks in advance.

Mika Fischer (zoop) wrote on 2007-03-05:

#30

I'm afraid it's a different card: 3Com PCI 3c905C Tornado

Mika Fischer (zoop) wrote on 2007-03-05:

#31

Oh, and the driver is 3c59x.

stefanolodi (slodi) wrote on 2007-05-17:

#32

One of the hosts I manage has been affected by the problem discussed in this thread for months now. It occurs both when transferring files with scp and during ssh sessions, but at a seemingly random time after the start of the transfer or session. It's a very annoying problem: some days it is virtually impossible to stay connected for more than a few minutes.

OS:

Linux myhost 2.6.18-4-k7 #1 SMP Mon Mar 26 17:57:15 UTC 2007 i686 GNU/Linux

As to NIC hardware

> dmesg | grep 3C
0000:00:0b.0: 3Com PCI 3c905B Cyclone 100baseTx at f081c000.

> lsmod |grep 3c5
3c59x 40808 0
mii 5696 1 3c59x

If anyone could suggest a test I'd be happy to carry it out. (I have not tried netcat yet; asap I will).

Pablo Noguera Crespo (pnoguera) wrote on 2007-05-30:

#33

I had the same problem with my LOM NIC, a Marvell 88E8001. I'm quite sure it was a hardware/driver issue, but in my case I solved it by disabling checksum offload on the NIC with the following command:

ethtool -K eth0 rx off tx off

I hope this could help

Mika Fischer (zoop) wrote on 2007-05-30:

#34

Unfortunately my card does not support this :(

$ sudo ethtool -K eth0 tx off
Cannot set device tx csum settings: Operation not supported
$ sudo ethtool -K eth0 rx off
Cannot set device rx csum settings: Operation not supported

Brian Murray (brian-murray) wrote on 2007-12-12:

#35

I am assigning this bug to the 'ubuntu-kernel-team' per their bug policy. For future reference you can learn more about their bug policy at https://wiki.ubuntu.com/KernelTeamBugPolicies .

Changed in linux-source-2.6.17:
assignee:	nobody → ubuntu-kernel-team

Johan Christiansen (johandc) wrote on 2008-02-28:

#36

I have this same problem on a Hardy laptop running:

02:00.0 Ethernet controller: Broadcom Corporation NetXtreme BCM5751M Gigabit Ethernet PCI Express (rev 11)
Subsystem: IBM Unknown device 0577
Flags: bus master, fast devsel, latency 0, IRQ 17
Memory at a0100000 (64-bit, non-prefetchable) [size=64K]
Expansion ROM at <ignored> [disabled]
Capabilities: [48] Power Management version 2
Capabilities: [50] Vital Product Data
Capabilities: [58] Message Signalled Interrupts: Mask- 64bit+ Queue=0/3 Enable-
Capabilities: [d0] Express Endpoint IRQ 0

Using:

johan@johan-laptop:~$ sudo ethtool -i eth0
driver: tg3
version: 3.86
firmware-version: 5751m-v3.40a
bus-info: 0000:02:00.0

with:

johan@johan-laptop:~$ sudo ethtool -k eth0
Offload parameters for eth0:
rx-checksumming: on
tx-checksumming: on
scatter-gather: on
tcp segmentation offload: on
udp fragmentation offload: off
generic segmentation offload: off

The problem only exists when I'm using my Broadcom interface; when I use the built-in wireless interface everything seems okay. I'll try turning off some of the offloading and report back if it works.

Johan Christiansen (johandc) wrote on 2008-02-28:

#37

I can confirm that turning offload parameters to "off" solves the issue. What might be wrong here?
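
Before switching features off, it can help to see which offloads the driver currently reports as enabled. The following sketch filters `ethtool -k` output such as the listing in comment #36. The name enabled_offloads is made up for illustration, and the parsing assumes the `name: on` / `name: off` line format shown there.

```shell
#!/bin/sh
# Read `ethtool -k` output on stdin and print the name of every feature
# that is reported as "on". enabled_offloads is a hypothetical helper name.
# Usage: ethtool -k eth0 | enabled_offloads
enabled_offloads() {
    # Split each line at ": "; the header line has no value field and is skipped.
    awk -F': ' '$2 ~ /^on/ {
        sub(/^[ \t]+/, "", $1)   # drop any indentation before the feature name
        print $1
    }'
}
```

A typical use would be `sudo ethtool -k eth0 | enabled_offloads`, followed by the workaround already posted in this thread (`sudo ethtool -K eth0 rx off tx off`) and a second run to confirm that the checksumming entries are gone from the list.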

Leann Ogasawara (leannogasawara) on 2008-03-28

Changed in linux:
assignee:	nobody → ubuntu-kernel-team
importance:	Undecided → Medium
status:	New → Triaged

Sergio Zanchetta (primes2h) wrote on 2008-07-14:

#38

The 18 month support period for Edgy Eft 6.10 has reached its end of life. As a result, we are closing the linux-source-2.6.17 Edgy Eft kernel task. However, please note that this report will remain open against the actively developed kernel. Thank you for your continued support and help as we debug this issue.

Changed in linux-source-2.6.17:
status:	Confirmed → Invalid

Leann Ogasawara (leannogasawara) wrote on 2008-08-28:

#39

The Ubuntu Kernel Team is planning to move to the 2.6.27 kernel for the upcoming Intrepid Ibex 8.10 release. As a result, the kernel team would appreciate it if you could please test this newer 2.6.27 Ubuntu kernel. There are two ways you should be able to test:

1) If you are comfortable installing packages on your own, the linux-image-2.6.27-* package is currently available for you to install and test.

--or--

2) The upcoming Alpha5 for Intrepid Ibex 8.10 will contain this newer 2.6.27 Ubuntu kernel. Alpha5 is set to be released Thursday Sept 4. Please watch http://www.ubuntu.com/testing for Alpha5 to be announced. You should then be able to test via a LiveCD.

Please let us know immediately if this newer 2.6.27 kernel resolves the bug reported here or if the issue remains. More importantly, please open a new bug report for each new bug/regression introduced by the 2.6.27 kernel and tag the bug report with 'linux-2.6.27'. Also, please specifically note if the issue does or does not appear in the 2.6.26 kernel. Thanks again, we really appreciate your help and feedback.

Johan Christiansen (johandc) wrote on 2008-09-08:

#40

This bug is still present in Intrepid Ibex.

running:
sudo ethtool -K eth0 rx off tx off

fixes the problem.

Andreas Heinchen (andreas-heinchen) wrote on 2008-10-29:

#41

I am running Ubuntu 8.04 here at the moment. And I have this "Corrupted MAC on input" issue here too when using ssh. I even had it when ssh'ing to the same machine. I did

> ssh -X otheruser@localhost

and the ssh connection over the loop back device also got disconnected due to the MAC issue. Hope this helps to pin down the source of the problem.

henrikkirk (henrik-busywait) wrote on 2008-11-19:

#42

Running sudo ethtool -K eth0 rx off tx off only gives me an error

henrik@qui-gon:~$ sudo ethtool -K eth0 rx off tx off
Cannot set device rx csum settings: Operation not supported

I'm not sure what this does exactly, so I'm sorry I can't give any more details.

Upgraded today:
henrik@qui-gon:~$ ssh -V
OpenSSH_5.1p1 Debian-3ubuntu1, OpenSSL 0.9.8g 19 Oct 2007
henrik@qui-gon:~$ uname -r
2.6.27-7-generic

It still gives the same problems as recorded above. When doing the transfer to localhost instead of a different machine, it works nice and smooth.

henrik@qui-gon:~$ scp -rv local_music/* obi:/home/henrik/torrent/files/
Executing: program /usr/bin/ssh host obi, user (unspecified), command scp -v -r -d -t /home/henrik/torrent/files/
OpenSSH_5.1p1 Debian-3ubuntu1, OpenSSL 0.9.8g 19 Oct 2007
debug1: Reading configuration data /etc/ssh/ssh_config
debug1: Applying options for *
debug1: Connecting to obi [XX.XXX.XX.XXX] port 22.
debug1: Connection established.
debug1: identity file /home/henrik/.ssh/identity type -1
debug1: identity file /home/henrik/.ssh/id_rsa type -1
debug1: identity file /home/henrik/.ssh/id_dsa type 2
debug1: Checking blacklist file /usr/share/ssh/blacklist.DSA-1024
debug1: Checking blacklist file /etc/ssh/blacklist.DSA-1024
debug1: Remote protocol version 2.0, remote software version OpenSSH_4.3p2 Debian-9etch3
debug1: match: OpenSSH_4.3p2 Debian-9etch3 pat OpenSSH*
debug1: Enabling compatibility mode for protocol 2.0
debug1: Local version string SSH-2.0-OpenSSH_5.1p1 Debian-3ubuntu1
debug1: SSH2_MSG_KEXINIT sent
debug1: SSH2_MSG_KEXINIT received
debug1: kex: server->client aes128-cbc hmac-md5 none
debug1: kex: client->server aes128-cbc hmac-md5 none
debug1: SSH2_MSG_KEX_DH_GEX_REQUEST(1024<1024<8192) sent
debug1: expecting SSH2_MSG_KEX_DH_GEX_GROUP
debug1: SSH2_MSG_KEX_DH_GEX_INIT sent
debug1: expecting SSH2_MSG_KEX_DH_GEX_REPLY
debug1: Host 'obi' is known and matches the RSA host key.
debug1: Found key in /home/henrik/.ssh/known_hosts:14
debug1: ssh_rsa_verify: signature correct
debug1: SSH2_MSG_NEWKEYS sent
debug1: expecting SSH2_MSG_NEWKEYS
debug1: SSH2_MSG_NEWKEYS received
debug1: SSH2_MSG_SERVICE_REQUEST sent
debug1: SSH2_MSG_SERVICE_ACCEPT received
debug1: Authentications that can continue: publickey,password
debug1: Next authentication method: publickey
debug1: Trying private key: /home/henrik/.ssh/identity
debug1: Trying private key: /home/henrik/.ssh/id_rsa
debug1: Offering public key: /home/henrik/.ssh/id_dsa
debug1: Server accepts key: pkalg ssh-dss blen 435
debug1: read PEM private key done: type DSA
debug1: Authentication succeeded (publickey).
debug1: channel 0: new [client-session]
debug1: Requesting no-more-sessions@openssh.com
debug1: Entering interactive session.
debug1: Sending environment.
debug1: Sending env LANG = en_DK.utf8
debug1: Sending command: scp -v -r -d -t /home/henrik/torrent/files/
Sending file modes: C0644 188109766 Beatnik Beats 11.15.08.mp3
Sink: C0644 188109766 Beatnik Beats 11.15.08.mp3
Beatnik Beats 11.15.08.mp3 7% 13MB 851.8KB/s 03:19 ETAReceived disconnect from XX.XXX.XX.XXX: 2: Corrupted MAC on input.
lost connection

Hope this helps.

Best regards
/Henrik Kirk

Brian C (brianwc) wrote on 2008-12-17:

#43

I get this problem when running rdiff-backup (which uses ssh) between two machines both running Debian Lenny, with kernel 2.6.26-1-amd64 #1 SMP Wed Nov 26 18:26:02 UTC 2008 x86_64 GNU/Linux. The server machine has a Macronix ethernet device using the tulip driver. I also can solve the problem by doing ethtool -K eth0 rx off tx off although I don't know what that does and whether I should be worried about turning that off.

Anyway, whatever the larger issue is, it occurs on both Ubuntu and Debian.

Launchpad Janitor (janitor) wrote on 2008-12-23: Kernel team bugs

#44

Per a decision made by the Ubuntu Kernel Team, bugs will no longer be assigned to the ubuntu-kernel-team in Launchpad as part of the bug triage process. The ubuntu-kernel-team is being unassigned from this bug report. Refer to https://wiki.ubuntu.com/KernelTeamBugPolicies for more information. Thanks.

Vladimír Lapáček (vil) wrote on 2009-01-28:

#45

I get the same problem: "Corrupted MAC on input" when running on Intrepid on Lenovo Ideapad S10e connected via ethernet.

tuxo (beat-fasel) wrote on 2009-04-06:

#46

I have the same problem on Jaunty Jackalope Beta 9.04 with the following network card:
Ethernet controller: Attansic Technology Corp. L1e Gigabit Ethernet Adapter (rev b0).

I stumbled upon this error while doing a large file transfer using scp.

mindfuck (mindfuck) wrote on 2009-04-07:

#47

I can also confirm this bug when running Intrepid on a Lenovo Ideapad S10e and trying to move files to another computer via ssh over the ethernet interface. The command "sudo ethtool -K eth0 rx off tx off" fixes the problem.

Vladimír Lapáček (vil) wrote on 2009-04-08: Re: [Bug 60764] Re: Large file transfer gives error: Corrupted MAC on input

#48

Great finding. I can confirm that the command from gloawu fixes the problem.
My curiosity drives me to ask how did you find this out?

Is there possibly anything that we can do to get this fixed in the upstream?

Thanks.

On Tue, Apr 7, 2009 at 7:15 PM, gloawu <email address hidden> wrote:

> I can also confirm this bug when running Intrepid on an Lenovo Ideapad
> S10e and trying to move files to another computer via ssh over the
> ethernet interface. The command "sudo ethtool -K eth0 rx off tx off"
> fixes the problem.

mindfuck (mindfuck) wrote on 2009-04-08:

#49

Unfortunately I didn't figure it out myself. The command was posted here by Johan Christiansen on 2008-09-08. Maybe he can go into more detail on this.
You're welcome.

Johan Christiansen (johandc) wrote on 2009-04-08:

#50

Yes, this is indeed a very embarrassing driver problem, where the hardware TCP offloading in the driver seems to corrupt frames that use SSL. The bug was reported over 2½ years ago, and nobody seems to have what it takes to get it fixed, or the guts to increase the priority so it will reach the right people.

This affects many PCs; I therefore think that TCP offloading should be disabled by default until the bug gets fixed in the kernel.

Manoj Iyer (manjo) on 2009-04-21

tags:

added: ct-rev

Dan Kegel (dank) wrote on 2009-06-24:

#51

I'm seeing this in Jaunty on a Lenovo laptop. lspci says
02:00.0 Ethernet controller: Broadcom Corporation NetLink BCM5906M Fast Ethernet PCI Express (rev 02)
The symptoms are so severe that running google-chrome over ssh dies within a few seconds.
Happily, the workaround "sudo ethtool -K eth0 rx off tx off" seems to work.

Angelo Corsaro (angelo-corsaro) wrote on 2009-07-03:

#52

I solved the problem by replacing the mini PCIe card in my Acer AspireOne with an Intel 3945ABG. Yesterday, for the first time and _without_ errors, I transferred a 1.7 GB file. I'm starting to think that this bug is being left open intentionally. I agree with Johan Christiansen: two and a half years is enough to solve this problem, which does not affect the Window$ environment... mmmhhh, strange ;-)
I tried to apply the latest workaround (TCP offloading), but without success. My old card was an Atheros 5007.

SunBlade (septimus-severus) wrote on 2009-07-21:

#53

I am experiencing the same problem with one of my systems.
I am using a rather old laptop (Pentium MMX-233) with Debian Lenny as a download server.
Originally this laptop only had a 10 Mbit D-Link PCMCIA NIC. With this card there were no problems at all.
Since I replaced it with a (rather slow) Realtek 100 Mbit NIC, the problem arises sporadically, but seldom, and only during transfers of the downloaded data to my bigger machines at full speed. The problem arises during SSH transfers as well as during NFS transfers. While downloading from the Internet at speeds of at most 200 kB/s the problem never shows.

Usually I start the ssh client on the laptop. For NFS the laptop exports its inbound directories to the other machines. This bug can be reproduced transferring data to any of my other machines (2 PCs with Debian Lenny/Squeeze and a few SUNs).

When I tried to replace the Realtek NIC with a really fast 3COM 3CXFE575, the problem got worse.
I could still log in remotely, but transfers would fail.

I suppose a buffering problem, i.e. a ring-buffer overflow in the kernel-code.
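
One way to look for evidence of buffer overruns is to read the per-interface error counters that Linux exposes under /sys/class/net. This is a hedged sketch, not a confirmed diagnosis: iface_errors is a hypothetical helper name, the choice of counters is an assumption, and corruption that slips past the checksums would not necessarily show up in any of them.

```shell
#!/bin/sh
# Print a few error counters for a network interface from sysfs (Linux only).
# iface_errors is a made-up helper name; the counter selection is an assumption.
# Usage: iface_errors eth0
iface_errors() {
    dir="/sys/class/net/$1/statistics"
    [ -d "$dir" ] || { echo "no such interface: $1" >&2; return 1; }
    for counter in rx_errors rx_crc_errors rx_fifo_errors rx_dropped tx_errors; do
        # Each counter is a small text file holding one decimal number.
        printf '%s %s\n' "$counter" "$(cat "$dir/$counter")"
    done
}
```

Comparing the output of `iface_errors eth0` before and after a failed transfer would show whether any counter moved. A rising rx_fifo_errors value would be consistent with the overflow theory, while unchanged counters would point elsewhere.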

burianek (burianek) wrote on 2009-08-19:

#54

workaround script /etc/network/if-up.d/broadfix (83 bytes, text/plain)

I made this automatic workaround for my friend with a Lenovo S10. It's based on altering eth with ethtool.

Place this script in
/etc/network/if-up.d/broadfix
Make it executable
sudo chmod +x /etc/network/if-up.d/broadfix
Restart

---
You may want to specify which eth you want to alter.
eth0 and eth1 are default in condition
   ...
   if [[ "$IFACE" == eth[01] ]]; then
   ...
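
For readers who cannot open the attachment, a hook of this kind might look roughly like the sketch below. It is not the attached script, whose contents are not reproduced in this thread. The helper name wants_fix and the use of a POSIX case statement instead of the bash [[ ]] test are choices made for illustration. It assumes ifupdown exports the interface name in IFACE and that ethtool is on the PATH.

```shell
#!/bin/sh
# Hypothetical /etc/network/if-up.d/ hook (NOT the attached broadfix script).
# ifupdown runs every executable in if-up.d after an interface comes up and
# passes the interface name in the IFACE environment variable.

# Return success only for the interfaces that should get the workaround.
wants_fix() {
    case "$1" in
        eth0|eth1) return 0 ;;
        *) return 1 ;;
    esac
}

if wants_fix "${IFACE:-}"; then
    # Disable RX/TX checksum offload. Some drivers answer
    # "Operation not supported", so a failure here is deliberately ignored.
    ethtool -K "$IFACE" rx off tx off || true
fi
```

To cover a different set of interfaces, only the pattern list in wants_fix needs to change. As noted elsewhere in this thread, the ethtool command itself fails on some cards, in which case the hook has no effect.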

drew einhorn (drew-einhorn) wrote on 2009-09-26:

#56

Hmm. I'm seeing this problem with scp of large files to a jaunty box.

[ 0.000000] Linux version 2.6.28-15-generic (buildd@palmer) (gcc version 4.3.3 (Ubuntu 4.3.3-5ubuntu4) ) #49-Ubuntu SMP Tue Aug 18 18:40:08 UTC 2009 (Ubuntu 2.6.28-15.49-generic)

Unfortunately turning off checksumming does not work.

drew@test:~$ sudo ethtool -K eth0 tx off
Cannot set device tx csum settings: Operation not supported
drew@test:~$ sudo ethtool -K eth0 rx off
Cannot set device rx csum settings: Operation not supported
drew@test:~$

Could it be that the kernel is hardcoded to use hardware checksumming,
and turning it off is not an option?

Sebastian Thürrschmidt (thuerrschmidt) wrote on 2009-10-23:

#57

This is still an issue in Karmic (RC). On my Lenovo S10e I'm still getting errors when copying large files to my Jaunty fileserver via the machine's LAN interface using SSH(FS). Fortunately the workaround suggested by burianek on 2009-08-19 helps in Karmic too. (BTW, a restart is not required after creating the script in /etc/network/if-up.d/; simply re-plugging the network cable will do.)

Lloyd (lloyd-reijers) wrote on 2009-11-02:

#58

I can confirm that this issue still exists in Karmic (9.10) [actual, not RC] on a lenovo S10e

Fortunately for me the fix suggested by Johan Christiansen on 2008-09-08 works on this hardware.

chckcc (t-steenkamp) on 2009-11-18

Changed in linux (Ubuntu):
status:	Triaged → Confirmed

Sergio Zanchetta (primes2h) wrote on 2009-11-18:

#59

Please don't change status if you don't know what you are doing.
https://wiki.ubuntu.com/Bugs/Status
Thank you.

Changed in linux (Ubuntu):
status:	Confirmed → Triaged

Tero Jänkä (graytron) wrote on 2009-11-21:

#60

I can confirm this "Corrupted MAC on input" bug on i386 desktop version of Ubuntu 9.10 karmic. Silent file corruption also happens when downloading files using HTTP or FTP protocols. This bug is easily reproducible.

The computer on which this bug manifests itself is an Asus Eee PC 1000HE with 2 GB of RAM and an Atheros AR8121/AR8113/AR8114 PCI-E Ethernet Controller (1969:1026). I tried upgrading the 1000HE AMI BIOS from version 0607 to 1002, but that didn't help.

- $ uname -a
Linux eeepc 2.6.31-15-generic #50-Ubuntu SMP Tue Nov 10 14:54:29 UTC 2009 i686 GNU/Linux

- $ sudo lspci -vvvn
03:00.0 0200: 1969:1026 (rev b0)
        Subsystem: 1043:8324
        Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx+
        Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
        Latency: 0, Cache Line Size: 32 bytes
        Interrupt: pin A routed to IRQ 27
        Region 0: Memory at fbfc0000 (64-bit, non-prefetchable) [size=256K]
        Region 2: I/O ports at ec00 [size=128]
        Capabilities: [40] Power Management version 2
                Flags: PMEClk- DSI- D1- D2- AuxCurrent=0mA PME(D0-,D1-,D2-,D3hot+,D3cold+)
                Status: D0 PME-Enable- DSel=0 DScale=0 PME-
        Capabilities: [48] Message Signalled Interrupts: Mask- 64bit+ Queue=0/0 Enable+
                Address: 00000000fee0300c Data: 417a
        Capabilities: [58] Express (v1) Endpoint, MSI 00
                DevCap: MaxPayload 4096 bytes, PhantFunc 0, Latency L0s <4us, L1 unlimited
                        ExtTag- AttnBtn+ AttnInd+ PwrInd+ RBE- FLReset-
                DevCtl: Report errors: Correctable- Non-Fatal- Fatal- Unsupported-
                        RlxdOrd- ExtTag- PhantFunc- AuxPwr- NoSnoop-
                        MaxPayload 128 bytes, MaxReadReq 512 bytes
                DevSta: CorrErr- UncorrErr- FatalErr- UnsuppReq- AuxPwr+ TransPend-
                LnkCap: Port #0, Speed 2.5GT/s, Width x1, ASPM L0s, Latency L0 unlimited, L1 unlimited
                        ClockPM- Suprise- LLActRep- BwNot-
                LnkCtl: ASPM Disabled; RCB 64 bytes Disabled- Retrain- CommClk+
                        ExtSynch- ClockPM- AutWidDis- BWInt- AutBWInt-
                LnkSta: Speed 2.5GT/s, Width x1, TrErr- Train- SlotClk+ DLActive- BWMgmt- ABWMgmt-
        Capabilities: [6c] Vital Product Data <?>
        Capabilities: [100] Advanced Error Reporting <?>
        Capabilities: [180] Device Serial Number <EDITED OUT>
        Kernel driver in use: ATL1E
        Kernel modules: atl1e

- $ sudo ethtool -i eth0
driver: ATL1E
version: 1.0.0.7-NAPI
firmware-version: L1e
bus-info: 0000:03:00.0

- $ sudo ethtool eth0
Settings for eth0:
        Supported ports: [ TP ]
        Supported link modes: 10baseT/Half 10baseT/Full
                                100baseT/Half 100baseT/Full
        Supports auto-negotiation: Yes
        Advertised link modes: 10baseT/Half 10baseT/Full
                                100baseT/Half 100baseT/Full
        Advertised auto-negotiation: Yes
        Speed: 100Mb/s
        Duplex: Full
        Port: Twisted Pair
        PHYAD: 0
        Tran...

Affects                        Status     Importance  Assigned to
linux (Ubuntu)                 Won't Fix  Medium      Unassigned
linux-source-2.6.17 (Ubuntu)   Invalid    Undecided   Unassigned
Both tasks were nominated for Karmic by Dan Kegel.
