be2net driver used by HP BL460c bridge networking not working

Bug #1013199 reported by Pierre Amadio
This bug affects 4 people
Affects: linux (Ubuntu)
Status: Invalid
Importance: High
Assigned to: Unassigned
Milestone: (none)

Bug Description

When using bridge networking under Ubuntu 12.04 LTS on HP C-Class blades (BL460c G7), which use the be2net driver, networking doesn't work.
Networking without bridging (assigning an address directly to, say, eth0) does work.

To recreate

1. Configure eth0 with an address
2. Test ping to other server on same network
3. Works

Now take down the interface and configure bridging.
1. /etc/network/interfaces

# Bridge between eth0 and eth1
auto br0
iface br0 inet static
address 192.168.1.10
netmask 255.255.255.0
network 192.168.1.0
gateway 192.168.1.1
pre-up ip link set eth0 down
pre-up brctl addbr br0
pre-up brctl addif br0 eth0
pre-up ip addr flush dev eth0
post-down ip link set eth0 down
post-down ip link set br0 down
post-down brctl delif br0 eth0
post-down brctl delbr br0
2. Bring br0 up
3. Ping other server on network
4. Doesn't work

3.2.0-24-generic #39-Ubuntu SMP Mon May 21 16:52:17 UTC 2012 x86_64 x86_64 x86_64 GNU/Linux

02:00.0 Ethernet controller: Emulex Corporation OneConnect 10Gb NIC (be3) (rev 01)
02:00.1 Ethernet controller: Emulex Corporation OneConnect 10Gb NIC (be3) (rev 01)
02:00.2 Ethernet controller: Emulex Corporation OneConnect 10Gb NIC (be3) (rev 01)
02:00.3 Ethernet controller: Emulex Corporation OneConnect 10Gb NIC (be3) (rev 01)
02:00.4 Ethernet controller: Emulex Corporation OneConnect 10Gb NIC (be3) (rev 01)
02:00.5 Ethernet controller: Emulex Corporation OneConnect 10Gb NIC (be3) (rev 01)
02:00.6 Ethernet controller: Emulex Corporation OneConnect 10Gb NIC (be3) (rev 01)
02:00.7 Ethernet controller: Emulex Corporation OneConnect 10Gb NIC (be3) (rev 01)

lsmod | grep be2net
be2net 78296 0

ProblemType: Bug
DistroRelease: Ubuntu 12.04
Package: linux-image (not installed)
ProcVersionSignature: Ubuntu 3.2.0-24.39-generic 3.2.16
Uname: Linux 3.2.0-24-generic x86_64
AlsaVersion: Advanced Linux Sound Architecture Driver Version 1.0.24.
ApportVersion: 2.0.1-0ubuntu8
Architecture: amd64
ArecordDevices:
 **** List of CAPTURE Hardware Devices ****
 card 0: PCH [HDA Intel PCH], device 0: CONEXANT Analog [CONEXANT Analog]
   Subdevices: 1/1
   Subdevice #0: subdevice #0
AudioDevicesInUse:
 USER PID ACCESS COMMAND
 /dev/snd/controlC0: melmoth 2460 F.... pulseaudio
CRDA:
 country GB:
  (2402 - 2482 @ 40), (N/A, 20)
  (5170 - 5250 @ 40), (N/A, 20)
  (5250 - 5330 @ 40), (N/A, 20), DFS
  (5490 - 5710 @ 40), (N/A, 27), DFS
Card0.Amixer.info:
 Card hw:0 'PCH'/'HDA Intel PCH at 0xf2520000 irq 47'
   Mixer name : 'Intel CougarPoint HDMI'
   Components : 'HDA:14f1506e,17aa21da,00100002 HDA:80862805,80860101,00100000'
   Controls : 26
   Simple ctrls : 8
Card29.Amixer.info:
 Card hw:29 'ThinkPadEC'/'ThinkPad Console Audio Control at EC reg 0x30, fw unknown'
   Mixer name : 'ThinkPad EC (unknown)'
   Components : ''
   Controls : 1
   Simple ctrls : 1
Card29.Amixer.values:
 Simple mixer control 'Console',0
   Capabilities: pswitch pswitch-joined penum
   Playback channels: Mono
   Mono: Playback [on]
Date: Thu Jun 14 16:31:39 2012
EcryptfsInUse: Yes
InstallationMedia: Ubuntu 11.10 "Oneiric Ocelot" - Release amd64 (20111012)
MachineType: LENOVO 4287CTO
ProcFB: 0 inteldrmfb
ProcKernelCmdLine: BOOT_IMAGE=/boot/vmlinuz-3.2.0-24-generic root=UUID=ffd0d87e-afc2-4ccf-b2ca-0e7f3dd164d3 ro quiet splash vt.handoff=7
RelatedPackageVersions:
 linux-restricted-modules-3.2.0-24-generic N/A
 linux-backports-modules-3.2.0-24-generic N/A
 linux-firmware 1.79
SourcePackage: linux
StagingDrivers: mei
UpgradeStatus: Upgraded to precise on 2012-05-14 (31 days ago)
dmi.bios.date: 11/01/2011
dmi.bios.vendor: LENOVO
dmi.bios.version: 8DET55WW (1.25 )
dmi.board.asset.tag: Not Available
dmi.board.name: 4287CTO
dmi.board.vendor: LENOVO
dmi.board.version: Not Available
dmi.chassis.asset.tag: No Asset Information
dmi.chassis.type: 10
dmi.chassis.vendor: LENOVO
dmi.chassis.version: Not Available
dmi.modalias: dmi:bvnLENOVO:bvr8DET55WW(1.25):bd11/01/2011:svnLENOVO:pn4287CTO:pvrThinkPadX220:rvnLENOVO:rn4287CTO:rvrNotAvailable:cvnLENOVO:ct10:cvrNotAvailable:
dmi.product.name: 4287CTO
dmi.product.version: ThinkPad X220
dmi.sys.vendor: LENOVO

Brad Figg (brad-figg)
Changed in linux (Ubuntu):
status: New → Confirmed
Joseph Salisbury (jsalisbury) wrote :

Would it be possible for you to test the latest upstream kernel? Refer to https://wiki.ubuntu.com/KernelMainlineBuilds . Please test the latest v3.5 kernel [0] (not a kernel in the daily directory) and install both the linux-image and linux-image-extra .deb packages.

Once you've tested the upstream kernel, please remove the 'needs-upstream-testing' tag (only that one tag; please leave the other tags). This can be done by clicking on the yellow pencil icon next to the tag located at the bottom of the bug description and deleting the 'needs-upstream-testing' text.

If this bug is fixed in the mainline kernel, please add the following tag 'kernel-fixed-upstream'.

If the mainline kernel does not fix this bug, please add the tag: 'kernel-bug-exists-upstream'.

If you are unable to test the mainline kernel, for example it will not boot, please add the tag: 'kernel-unable-to-test-upstream'.
Once testing of the upstream kernel is complete, please mark this bug as "Confirmed".

Thanks in advance.

[0] http://kernel.ubuntu.com/~kernel-ppa/mainline/v3.5-rc2-quantal/

Changed in linux (Ubuntu):
importance: Undecided → Medium
tags: added: needs-upstream-testing
tags: added: kernel-da-key
removed: needs-upstream-testing
Joseph Salisbury (jsalisbury) wrote :

@Pierre

Also, do you happen to know if there was a prior release that did not have this bug, such as Lucid? It would be very helpful to know what release introduced this bug, if it is in fact a regression.

tags: added: needs-upstream-testing
Joseph Salisbury (jsalisbury) wrote :

@Pierre,

One other question. I haven't tested all of your config settings to set up the bridge. I was just curious whether you tried this config on a system with a driver other than be2net, just to confirm the settings are all correct?

auto br0
iface br0 inet static
address 192.168.1.10
netmask 255.255.255.0
network 192.168.1.0
gateway 192.168.1.1
pre-up ip link set eth0 down
pre-up brctl addbr br0
pre-up brctl addif br0 eth0
pre-up ip addr flush dev eth0
post-down ip link set eth0 down
post-down ip link set br0 down
post-down brctl delif br0 eth0
post-down brctl delbr br0

Changed in linux (Ubuntu):
importance: Medium → High
tags: added: kernel-bug-exists-upstream
removed: needs-upstream-testing
Pierre Amadio (pierre-amadio) wrote :

- #4 same behaviour with a v3.5 kernel
- #5 the problem was discovered after installing OpenStack on Precise (compute nodes build a bridge for the VM private network, and this network was not usable). Bridging has not been tried out on a previous release such as Lucid.
- #6 currently it does not seem possible to perform a test with another driver, as all the machines involved come with the same hardware.

Joseph Salisbury (jsalisbury) wrote :

This issue appears to be an upstream bug, since you tested the latest upstream kernel. Would it be possible for you to open an upstream bug report at bugzilla.kernel.org [1]? That will allow the upstream developers to examine the issue, and may provide a quicker resolution to the bug.

If you are comfortable with opening a bug upstream, it would be great if you could report back the upstream bug number in this bug report. That will allow us to link this bug to the upstream report.

[1] https://wiki.ubuntu.com/Bugs/Upstream/kernel

Joseph Salisbury (jsalisbury) wrote :

Also, it would be great if you could test the bridge settings on a prior release, such as Lucid. That will tell us if this is a regression, and help figure out when it was introduced.

Do you think you would be able to try some prior kernels to identify when the bug started happening?

Peter Matulis (petermatulis) wrote :

@Joe

A Lucid live session was attempted and it resulted in a worse failure: a normal interface could not be initialised.

tags: added: quantal
tags: added: kernel-key
tags: added: lucid
Joseph Salisbury (jsalisbury) wrote :

bug 717388 is similar. In that bug, the Maverick kernel did not have an issue. The Maverick, Natty, and Oneiric kernels will be tested to see if they exhibit this bug. If they do, we can perform a kernel bisect to find the commit that introduced the regression in Precise.

Stefan Bader (smb) wrote :

I have the feeling this is probably not related to the be2net hardware at all. I vaguely remember that at some point in the past the behaviour of a bridge interface changed so that it will not be up until one of its ports is up. When I follow the exact description of /etc/network/interfaces in the report on a VM, I get the same results (ping not working).
For my working bridge setups I also use the bridge extensions in e-n-i which I feel are simpler to handle (though the major difference there is that in the end eth0 is up together with the bridge):

auto eth0
iface eth0 inet manual

auto br0
iface br0 inet static
  address 192.168.1.2
  netmask 255.255.255.0
  network 192.168.1.0
  gateway 192.168.1.1
  bridge-ports eth0
  bridge-stp off
  bridge-fd 0
  bridge-maxwait 0

Somewhere I found notes about the MAC address of the bridge potentially changing to the lowest of the attached ports. I'm not sure this is really true, but at least adding a "post-up ip link set br0 address xx:xx:xx:xx:xx:xx" with the MAC address of eth0 never did any harm.
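
If the bridge does take the numerically lowest MAC address among its ports (an assumption here, not something confirmed in this report), the address it would end up with can be worked out with a small Python sketch. The port names and MAC addresses below are made up for illustration only.

```python
def mac_to_int(mac: str) -> int:
    """Convert 'aa:bb:cc:dd:ee:ff' into an integer for numeric comparison."""
    octets = mac.split(":")
    if len(octets) != 6:
        raise ValueError(f"not a MAC address: {mac!r}")
    value = 0
    for octet in octets:
        value = (value << 8) | int(octet, 16)
    return value


def lowest_mac(port_macs: dict[str, str]) -> tuple[str, str]:
    """Return (port, mac) for the numerically lowest MAC among the ports."""
    port = min(port_macs, key=lambda name: mac_to_int(port_macs[name]))
    return port, port_macs[port]


# Hypothetical ports and addresses, for illustration only.
ports = {"eth0": "00:17:a4:77:00:10", "eth1": "00:17:a4:77:00:0e"}
print(lowest_mac(ports))
```

With these made-up values the second port wins, so pinning the bridge to the MAC of eth0 as suggested above would keep the address stable regardless of which ports are attached.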

Unrelated to this, with resolvconf in precise (12.04) it is advisable to slowly start moving to the new way of declaring dns. This is done by adding the following lines to the iface section of br0:

  dns-nameservers <ip> [<ip>]
  dns-search <domain>

Peter Matulis (petermatulis) wrote :

@Stefan

I'm accompanying an affected user and we are focusing on a manual method:

# ip link set dev eth0 down
# brctl addbr br0
# brctl addif br0 eth0
# ip link set dev eth0 up
# ip link set dev br0 up
# ip addr add 10.153.107.142/24 brd + dev br0
# ip route add default via 10.153.107.1

Observations using be2net:

1. Lucid cannot bring up eth0 nor br0.
2. Maverick, Natty, and Oneiric can bring up eth0 and br0 (Live CD; no updates).
3. Precise can bring up eth0 but not br0 (Live CD; no updates).
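
To make "can bring up" checkable, one option is to read the operational state that the kernel exposes in sysfs for the bridge and for each of its ports. The following is a minimal Python sketch, assuming the standard /sys/class/net layout (an `operstate` file per interface and a `brif` directory per bridge); the `sysfs_root` parameter is a hypothetical hook so the functions can be pointed at a test directory.

```python
from pathlib import Path


def operstate(iface: str, sysfs_root: str = "/sys/class/net") -> str:
    """Return the kernel's operational state for an interface, e.g. 'up' or 'down'."""
    return (Path(sysfs_root) / iface / "operstate").read_text().strip()


def bridge_ports(bridge: str, sysfs_root: str = "/sys/class/net") -> list[str]:
    """List the interfaces attached to a bridge (entries under <bridge>/brif)."""
    brif = Path(sysfs_root) / bridge / "brif"
    return sorted(p.name for p in brif.iterdir()) if brif.is_dir() else []


def bridge_report(bridge: str, sysfs_root: str = "/sys/class/net") -> dict[str, str]:
    """Map the bridge and each of its ports to their operstate."""
    names = [bridge] + bridge_ports(bridge, sysfs_root)
    return {name: operstate(name, sysfs_root) for name in names}


if __name__ == "__main__":
    # Assumes a bridge named br0, as in the commands above; skipped if absent.
    if Path("/sys/class/net/br0").exists():
        print(bridge_report("br0"))
```

Running this after each step of the manual method would show whether br0 only changes state once eth0 is up, which is the behaviour Stefan describes.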

Edward Bustos (edward-bustos) wrote :

@Stefan

Using the updated bridging instructions provided by Stefan, I was able to bring up br0 on Precise.

auto eth0
iface eth0 inet manual
auto br0
iface br0 inet static
  address 192.168.1.2
  netmask 255.255.255.0
  network 192.168.1.0
  gateway 192.168.1.1
  bridge-ports eth0
  bridge-stp off
  bridge-fd 0

Tested on a BL460c G7 with VC Flex-10 modules.

Nick Barcet (nijaba)
Changed in linux (Ubuntu):
status: Confirmed → Invalid
Kevin Jackson (kevin-linuxservices) wrote :

This isn't about "bringing up a bridge interface"; it's about traffic not flowing over the bridged interface.
When the interface is bridged, I cannot ping another machine on the network.
When I just use the actual interface, I can ping.

Kevin Jackson (kevin-linuxservices) wrote :

Repeating the steps with that configuration in the interfaces file does make bridged networking work, whereas it does not work when OpenStack manages the interface. I'll test whether we can create the interface manually like this so that OpenStack works.

Chet Burgess (cfb-n) wrote :

I'm not entirely sure this bug is invalid.

We recently upgraded some of our HP blades, and bridge networking stopped working with the 3.2.0-31 Ubuntu 12.04 LTS kernel and the be2net driver. I've tested a number of kernels and distros both with and without the be2net driver, and I can definitively say that the 3.2.0-31 kernel does not work.

I have confirmed that the following Ubuntu versions and kernels work on the same HP blades with the be2net driver.

Oneiric:
3.0.0-17-server #30-Ubuntu SMP
3.0.0-19-server #33-Ubuntu SMP
3.0.0-22-server #36-Ubuntu SMP
3.0.0-23-server #39-Ubuntu SMP
3.0.0-25-server #41-Ubuntu SMP

Precise:
3.2.0-30-generic #48-Ubuntu SMP

Additionally I have tested the following kernels with the bnx2 driver and can confirm that they work without issue.

Oneiric:
3.0.0-17-server #30-Ubuntu SMP
3.0.0-19-server #33-Ubuntu SMP

Precise:
3.2.0-25-generic #40-Ubuntu SMP
3.2.0-29-generic #46-Ubuntu SMP
3.2.0-30-generic #48-Ubuntu SMP
3.2.0-31-generic #50-Ubuntu SMP

The only combination that does not work is the be2net driver and the 3.2.0-31 kernel. I have tried the proposed "solution" of configuring the bridges via /etc/network/interfaces with no luck.

Further testing with tcpdump reveals that the underlying physical interface is receiving the packets but they do not appear to be making it to the bridge.

I am happy to provide any elements of the configuration as well as tcpdumps or anything else that might be of assistance.

Kodamati Pradeep Vinesh Reddy (pradeep-reddy2) wrote :

With the below mentioned configuration, bridging works on our setup with 3.2.0-31

auto eth0
iface eth0 inet manual

auto br0
iface br0 inet dhcp
        bridge_ports eth0
        bridge_fd 9
        bridge_hello 2
        bridge_maxwait 12
        bridge_stp off

Kernel is 3.2.0-31-generic as mentioned below.

root@ubuntu:~# uname -r
3.2.0-31-generic

If you can send us your configuration, maybe we can try it on our setup.

Chet Burgess (cfb-n) wrote :

Interesting, you have a few options set differently than I do. I will try your options shortly.

My configuration that isn't working is the following (just confirmed it again):

auto eth1
iface eth1 inet manual
 address 0.0.0.0

auto br0
iface br0 inet static
  address 10.42.1.28
  netmask 255.255.254.0
  network 10.42.4.0
  bridge-ports eth1
  bridge-stp off
  bridge-fd 0

root@XXX:/home/XXX$ uname -r
3.2.0-31-generic
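
As a side check on the configuration above, the `network` line can be compared with the value implied by `address` and `netmask`. This is a minimal Python sketch using the standard `ipaddress` module; the helper names are hypothetical, and whether a mismatched `network` line contributes to the failure is not established in this report.

```python
import ipaddress


def expected_network(address: str, netmask: str) -> str:
    """Return the network address implied by an address/netmask pair."""
    iface = ipaddress.ip_interface(f"{address}/{netmask}")
    return str(iface.network.network_address)


def network_line_ok(address: str, netmask: str, network: str) -> bool:
    """True if the configured 'network' value matches address and netmask."""
    return expected_network(address, netmask) == network


# Values taken from the br0 stanza above.
print(expected_network("10.42.1.28", "255.255.254.0"))
print(network_line_ok("10.42.1.28", "255.255.254.0", "10.42.4.0"))
```

For 10.42.1.28 with netmask 255.255.254.0 the implied network is 10.42.0.0, so the configured 10.42.4.0 does not match.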

Chet Burgess (cfb-n) wrote :

OK, trying the following had no effect; it is still not working.

auto eth1
iface eth1 inet manual
 address 0.0.0.0

auto br0
iface br0 inet static
  address 10.42.1.28
  netmask 255.255.254.0
  network 10.42.4.0
  bridge_ports eth1
  bridge_fd 9
  bridge_hello 2
  bridge_maxwait 12
  bridge_stp off

Kodamati Pradeep Vinesh Reddy (pradeep-reddy2) wrote :

Your configuration works on our setup without any issues.

auto eth0
iface eth0 inet manual
        address 0.0.0.0

auto br0
#iface br0 inet dhcp
iface br0 inet static
        address 172.16.17.40
        netmask 255.255.0.0
        network 172.16.0.1
        bridge_ports eth0
        bridge_fd 0
        bridge_stp off

root@ubuntu:~# uname -r
3.2.0-31-generic

root@ubuntu:~# ping 172.16.16.51
PING 172.16.16.51 (172.16.16.51) 56(84) bytes of data.
64 bytes from 172.16.16.51: icmp_req=1 ttl=255 time=0.613 ms
64 bytes from 172.16.16.51: icmp_req=2 ttl=255 time=0.378 ms
64 bytes from 172.16.16.51: icmp_req=3 ttl=255 time=0.409 ms
^C
--- 172.16.16.51 ping statistics ---
3 packets transmitted, 3 received, 0% packet loss, time 1998ms
rtt min/avg/max/mdev = 0.378/0.466/0.613/0.107 ms

Can you please send us your routing configuration also if you have any?

Chet Burgess (cfb-n) wrote :

Hmm so either something has changed or my testing before was not 100% accurate.

Using the configuration below, I am unable to ping anything. Additionally, I can't talk to any resources via the default gateway (I have confirmed I can do this without bridging in play). However, I just noticed that I can connect to the system via our admin network, which is sourced from 172.31.252.0/22 and has a different gateway from our default gateway.

Once again here is my current /etc/network/interfaces

# The loopback network interface
auto lo
iface lo inet loopback
    post-up /usr/sbin/service dnsmasq start

#auto eth0
#iface eth0 inet static
# address 10.42.1.28
# netmask 255.255.254.0

auto eth0
iface eth0 inet manual
        address 0.0.0.0

auto br0
iface br0 inet static
        address 10.42.1.28
        netmask 255.255.254.0
        network 10.42.0.0
        bridge_ports eth0
        bridge_fd 0
        bridge_stp off

#auto br0
#iface br0 inet static
# address 10.42.1.28
# netmask 255.255.254.0
# network 10.42.0.0
# bridge_ports eth0
# bridge_fd 9
# bridge_hello 2
# bridge_maxwait 12
# bridge_stp off

Here is the output of ip route:
default via 10.42.0.1 dev br0 src 10.42.1.28
10.42.0.0/23 dev br0 proto kernel scope link src 10.42.1.28
172.31.240.0/21 via 10.42.0.2 dev br0
172.31.252.0/22 via 10.42.0.2 dev br0

Chet Burgess (cfb-n) wrote :

It looks like this has something to do with iptables (even if there are no iptables rules).

While messing around with something else, I tried disabling bridge traffic from passing through iptables, and that seemed to fix the problem.

sysctl setting:
net.bridge.bridge-nf-call-iptables=0

Unfortunately this isn't really an option in all cases as we (and OpenStack) rely on iptables in a number of places.

Working configuration (in addition to above):
sysctl net.bridge.bridge-nf-call-iptables
net.bridge.bridge-nf-call-iptables = 0

iptables-save
# Generated by iptables-save v1.4.12 on Thu Oct 4 01:31:25 2012
*filter
:INPUT ACCEPT [623:48985]
:FORWARD ACCEPT [0:0]
:OUTPUT ACCEPT [860:94775]
COMMIT
# Completed on Thu Oct 4 01:31:25 2012

What is the value of net.bridge.bridge-nf-call-iptables in your environment?
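
For anyone comparing environments, the bridge netfilter settings can be collected in one go. The following is a minimal Python sketch, assuming the sysctls are exposed as files under /proc/sys/net/bridge (which is the case when the bridge netfilter code is loaded); the `proc_dir` parameter is a hypothetical hook for pointing the function at a test directory.

```python
from pathlib import Path

BRIDGE_NF_KEYS = (
    "bridge-nf-call-iptables",
    "bridge-nf-call-ip6tables",
    "bridge-nf-call-arptables",
)


def read_bridge_nf(proc_dir: str = "/proc/sys/net/bridge") -> dict[str, int]:
    """Return the bridge netfilter sysctls that exist under proc_dir."""
    values = {}
    for key in BRIDGE_NF_KEYS:
        path = Path(proc_dir) / key
        if path.is_file():
            values[key] = int(path.read_text().strip())
    return values


if __name__ == "__main__":
    # Prints nothing if the bridge sysctls are not present on this system.
    for key, value in read_bridge_nf().items():
        state = "passed to netfilter" if value else "bypasses netfilter"
        print(f"net.bridge.{key} = {value} ({state})")
```

A value of 1 for bridge-nf-call-iptables means bridged IPv4 traffic is handed to iptables; 0 is the working setting reported above.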

Kodamati Pradeep Vinesh Reddy (pradeep-reddy2) wrote :

There are no problems with the drivers, and it looks like you are facing configuration issues. Please contact the Canonical support team, who might be able to help you with the configuration issues.

Thanks.

Søren Laursen (sl-6) wrote :

Looks like a similar bug here, but on HP ProLiant DL380p Gen8:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1213887
