bonding balance-alb RTNL lock failure

Bug #85072 reported by Chris Boyle
Affects: linux-source-2.6.15 (Ubuntu)
Status: Invalid
Importance: Undecided
Assigned to: Unassigned

Bug Description

Binary package hint: linux-source-2.6.15

Found in version 2.6.15-27.50. When bonding (in balance-alb mode) tries to change interface MAC addresses (e.g. when one of the slave interfaces loses link), the kernel logs the following:

Feb 7 10:59:33 balvenie kernel: RTNL: assertion failed at net/ipv4/devinet.c (983)
Feb 7 10:59:33 balvenie kernel:
Feb 7 10:59:33 balvenie kernel: Call Trace: <IRQ> <ffffffff80331fe6>{inetdev_event+102} <ffffffff801401df>{notifier_call_chain+31}
Feb 7 10:59:33 balvenie kernel: <ffffffff802ecdc1>{dev_set_mac_address+97} <ffffffff8802e4a0>{:bonding:alb_set_slave_mac_addr+80}
Feb 7 10:59:33 balvenie kernel: <ffffffff8802e584>{:bonding:alb_swap_mac_addr+180} <ffffffff88026113>{:bonding:bond_mc_swap+259}
Feb 7 10:59:33 balvenie kernel: <ffffffff88026317>{:bonding:bond_change_active_slave+279}
Feb 7 10:59:33 balvenie kernel: <ffffffff880263ea>{:bonding:bond_select_active_slave+26}
Feb 7 10:59:33 balvenie kernel: <ffffffff88027796>{:bonding:bond_mii_monitor+1030} <ffffffff88027390>{:bonding:bond_mii_monitor+0}
Feb 7 10:59:33 balvenie kernel: <ffffffff8013be23>{run_timer_softirq+355} <ffffffff8013766b>{__do_softirq+107}
Feb 7 10:59:33 balvenie kernel: <ffffffff8010ef2b>{call_softirq+31} <ffffffff80110c41>{do_softirq+49}
Feb 7 10:59:33 balvenie kernel: <ffffffff8010e792>{apic_timer_interrupt+98} <EOI> <ffffffff8034c660>{thread_return+0}
Feb 7 10:59:33 balvenie kernel: <ffffffff8010bb1a>{default_idle+58} <ffffffff8010bd61>{cpu_idle+97}
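For readers following the trace, here is a minimal Python sketch (not part of the original report) that pulls the function names out of the log lines above and lists them innermost first. The regular expression and the helper name parse_frames are assumptions made for illustration, and the pattern is only known to fit the frame format shown in this report. The resulting list shows dev_set_mac_address being reached from bond_mii_monitor via run_timer_softirq, i.e. from timer softirq context rather than from a process that took the RTNL lock.

```python
import re

# One frame looks like <ffffffff8802e4a0>{:bonding:alb_set_slave_mac_addr+80}
# The optional ":module:" prefix marks functions that live in a loadable module.
# Markers such as <IRQ> and <EOI> are not hex addresses, so they are skipped.
FRAME_RE = re.compile(r"<[0-9a-f]{16}>\{(?::(\w+):)?(\w+)\+\d+\}")

# Trace lines copied from the report.
TRACE = """\
Feb 7 10:59:33 balvenie kernel: Call Trace: <IRQ> <ffffffff80331fe6>{inetdev_event+102} <ffffffff801401df>{notifier_call_chain+31}
Feb 7 10:59:33 balvenie kernel: <ffffffff802ecdc1>{dev_set_mac_address+97} <ffffffff8802e4a0>{:bonding:alb_set_slave_mac_addr+80}
Feb 7 10:59:33 balvenie kernel: <ffffffff8802e584>{:bonding:alb_swap_mac_addr+180} <ffffffff88026113>{:bonding:bond_mc_swap+259}
Feb 7 10:59:33 balvenie kernel: <ffffffff88026317>{:bonding:bond_change_active_slave+279}
Feb 7 10:59:33 balvenie kernel: <ffffffff880263ea>{:bonding:bond_select_active_slave+26}
Feb 7 10:59:33 balvenie kernel: <ffffffff88027796>{:bonding:bond_mii_monitor+1030} <ffffffff88027390>{:bonding:bond_mii_monitor+0}
Feb 7 10:59:33 balvenie kernel: <ffffffff8013be23>{run_timer_softirq+355} <ffffffff8013766b>{__do_softirq+107}
Feb 7 10:59:33 balvenie kernel: <ffffffff8010ef2b>{call_softirq+31} <ffffffff80110c41>{do_softirq+49}
Feb 7 10:59:33 balvenie kernel: <ffffffff8010e792>{apic_timer_interrupt+98} <EOI> <ffffffff8034c660>{thread_return+0}
Feb 7 10:59:33 balvenie kernel: <ffffffff8010bb1a>{default_idle+58} <ffffffff8010bd61>{cpu_idle+97}
"""


def parse_frames(log_text):
    """Return (module, function) pairs in log order, innermost frame first.

    Functions without a ":module:" prefix are labelled "kernel".
    """
    return [(m.group(1) or "kernel", m.group(2))
            for m in FRAME_RE.finditer(log_text)]


frames = parse_frames(TRACE)
for module, function in frames:
    print(f"{module:8s} {function}")
```

The same helper can be pointed at any saved syslog excerpt that uses this frame format, which makes it easier to compare traces before and after a backported patch.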

Patches to fix this have been discussed on the netdev list, in the thread http://<email address hidden>/msg29385.html , but there's no indication that they have been applied to any released kernel yet. When backported to the bonding version in 2.6.15, the patch causes a kernel panic due to worse locking failures that I can't quite fathom ("scheduling while atomic" and an endless stream of stack traces involving bond_mii_monitor).

As far as I can tell, bonding does go ahead and change the MAC address even though the RTNL assertion failed, but I'm worried that it might not always do so successfully under load.

Is it possible to use balance-alb reliably (without this lock failure) with a Dapper kernel?

Gareth Fitzworthington (mapping-gp-deactivatedaccount) wrote :

This bug has had no activity for a considerable period. This is a check to see if there is still interest in investigating this bug report.
Is this still an issue with later releases?

Changed in linux-source-2.6.15:
status: New → Incomplete
Gareth Fitzworthington (mapping-gp-deactivatedaccount) wrote :

We are closing this bug report because it lacks the information we need to investigate the problem, as described in the previous comments. Please reopen it if you can give us the missing information, and don't hesitate to submit bug reports in the future. To reopen the bug report you can click on the current status, under the Status column, and change the Status back to "New". Thanks again!

Changed in linux-source-2.6.15:
status: Incomplete → Invalid