Wifi disconnects/reconnects fairly often (from multiple times an hour to multiple times per day) because of ip-config-unavailable

Bug #993571 reported by Stéphane Graber
This bug affects 3 people
Affects | Status | Importance | Assigned to | Milestone
network-manager (Fedora) | Fix Released | Medium | |
network-manager (Ubuntu) | Fix Released | Medium | Mathieu Trudel-Lapierre |
Precise | Fix Released | Medium | Mathieu Trudel-Lapierre |

Bug Description

[Impact]
Affects IPv6 users on dual-stack or IPv6-only networks; causes frequent disconnects if a sufficient number of packets are missed (especially towards the end of the RDNSS/DNSSL entry lifetime). The patch is pretty self-contained and only affects IPv6, and only in the case of SLAAC (stateless address autoconfiguration), since DHCPv6-based networks provide DNS information in a different manner.

[Development Fix]
Package network-manager 0.9.4.0-0ubuntu5, which fixes all of the same bugs targeted by this SRU, including bug 993379 and bug 988183. Also tested in a PPA (ppa:mathieu-tl/nm) prior to upload to quantal and precise-proposed.

[Stable Fix]
A small patch based on the one provided for testing in the linked Red Hat bug. It is exactly the same patch (no changes required) as the one uploaded to Quantal after testing in a PPA, which was itself revised from the original (from the Red Hat bug and attached to this bug report). It adds a method for renewing/refreshing RDNSS and DNSSL data from Router Advertisements. At lifetime/2 a first Router Solicitation is sent to try to force an update; if no response is received, the same process (timeout/2) is applied again, sending another solicitation asking routers for an RA, until one is received and refreshes the RDNSS/DNSSL data or until the data expires.
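
For illustration only, the retry schedule described above could be sketched like this in Python. This is not the actual patch (which lives inside NetworkManager); the function name and the min_interval floor are assumptions, the latter added so that the halving loop terminates.

```python
def solicitation_offsets(lifetime, min_interval=1.0):
    """Return the times (seconds after the last RA) at which a Router
    Solicitation would be sent, assuming no RA ever arrives.

    Hypothetical helper: waits half of the remaining RDNSS/DNSSL lifetime,
    solicits, then halves again, until the next wait would drop below
    min_interval (an assumed floor, not taken from the patch).
    """
    offsets = []
    elapsed = 0.0
    remaining = float(lifetime)
    while remaining / 2 >= min_interval:
        wait = remaining / 2      # lifetime/2, then timeout/2, and so on
        elapsed += wait
        remaining -= wait
        offsets.append(elapsed)   # a solicitation goes out at this point
    return offsets


if __name__ == "__main__":
    print(solicitation_offsets(20))
```

With a 20-second lifetime and the assumed 1-second floor, solicitations would go out 10, 15, 17.5 and 18.75 seconds after the last RA. Any RA received in between would reset the schedule.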

[Test case]
Requires a working IPv6 setup: see below.
1) Connect to an IPv6 network that provides RDNSS data. (DNSSL uses the same procedure but is not available in current Precise kernels)
2) Observe whether the connection is stable.

[Regression Potential]
Only affects IPv6 SLAAC, which means IPv6 could be disabled to mitigate any issues encountered. Users could be affected by a (minimal) increase in the number of packets sent over the network due to the sending of Router Solicitation messages. On high-latency links this may cause issues. New RS messages may cause RAs giving new IPv6 addresses more quickly than anticipated.

---

Testing IPv6 RDNSS with radvd:
You can use a configuration similar to the following, on a router where the vlan2 interface would be the outside interface:

interface br0 {
    MinRtrAdvInterval 3;
    MaxRtrAdvInterval 10;
    AdvLinkMTU 1280;
    AdvSendAdvert on;
    prefix 0:0:0:1::/64 {
        AdvOnLink on;
        AdvAutonomous on;
        AdvValidLifetime 86400;
        AdvPreferredLifetime 86400;
        Base6to4Interface vlan2;
    };
    RDNSS 2001:503:ba3e::2:30 2001:500:2d::d {};
};

This particular setup uses 6to4 to provide IPv6 connectivity and announces a.root-servers.net and d.root-servers.net as the DNS nameservers to use.

With control of the router, one could kill the radvd daemon to simulate lost packets and observe the attempts to refresh RDNSS data, then bring the daemon back up to see how an RA refreshes the RDNSS information.

Packet captures are also useful to observe the behavior.

---
Original bug report description:

To start with, let me confirm that I do NOT have any message in my kernel log complaining about the kernel not being able to set the default IPv6 route, so that's a different bug from the one you're probably thinking about ;)

This one happens every few minutes or every few hours, as far as I can tell, only on wireless networks (for a yet unknown reason) and only on dual-stack networks.

I reproduced it on a variety of equipment (3 laptops, 2 with 2 different intel wireless chips, one with atheros) and on 4 different brands of access points. Only thing in common, the network configuration is almost identical.
That's a standard dual-stack setup with IPv4 provided over DHCP and IPv6 through radvd (SLAAC) with RDNSS set.

I'm attaching a debug log. Look for "ip-config-unavailable" to spot the few occurrences of the bug in it.

This most likely is the same bug as described in: https://bugzilla.redhat.com/show_bug.cgi?id=753482

Stéphane Graber (stgraber) wrote :

The assumption based on the redhat bug report is that rdnss data is sometimes expiring before the RA, triggering the bug.
I'll work around this at home by bumping the rdnss expiry to be 10 times longer than the RA and see if the bug still occurs throughout my network (30 or so clients in total, most of them on 12.04).
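
A rough way to see why a longer RDNSS lifetime helps is to count how many unsolicited RAs go out before the entry expires, assuming the worst case where the router always waits the full MaxRtrAdvInterval between them. The function name and the sample values are illustrative assumptions (whole seconds); the actual RDNSS lifetime on the affected network is not stated here. For reference, RFC 6106 recommends a lifetime between MaxRtrAdvInterval and 2*MaxRtrAdvInterval.

```python
def ra_chances_before_expiry(rdnss_lifetime, max_rtr_adv_interval):
    """Count RAs sent strictly before the RDNSS entry expires.

    Hypothetical helper. Assumes integer seconds and worst-case spacing:
    RAs at max_rtr_adv_interval, 2 * max_rtr_adv_interval, ... after the
    RA that installed the entry.
    """
    if rdnss_lifetime <= 0 or max_rtr_adv_interval <= 0:
        raise ValueError("both values must be positive")
    # Largest i with i * interval < lifetime.
    return (rdnss_lifetime - 1) // max_rtr_adv_interval


if __name__ == "__main__":
    print(ra_chances_before_expiry(20, 10))   # lifetime = 2 x interval
    print(ra_chances_before_expiry(100, 10))  # lifetime = 10 x interval
```

Under these assumptions, a lifetime of twice the interval leaves a single RA to refresh the entry in time, so one lost packet is enough to hit the expiry; at ten times the interval there are nine.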

Mathieu Trudel-Lapierre (cyphermox) wrote :

Attaching the proposed patch on the redhat bug; I can build a package for testing if required ;)

tags: added: patch patch-forwarded-upstream
Changed in network-manager (Ubuntu):
status: New → In Progress
importance: Undecided → Medium
assignee: nobody → Mathieu Trudel-Lapierre (mathieu-tl)
Changed in network-manager (Ubuntu Precise):
status: New → In Progress
importance: Undecided → Medium
assignee: nobody → Mathieu Trudel-Lapierre (mathieu-tl)
Launchpad Janitor (janitor) wrote :

This bug was fixed in the package network-manager - 0.9.4.0-0ubuntu5

---------------
network-manager (0.9.4.0-0ubuntu5) quantal; urgency=low

  * debian/network-manager.upstart: add "and static-network-up" to ensure the
    loopback device is really up before we start dnsmasq. (LP: #993379)
  * debian/patches/git_kernel_ipv6_default_route_77de91e.patch: avoid fighting
    with the kernel for what IPv6 default route should be set: let the kernel
    set his own, then add a new route with a different metric so that we can
    go back and remove it later. (LP: #988183)
  * debian/patches/nm-ip6-rs.patch: avoid disconnections due to RDNSS expiry,
    send a Router Sollicit to try and get new RDNSS data. (LP: #993571)
  * debian/patches/git_remove_ifpppstatsreq_6b64e4d.patch: remove the use of
    the ifpppstatsreq struct, which has been dropped in newer kernels: use
    ifreq and ppp_stats separately instead.
 -- Mathieu Trudel-Lapierre <email address hidden> Wed, 23 May 2012 15:28:36 -0400

Changed in network-manager (Ubuntu):
status: In Progress → Fix Released
Andy Whitcroft (apw) wrote :

I have been running for a day with the packages in Mathieu's testing PPA and so far I have not seen a single network disconnect, whereas before I was seeing them approximately every 10m on average. Nice.

Stéphane Graber (stgraber) wrote :

Similar experience here, I've been running a few variants of the patch but the last one from Mathieu's PPA seems quite stable, no disconnect since I started using it yesterday afternoon.

Mathieu Trudel-Lapierre (cyphermox) wrote :

Attaching the "final" version of the patch, if it's useful to review as part of the SRU process.

description: updated
Chris Halse Rogers (raof) wrote : Please test proposed package

Hello Stéphane, or anyone else affected,

Accepted network-manager into precise-proposed. The package will build now and be available in a few hours. Please test and give feedback here. See https://wiki.ubuntu.com/Testing/EnableProposed for documentation on how to enable and use -proposed. Thank you in advance!

Changed in network-manager (Ubuntu Precise):
status: In Progress → Fix Committed
tags: added: verification-needed
Stéphane Graber (stgraber) wrote :

Switched to the official -proposed package now, everything is still working as expected, no disconnect so far.

tags: added: verification-done
removed: verification-needed
Andy Whitcroft (apw) wrote :

Also confirming a switch to the -proposed package, everything still working on my network.

Launchpad Janitor (janitor) wrote :

This bug was fixed in the package network-manager - 0.9.4.0-0ubuntu4.1

---------------
network-manager (0.9.4.0-0ubuntu4.1) precise-proposed; urgency=low

  * debian/network-manager.upstart: add "and static-network-up" to ensure the
    loopback device is really up before we start dnsmasq. (LP: #993379)
  * debian/patches/git_kernel_ipv6_default_route_77de91e.patch: avoid fighting
    with the kernel for what IPv6 default route should be set: let the kernel
    set his own, then add a new route with a different metric so that we can
    go back and remove it later. (LP: #988183)
  * debian/patches/nm-ip6-rs.patch: avoid disconnections due to RDNSS expiry,
    send a Router Sollicit to try and get new RDNSS data. (LP: #993571)
 -- Mathieu Trudel-Lapierre <email address hidden> Thu, 24 May 2012 20:39:12 -0400

Changed in network-manager (Ubuntu Precise):
status: Fix Committed → Fix Released
Changed in network-manager (Fedora):
importance: Unknown → Medium
status: Unknown → Fix Released