nm-applet is calling poll() at ~10Hz

Bug #934117 reported by Colin Ian King
12
This bug affects 2 people
Affects Status Importance Assigned to Milestone
The Ubuntu Power Consumption Project
Fix Released
Undecided
Unassigned
network-manager (Ubuntu)
Fix Released
High
Mathieu Trudel-Lapierre

Bug Description

I've noticed that nm-applet is polling at ~10Hz. This seems to occur only when I am only connected to the network (ethernet or wireless). A strace of nm-applet is as follows:

recvfrom(3, 0x101e304, 4096, 0, 0, 0) = -1 EAGAIN (Resource temporarily unavailable)
poll([{fd=4, events=POLLIN}, {fd=3, events=POLLIN}, {fd=5, events=POLLIN}, {fd=6, events=POLLIN}, {fd=8, events=POLLIN}, {fd=7, events=POLLIN}], 6, 100) = 0 (Timeout)
read(4, 0x7fff27127870, 16) = -1 EAGAIN (Resource temporarily unavailable)
recvfrom(3, 0x101e304, 4096, 0, 0, 0) = -1 EAGAIN (Resource temporarily unavailable)
poll([{fd=4, events=POLLIN}, {fd=3, events=POLLIN}, {fd=5, events=POLLIN}, {fd=6, events=POLLIN}, {fd=8, events=POLLIN}, {fd=7, events=POLLIN}], 6, 0) = 0 (Timeout)
read(4, 0x7fff27127870, 16) = -1 EAGAIN (Resource temporarily unavailable)
recvfrom(3, 0x101e304, 4096, 0, 0, 0) = -1 EAGAIN (Resource temporarily unavailable)
poll([{fd=4, events=POLLIN}, {fd=3, events=POLLIN}, {fd=5, events=POLLIN}, {fd=6, events=POLLIN}, {fd=8, events=POLLIN}, {fd=7, events=POLLIN}], 6, 100) = 0 (Timeout)
read(4, 0x7fff27127870, 16) = -1 EAGAIN (Resource temporarily unavailable)
recvfrom(3, 0x101e304, 4096, 0, 0, 0) = -1 EAGAIN (Resource temporarily unavailable)
poll([{fd=4, events=POLLIN}, {fd=3, events=POLLIN}, {fd=5, events=POLLIN}, {fd=6, events=POLLIN}, {fd=8, events=POLLIN}, {fd=7, events=POLLIN}], 6, 0) = 0 (Timeout)

So it seems we have a poll using 100 millisecond timeout and also a 0 millisecond timeout. The 0 millisecond timeout is basically a busy poll which seems a little excessive. The 100 millisecond timeout is the root cause of the 10Hz events/second wakeup.

This is causing excessive wakeups which keeps the CPU out of deep C sleep states and hence reduces battery life. Can this be fixed?

description: updated
Revision history for this message
Martin Pitt (pitti) wrote :

Mathieu, could you please give this a quick look and check if it is easy to fix? I suppose it should only listen to D-BUS events which shouldn't poll so often, or is it also busy-polling the soft-kill files?

Changed in network-manager-applet (Ubuntu):
assignee: nobody → Mathieu Trudel-Lapierre (mathieu-tl)
Revision history for this message
Mathieu Trudel-Lapierre (cyphermox) wrote :

So this would be mostly related to just killswitch states or something?

There was at least one case where the menu would be updated at every signal strength change, but I've already gotten rid of that as being too expensive in the last upload or so. Otherwise it's correct that the applet normally should be doing nothing but changing based on DBus messages from the NM daemon.

Sorry, I'll need to look more into it (which I've just started), but looking at the strace output like this tells me nothing at all :)

Revision history for this message
Launchpad Janitor (janitor) wrote :

Status changed to 'Confirmed' because the bug affects multiple users.

Changed in network-manager-applet (Ubuntu):
status: New → Confirmed
Revision history for this message
Christian Mertes (cmertes) wrote :

For me this is easy to reproduce:

Just do a "killall wpa_supplicant". Network Manager will reconnect successfully (and a new wpa_supplicant process will be created) but from now on nm-applet will wake up ten times per second to poll for something that doesn't seem to be there.

Revision history for this message
Colin Ian King (colin-king) wrote :

I belive the poll in question is:

(gdb) where
#0 0x00007f74414c2d40 in poll () from /lib/x86_64-linux-gnu/libc.so.6
#1 0x00007f7441a08136 in ?? () from /lib/x86_64-linux-gnu/libglib-2.0.so.0
#2 0x00007f7441a0859a in g_main_loop_run () from /lib/x86_64-linux-gnu/libglib-2.0.so.0
#3 0x00000000004143a7 in main ()

I see a poll with a zero timeout, followed by a failed read on an eventfd and then a 100ms poll with a timeout:

poll([{fd=4, events=POLLIN}, {fd=3, events=POLLIN}, {fd=5, events=POLLIN}, {fd=6, events=POLLIN}, {fd=9, events=POLLIN}, {fd=7, events=POLLIN}], 6, 0) = 0 (Timeout)
read(4, 0x7fffd02fb220, 16) = -1 EAGAIN (Resource temporarily unavailable)
recvfrom(3, 0x15f1ef4, 4096, 0, 0, 0) = -1 EAGAIN (Resource temporarily unavailable)
poll([{fd=4, events=POLLIN}, {fd=3, events=POLLIN}, {fd=5, events=POLLIN}, {fd=6, events=POLLIN}, {fd=9, events=POLLIN}, {fd=7, events=POLLIN}], 6, 100) = 0 (Timeout)

The fd's are as follow:

lr-x------ 1 king king 64 Feb 27 09:04 0 -> /dev/null
l-wx------ 1 king king 64 Feb 27 09:04 1 -> /home/king/.xsession-errors
lrwx------ 1 king king 64 Feb 27 09:04 10 -> anon_inode:[eventfd]
lrwx------ 1 king king 64 Feb 27 09:04 11 -> anon_inode:[eventfd]
l-wx------ 1 king king 64 Feb 27 09:04 2 -> /home/king/.xsession-errors
lrwx------ 1 king king 64 Feb 27 09:04 3 -> socket:[13455]
lr-x------ 1 king king 64 Feb 27 09:04 4 -> anon_inode:[eventfd]
l-wx------ 1 king king 64 Feb 27 09:04 5 -> socket:[13460]
l-wx------ 1 king king 64 Feb 27 09:04 6 -> socket:[13249]
lrwx------ 1 king king 64 Feb 27 09:04 7 -> socket:[13251]
lrwx------ 1 king king 64 Feb 27 09:04 8 -> socket:[13592]
lrwx------ 1 king king 64 Feb 27 09:04 9 -> socket:[13600]

Revision history for this message
Colin Ian King (colin-king) wrote :

So on a clean install I was connected to the network by ethernet and nm-applet behaved w/o the 10Hz polling. I then associated the wireless and I can now see the 10Hz polling. Not sure if that is a useful data point.

Revision history for this message
Mathieu Trudel-Lapierre (cyphermox) wrote :

Yeah, that and the above comment means it's probably something to do with how wifi gets updated; I'll dig again into that part.

Revision history for this message
Mathieu Trudel-Lapierre (cyphermox) wrote :

So I believe I managed to trace this down to the animation functions for the applet, when a device changes state. start_animation_timeout() gets called, and somehow clear_animation_timeout() never gets called to remove the 100ms timeout once the devices are done activating.

I think this is probably some issue at the NetworkManager level rather than really in nm-applet, but I believe there's something we can do in the applet as well to mitigate such an issue.

Revision history for this message
Mathieu Trudel-Lapierre (cyphermox) wrote :

Confirmed to be an isue in libnm-glib; reassigning to NetworkManager

affects: network-manager-applet (Ubuntu) → network-manager (Ubuntu)
Changed in network-manager (Ubuntu):
importance: Undecided → High
status: Confirmed → In Progress
Revision history for this message
Launchpad Janitor (janitor) wrote :

This bug was fixed in the package network-manager - 0.9.2.0+git201202161854.8572ecf-0ubuntu5

---------------
network-manager (0.9.2.0+git201202161854.8572ecf-0ubuntu5) precise; urgency=low

  * debian/patches/git_libnm-glib_ensure_device_state_aef4340.patch: make sure
    we also update the device state in libnm-glib when the state-changed signal
    gets fired. (LP: #934117)
  * debian/control: update Vcs-Bzr; renamed the packaging branch.
  * debian/patches/dnsmasq-dont-read-hosts.patch: avoid dnsmasq caching old
    entries from /etc/hosts because they were listed when dnsmasq was started:
    have dnsmasq not read /etc/hosts at all, the libc resolver can properly
    take care of it on its own. (LP: #943339)
 -- Mathieu Trudel-Lapierre <email address hidden> Thu, 01 Mar 2012 09:36:08 -0500

Changed in network-manager (Ubuntu):
status: In Progress → Fix Released
Revision history for this message
Colin Ian King (colin-king) wrote :

Thanks for fixing this!

Changed in ubuntu-power-consumption:
status: New → Fix Released
To post a comment you must log in.
This report contains Public information  
Everyone can see this information.

Other bug subscribers

Remote bug watches

Bug watches keep track of this bug in other bug trackers.