whoopsie takes 100% CPU on the phone

Bug #1211417 reported by Omer Akram
This bug affects 2 people
Affects                Status         Importance   Assigned to   Milestone
Whoopsie               Fix Released   Undecided    Unassigned
touch-preview-images   Fix Released   High         Unassigned
whoopsie (Ubuntu)      Fix Released   Critical     Unassigned

Bug Description

device: mako
Whoopsie 0.2.20

I just noticed whoopsie taking 100% CPU on two of the devices in the lab. It's quite critical since it may kill our phones.

Tags: qa-touch
Paul Larson (pwlars) wrote :

I'm seeing this locally on my mako as well, with the 20130812 image. Strace has these messages being repeated very rapidly in a loop:
poll([{fd=4, events=POLLIN}, {fd=3, events=POLLIN}, {fd=5, events=POLLIN}], 3, 6798485) = 1 ([{fd=3, revents=POLLNVAL}])
clock_gettime(CLOCK_MONOTONIC, {7622, 516167266}) = 0
clock_gettime(CLOCK_MONOTONIC, {7622, 516380909}) = 0
poll([{fd=4, events=POLLIN}, {fd=3, events=POLLIN}, {fd=5, events=POLLIN}], 3, 6798484) = 1 ([{fd=3, revents=POLLNVAL}])
clock_gettime(CLOCK_MONOTONIC, {7622, 516869234}) = 0
clock_gettime(CLOCK_MONOTONIC, {7622, 517082877}) = 0
poll([{fd=4, events=POLLIN}, {fd=3, events=POLLIN}, {fd=5, events=POLLIN}], 3, 6798483) = 1 ([{fd=3, revents=POLLNVAL}])
clock_gettime(CLOCK_MONOTONIC, {7622, 517571203}) = 0
clock_gettime(CLOCK_MONOTONIC, {7622, 517754325}) = 0
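
For readers unfamiliar with the pattern in the strace output above: POLLNVAL means a descriptor in the poll set (fd 3 here) is not open, and poll() reports that immediately instead of waiting out the timeout, so a main loop that keeps polling the same set spins at full CPU. The Python sketch below reproduces only that kernel behaviour with a throwaway pipe. It is not whoopsie or glib code, the helper name poll_closed_fd is made up for illustration, and the report itself does not say what left fd 3 invalid.

```python
import os
import select
import time


def poll_closed_fd(timeout_ms=2000):
    """Poll a descriptor that was closed after being registered.

    Returns (fd, events, elapsed_seconds).
    Hypothetical helper, for illustration only.
    """
    read_fd, write_fd = os.pipe()
    poller = select.poll()
    poller.register(read_fd, select.POLLIN)

    # Close both ends but leave the stale number in the poll set, so the
    # kernel sees an invalid descriptor, as strace reports for fd 3.
    os.close(read_fd)
    os.close(write_fd)

    start = time.monotonic()
    events = poller.poll(timeout_ms)  # returns at once instead of sleeping
    elapsed = time.monotonic() - start
    return read_fd, events, elapsed


fd, events, elapsed = poll_closed_fd()
for event_fd, revents in events:
    print(event_fd, "POLLNVAL" if revents & select.POLLNVAL else revents)
print("poll returned after %.3f s" % elapsed)
```

A loop that calls poll() again without dropping the stale descriptor gets the same answer every time, which matches the repeated poll/clock_gettime pairs in the trace.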

tags: added: qa-touch
Omer Akram (om26er)
description: updated
Changed in touch-preview-images:
importance: Undecided → High
Paul Larson (pwlars) wrote :

Whoopsie 0.2.20

Omer Akram (om26er)
Changed in whoopsie (Ubuntu):
importance: Undecided → Critical
status: New → Confirmed
Omer Akram (om26er)
description: updated
Evan (ev) wrote :

We've just landed a new glib. Can you check to see if you have libglib2.0-0 2.37.5-1ubuntu1 installed, and whether downgrading to a version prior fixes the issue? Whoopsie isn't polling fds itself, so I suspect the problem lies a level below.

I've been unable to trigger this on a Nexus 4 so far. If anyone has steps to reproduce the problem, I'd welcome them :)

Changed in whoopsie (Ubuntu):
status: Confirmed → Incomplete
Evan (ev) wrote :

Also, please attach `grep whoopsie /var/log/syslog` and the resultant /var/crash/*whoopsie*.crash file when you pkill -ABRT whoopsie.

Thanks!

Alan Pope 🍺🐧🐱 🦄 (popey) wrote :

Getting this on my mako too.

top - 12:45:24 up 1 min, 0 users, load average: 2.36, 1.08, 0.41
Tasks: 201 total, 3 running, 198 sleeping, 0 stopped, 0 zombie
%Cpu(s): 13.7 us, 12.3 sy, 0.0 ni, 74.1 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st
KiB Mem: 1916236 total, 498360 used, 1417876 free, 19144 buffers
KiB Swap: 102396 total, 0 used, 102396 free, 232960 cached

  PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
 1277 whoopsie 20 0 39616 3768 2784 R 99.6 0.2 1:38.30 whoopsie
 1870 root 20 0 2484 1084 708 R 1.9 0.1 0:00.18 top

root@ubuntu-phablet:/# grep whoopsie /var/log/syslog
Aug 13 09:51:11 ubuntu-phablet whoopsie[1360]: whoopsie 0.2.20 starting up.
Aug 13 09:51:11 ubuntu-phablet whoopsie[1360]: Using lock path: /var/lock/whoopsie/lock
Aug 13 09:51:11 ubuntu-phablet whoopsie[1382]: offline
Aug 13 10:19:37 ubuntu-phablet whoopsie[1217]: whoopsie 0.2.20 starting up.
Aug 13 10:19:37 ubuntu-phablet whoopsie[1217]: Using lock path: /var/lock/whoopsie/lock
Aug 13 10:19:38 ubuntu-phablet whoopsie[1274]: offline
Aug 13 10:31:17 ubuntu-phablet kernel: [ 705.135968] init: whoopsie main process ended, respawning
Aug 13 10:31:17 ubuntu-phablet whoopsie[2454]: whoopsie 0.2.20 starting up.
Aug 13 10:31:17 ubuntu-phablet whoopsie[2454]: Using lock path: /var/lock/whoopsie/lock
Aug 13 10:31:20 ubuntu-phablet whoopsie[2455]: online
Aug 13 10:38:35 ubuntu-phablet whoopsie[2455]: online
Aug 13 10:39:04 ubuntu-phablet whoopsie[2455]: online
Aug 13 10:39:07 ubuntu-phablet whoopsie[2455]: online
Aug 13 10:59:34 ubuntu-phablet whoopsie[2455]: online
Aug 13 10:59:59 ubuntu-phablet whoopsie[2455]: online
Aug 13 10:59:59 ubuntu-phablet whoopsie[2455]: online
Aug 13 12:43:27 ubuntu-phablet whoopsie[1251]: whoopsie 0.2.20 starting up.
Aug 13 12:43:27 ubuntu-phablet whoopsie[1251]: Using lock path: /var/lock/whoopsie/lock
Aug 13 12:43:28 ubuntu-phablet whoopsie[1277]: offline

There is no /var/crash/*whoopsie* crash file.

Alan Pope 🍺🐧🐱 🦄 (popey) wrote :

I was running this version of libglib2.0-0 when it happened.

root@ubuntu-phablet:/# apt-cache policy libglib2.0-0
libglib2.0-0:
  Installed: 2.37.3-1ubuntu2
  Candidate: 2.37.5-1ubuntu1
  Version table:
     2.37.5-1ubuntu1 0
        500 http://ports.ubuntu.com/ubuntu-ports/ saucy/main armhf Packages
 *** 2.37.3-1ubuntu2 0
        100 /var/lib/dpkg/status

Alan Pope 🍺🐧🐱 🦄 (popey) wrote :
Evan (ev) wrote :

Alan's crash, retraced.

Alan Pope 🍺🐧🐱 🦄 (popey) wrote :

root@ubuntu-phablet:/# ls -al /var/crash
total 2428
drwxrwsrwt 2 root whoopsie 4096 Aug 13 13:31 .
drwxr-xr-x 13 root root 4096 Aug 13 09:50 ..
-rw-rw---- 1 root whoopsie 0 Aug 13 09:51 .lock
-rw-r----- 1 root whoopsie 1679155 Aug 13 12:43 _usr_bin_powerd.0.crash
-rw-r----- 1 whoopsie whoopsie 635766 Aug 13 13:31 _usr_bin_whoopsie.106.crash
-rw-r----- 1 root whoopsie 157882 Aug 13 10:33 _usr_lib_arm-linux-gnueabihf_ubuntu-location-service_examples_client.0.crash

Evan (ev) wrote :

A better stack trace. This definitely looks like a glib problem. I'm still investigating, but I haven't been able to reproduce it since the first couple of times.

Evan (ev)
Changed in glib2.0 (Ubuntu):
importance: Undecided → Critical
James Ramsay (f-jack)
Changed in whoopsie (Ubuntu):
status: Incomplete → Confirmed
Changed in glib2.0 (Ubuntu):
status: New → Confirmed
Changed in touch-preview-images:
status: New → Confirmed
Changed in whoopsie:
status: New → Confirmed
Evan (ev) wrote :

For what it's worth, I've been unable to reproduce this at all on an up-to-date system. That could just be because this is racy, but I would've expected to see it by now.

Evan (ev) wrote :

[18:06:03] <ev> desrt: would you be able to judge whether the problem lies in glib or in whoopsie from that stack trace?
[18:06:46] <ev> (the final one, that is - it's got the clearest listing)
[18:07:04] <desrt> this stack trace is ... not legit
[18:07:15] <desrt> #4 g_main_context_dispatch (context=0x40531cd5 <g_pattern_spec_free+8>) at /build/buildd/glib2.0-2.37.5/./glib/gmain.c:3641 No locals. #5 0x4054132a in g_test_build_filename_va (file_type=<optimized out>, first_path=<optimized out>, ap=...) at /build/buildd/glib2.0-2.37.5/./glib/gtestutils.c:2938
[18:07:20] <desrt> this is not possible....
[18:07:43] <desrt> neither of those functions call each other
[18:07:47] <desrt> even indirectly
[18:07:55] <ev> https://launchpadlibrarian.net/147473611/whoopsie2.crash
[18:08:52] <ev> but yeah, perhaps we've corrupted the stack inside whoopsie as gdb hints at
[18:09:22] <desrt> i'd whip out valgrind as a first step...
[18:09:40] <ev> yeah
[18:09:44] <ev> will do in the morn'

Evan (ev) wrote :

Valgrind isn't showing anything particularly relevant. The uninitialised value error is present in all bzr revisions of whoopsie when using this version of libc (2.17-91ubuntu1 - probably earlier ones too as I just upgraded it).

Evan (ev)
no longer affects: glib2.0 (Ubuntu)
Changed in whoopsie:
status: Confirmed → Fix Released
Evan (ev) wrote :

Fixed in 0.2.23

Changed in whoopsie (Ubuntu):
status: Confirmed → Fix Released
Evan (ev)
Changed in touch-preview-images:
status: Confirmed → Fix Released