Unity8 completely frozen (unable to unlock, receive calls, etc)

Bug #1417773 reported by Ricardo Salveti
This bug report is a duplicate of:  Bug #1421009: unity8 sometimes hangs on boot. Edit Remove
6
This bug affects 1 person
Affects Status Importance Assigned to Milestone
Canonical System Image
New
Undecided
Unassigned
Mir
Invalid
Critical
Unassigned
mir (Ubuntu)
Invalid
Undecided
Unassigned
unity8 (Ubuntu RTM)
New
Undecided
Unassigned

Bug Description

phablet@ubuntu-phablet:~$ system-image-cli -i
current build number: 224
device name: krillin
channel: ubuntu-touch/ubuntu-rtm/14.09-proposed
last update: 2015-01-29 18:53:09
version version: 224
version ubuntu: 20150129
version device: 20150129-c75dcfb
version custom: 20150129-528-26-182

Was driving around with my car, and noticed once I got back home that my clock was stuck in a previous time (~20min before getting home), and after trying to use the phone I noticed unity8 was completely stuck, but the underling OS was still running fine.

The worst here is that we're unable to notify the user about calls, as the interface is completely stuck. Another bad thing is that this could as well happen when the user is sleeping or something, without an easy way to know until he tries to use the phone.

Revision history for this message
Ricardo Salveti (rsalveti) wrote :
Download full text (5.2 KiB)

Looking at my syslog at the exact minute that got stuck (time presented by the greeter), I wasn't really using the phone (it was trying to suspend), and I also noticed the GPS was reporting issues, not sure if related or not though:
Feb 3 18:27:08 ubuntu-phablet kernel: [43408.707790][pm_notifier_call_chain]: there are 4 notify callbacks, event = 4
Feb 3 18:27:08 ubuntu-phablet kernel: [43408.707853]msdc: 0,mmc_schedule_delayed_work ret= 1
Feb 3 18:27:08 ubuntu-phablet kernel: [43408.707918]msdc: 1,mmc_schedule_delayed_work ret= 1
Feb 3 18:27:08 ubuntu-phablet kernel: [43408.709977]PM: suspend exit 2015-02-03 20:27:08.594582999 UTC
Feb 3 18:27:08 ubuntu-phablet kernel: [43408.710031]PM: suspend entry 2015-02-03 20:27:08.594646769 UTC
Feb 3 18:27:08 ubuntu-phablet kernel: [43408.710042]PM: Syncing filesystems ... done.
Feb 3 18:27:08 ubuntu-phablet kernel: [43408.747749][pm_notifier_call_chain]: there are 4 notify callbacks, event = 3
Feb 3 18:27:08 ubuntu-phablet kernel: [43408.747780]Freezing user space processes ... [GPS] GPS_read: signaled by -ERESTARTSYS(-512)
Feb 3 18:27:08 ubuntu-phablet kernel: [43408.748746]
Feb 3 18:27:08 ubuntu-phablet kernel: [43408.748831]last active wakeup source: eventpoll
Feb 3 18:27:08 ubuntu-phablet kernel: [43408.748843]
Feb 3 18:27:08 ubuntu-phablet kernel: [43408.748856]Freezing of tasks aborted after 0.001 seconds (38 tasks refusing to freeze, wq_busy=0):
Feb 3 18:27:08 ubuntu-phablet kernel: [43408.748870]
Feb 3 18:27:08 ubuntu-phablet kernel: [43408.748876]Restarting tasks ... done.
Feb 3 18:27:08 ubuntu-phablet kernel: [43408.757297][pm_notifier_call_chain]: there are 4 notify callbacks, event = 4
Feb 3 18:27:08 ubuntu-phablet kernel: [43408.757363]msdc: 0,mmc_schedule_delayed_work ret= 1
Feb 3 18:27:08 ubuntu-phablet kernel: [43408.757431]msdc: 1,mmc_schedule_delayed_work ret= 1
Feb 3 18:27:08 ubuntu-phablet kernel: [43408.759542]PM: suspend exit 2015-02-03 20:27:08.644146461 UTC
Feb 3 18:27:08 ubuntu-phablet kernel: [43408.759598]PM: suspend entry 2015-02-03 20:27:08.644212999 UTC
Feb 3 18:27:08 ubuntu-phablet kernel: [43408.759609]PM: Syncing filesystems ... done.
Feb 3 18:27:08 ubuntu-phablet kernel: [43408.802887][pm_notifier_call_chain]: there are 4 notify callbacks, event = 3
Feb 3 18:27:08 ubuntu-phablet kernel: [43408.802918]Freezing user space processes ... [GPS] GPS_read: signaled by -ERESTARTSYS(-512)
Feb 3 18:27:08 ubuntu-phablet kernel: [43408.804001]
Feb 3 18:27:08 ubuntu-phablet kernel: [43408.804230]last active wakeup source: eventpoll
Feb 3 18:27:08 ubuntu-phablet kernel: [43408.804243]
Feb 3 18:27:08 ubuntu-phablet kernel: [43408.804256]Freezing of tasks aborted after 0.001 seconds (38 tasks refusing to freeze, wq_busy=0):
Feb 3 18:27:08 ubuntu-phablet kernel: [43408.804270]
Feb 3 18:27:08 ubuntu-phablet kernel: [43408.804276]Restarting tasks ... done.
Feb 3 18:27:08 ubuntu-phablet kernel: [43408.811308][pm_notifier_call_chain]: there are 4 notify callbacks, event = 4
Feb 3 18:27:08 ubuntu-phablet kernel: [43408.811369]msdc: 0,mmc_schedule_delayed_work ret= 1
Feb 3 18:27:08 ubuntu-phablet kernel: [43408.811433]msdc: 1,mmc_schedule_delayed_work ret= ...

Read more...

description: updated
Revision history for this message
Ricardo Salveti (rsalveti) wrote :
Download full text (7.0 KiB)

The stacktrace:

#0 __libc_do_syscall () at ../ports/sysdeps/unix/sysv/linux/arm/libc-do-syscall.S:43
No locals.
#1 0xb60021e4 in __lll_lock_wait (futex=futex@entry=0xc91d18, private=<optimized out>) at ../ports/sysdeps/unix/sysv/linux/arm/nptl/lowlevellock.c:46
        _a2 = 128
        _a3 = 2
        _a1 = 13180184
        _a1tmp = 13180184
        _a4 = 0
        _name = 240
        __ret = 13180184
        oldval = <optimized out>
#2 0xb5ffe9a6 in __GI___pthread_mutex_lock (mutex=0xc91d18) at pthread_mutex_lock.c:114
        __futex = 0xc91d18
        id = <optimized out>
        __PRETTY_FUNCTION__ = "__pthread_mutex_lock"
        type = <optimized out>
        id = <optimized out>
#3 0xb387c532 in _dbus_connection_lock (connection=0xc91c80) at ../../dbus/dbus-connection.c:397
No locals.
#4 0xb3889cfe in _dbus_pending_call_get_connection_and_lock (pending=pending@entry=0x9d6b3bc0) at ../../dbus/dbus-pending-call.c:334
No locals.
#5 0xb387efcc in reply_handler_timeout (data=0x9d6b3bc0) at ../../dbus/dbus-connection.c:3292
        connection = <optimized out>
        status = <optimized out>
        pending = 0x9d6b3bc0
#6 0xb3c9b28e in q_dbus_timeout_handle (timeout=<optimized out>) at qdbus_symbols_p.h:174
No locals.
#7 QDBusConnectionPrivate::timerEvent (this=0xc907f0, e=0xbed6f0f4) at qdbusintegrator.cpp:1134
        locker = {<QDBusMutexLocker> = {<QDBusLockerBase> = {<No data fields>}, self = 0xc907f0, mutex = 0xc9081c, action = TimerEventAction}, <No data fields>}
#8 0xb64f9ef2 in QObject::event (this=0xc907f0, e=0xbed6f0f4) at kernel/qobject.cpp:1263
No locals.
#9 0xb64d9318 in QCoreApplication::notify (this=<optimized out>, receiver=<optimized out>, event=<optimized out>) at kernel/qcoreapplication.cpp:997
        d = <optimized out>
#10 0xb64d93f4 in QCoreApplication::notifyInternal (this=0xbd4b88, receiver=0xc907f0, event=event@entry=0xbed6f0f4) at kernel/qcoreapplication.cpp:935
        threadData = 0xbd4c30
        loopLevelCounter = {threadData = 0xbd4c30}
        result = false
        cbdata = {0xc907f0, 0xbed6f0f4, 0xbed6f08f}
        d = <optimized out>
#11 0xb6513c2e in sendEvent (event=0xbed6f0f4, receiver=<optimized out>) at ../../include/QtCore/../../src/corelib/kernel/qcoreapplication.h:237
No locals.
#12 QTimerInfoList::activateTimers (this=0xbf463c) at kernel/qtimerinfo_unix.cpp:643
        e = {<QEvent> = {_vptr.QEvent = 0xb668ff38 <vtable for QTimerEvent+8>, static staticMetaObject = {d = {superdata = 0x0, stringdata = 0xb6653118 <qt_meta_stringdata_QEvent>, data = 0xb6652b94 <qt_meta_data_QEvent>, static_metacall = 0x0, relatedMetaObjects = 0x0,
                extradata = 0x0}}, d = 0x0, t = 1, posted = 0, spont = 0, m_accept = 1, reserved = 0}, id = 8}
        currentTimerInfo = 0x17192a8
        n_act = 3
        maxCount = <optimized out>
        currentTime = {tv_sec = 43455, tv_nsec = <optimized out>}
#13 0xb6513f64 in timerSourceDispatch (source=<optimized out>) at kernel/qeventdispatcher_glib.cpp:185
        timerSource = <optimized out>
#14 idleTimerSourceDispatch (source=<optimized out>) at kernel/qeventdispatcher_glib.cpp:232
        timerSource = <optimized out>
#15 0xb5ce1f50 in g_main_d...

Read more...

Revision history for this message
Ricardo Salveti (rsalveti) wrote :

Stacktrace looks similar with the one reported by bug 1415141 (but in that case just the dash was stuck).

Revision history for this message
Michał Sawicz (saviq) wrote :

Here's a "t a a bt" out of a frozen unity8 instance. Not seeing an entry similar to yours, suspect are threads 25 and 2... Although a bunch of others look blocked, that might just be a result, not a cause.

Revision history for this message
Michał Sawicz (saviq) wrote :

Got a lockup on boot, here's a another gdb "t a a bt".

Revision history for this message
Michał Sawicz (saviq) wrote :

<duflu> Fun. Looks like some kind of deadlock in Mir's GlibMainLoop with Mir's framedropping logic

Changed in mir:
status: New → Triaged
importance: Undecided → Critical
milestone: none → 0.12.0
Revision history for this message
Daniel van Vugt (vanvugt) wrote :

I think we might have two unrelated deadlocks here.

rsalveti's is a dbus deadlock.
saviq's is a Mir deadlock.

Revision history for this message
Alexandros Frantzis (afrantzis) wrote :

> <duflu> Fun. Looks like some kind of deadlock in Mir's GlibMainLoop with Mir's framedropping logic

I see no indication of this in the backtraces.

> I think we might have two unrelated deadlocks here.
>
> rsalveti's is a dbus deadlock.
> saviq's is a Mir deadlock.

Not sure if Saviq's issue is a pure Mir deadlock. In Saviq's backtrace, threads 1 and 11(both Qt related) and 15 (Mir waiting for a reply) seem suspicious.

Revision history for this message
kevin gunn (kgunn72) wrote :

spawing Mir bug 1421255 for the vivid findings

Revision history for this message
Alexandros Frantzis (afrantzis) wrote :

> <duflu> Fun. Looks like some kind of deadlock in Mir's GlibMainLoop with Mir's framedropping logic
>
> I see no indication of this in the backtraces.

Looking again, this could indeed be happening (in Saviq's traces).

Revision history for this message
Alexandros Frantzis (afrantzis) wrote :

> Looking again, this could indeed be happening (in Saviq's traces).

Thanks Alberto ;)

Revision history for this message
Cemil Azizoglu (cemil-azizoglu) wrote :

Spawned another Mir bug for Saviq's second backtrace as it looks unrelated to the first one.

https://bugs.launchpad.net/mir/+bug/1421308

Revision history for this message
Cemil Azizoglu (cemil-azizoglu) wrote :

rsalveti's original bug doesn't seem to be related to Mir. Marking it as Invalid for Mir.

Changed in mir:
status: Triaged → Invalid
Changed in mir (Ubuntu):
status: New → Invalid
Changed in mir:
milestone: 0.12.0 → none
Revision history for this message
Ricardo Salveti (rsalveti) wrote :

Got the same crash, output of 't a a bt' attached.

current build number: 234
device name: krillin
channel: ubuntu-touch/ubuntu-rtm/14.09-proposed
last update: 2015-02-14 00:43:00
version version: 234
version ubuntu: 20150211.1
version device: 20150211-74c2df2
version custom: 20150207-538-29-183

Revision history for this message
Daniel van Vugt (vanvugt) wrote :

Did we just split this bug into two; bug 1421255, bug 1421308
and leave this one representing nothing? :)

Is this now a duplicate?

Revision history for this message
Daniel van Vugt (vanvugt) wrote :

OK then, based on comment #14 we have three different deadlocks now :)

This one is roughly:
  (hung in some dbus calls)
from:
  QNetworkManagerInterfaceDeviceWireless::accessPointAdded

Revision history for this message
Alberto Aguirre (albaguirre) wrote :

Right Thread#1 and Thread#15 seem deadlock. There must be at least a couple of different mutexes used by the Qt DBus abstraction that are causing this but have not inspected the code fully yet.

Revision history for this message
Michał Sawicz (saviq) wrote :

OK the underlying cause for all of those seems to be a Qt dbus bug, marking dupe of one that has the most complete trace of all threads.

To post a comment you must log in.
This report contains Public information  
Everyone can see this information.

Other bug subscribers

Remote bug watches

Bug watches keep track of this bug in other bug trackers.