gnome-shell crashed with SIGSEGV in _cogl_boxed_value_set_x()

Bug #1715330 reported by Jean-Baptiste Lallement
This bug affects 4 people
Affects          Status        Importance  Assigned to  Milestone
Mutter           Fix Released  Critical
mutter (Ubuntu)  Fix Released  High        Unassigned

Bug Description

https://errors.ubuntu.com/problem/923e1f8ff89aa3ff451c6aec260ec590152cf01a

---

Artful Desktop daily, wayland session

Test Case:
Pre-requisites: A package update that downloads a payload, and the download fails (e.g. the flash plugin)
1. Wait until the update-notifier dialog informing the user about the download failure shows up
2. Click on the 'Execute' button
3. Enter your credentials
4. Proceed with the download

Expected result
The package is downloaded

Actual result
This crash happens when the authentication window is displayed

ProblemType: Crash
DistroRelease: Ubuntu 17.10
Package: gnome-shell 3.25.91-0ubuntu3
ProcVersionSignature: Ubuntu 4.12.0-12.13-generic 4.12.8
Uname: Linux 4.12.0-12-generic x86_64
ApportVersion: 2.20.7-0ubuntu1
Architecture: amd64
CurrentDesktop: ubuntu:GNOME
Date: Wed Sep 6 09:41:48 2017
DisplayManager: gdm3
ExecutablePath: /usr/bin/gnome-shell
GsettingsChanges:
 b'org.gnome.shell' b'had-bluetooth-devices-setup' b'true'
 b'org.gnome.shell' b'favorite-apps' b"['org.gnome.Nautilus.desktop', 'firefox.desktop', 'google-chrome-beta.desktop', 'streamtuner2.desktop']"
 b'org.gnome.desktop.interface' b'gtk-im-module' b"'gtk-im-context-simple'"
InstallationDate: Installed on 2014-07-23 (1140 days ago)
InstallationMedia: Ubuntu 14.04.1 LTS "Trusty Tahr" - Release amd64 (20140722.2)
ProcCmdline: /usr/bin/gnome-shell
ProcEnviron:
 PATH=(custom, no user)
 XDG_RUNTIME_DIR=<set>
 LANG=fr_FR.UTF-8
 SHELL=/bin/bash
SegvAnalysis:
 Segfault happened at: 0x7f72063804ef: mov 0x8(%rdi),%eax
 PC (0x7f72063804ef) ok
 source "0x8(%rdi)" (0x00000008) not located in a known VMA region (needed readable region)!
 destination "%eax" ok
SegvReason: reading NULL VMA
Signal: 11
SourcePackage: gnome-shell
StacktraceTop:
 ?? () from /usr/lib/x86_64-linux-gnu/mutter/libmutter-cogl-1.so
 ?? () from /usr/lib/x86_64-linux-gnu/mutter/libmutter-cogl-1.so
 ffi_call_unix64 () from /usr/lib/x86_64-linux-gnu/libffi.so.6
 ffi_call () from /usr/lib/x86_64-linux-gnu/libffi.so.6
 ?? () from /usr/lib/libgjs.so.0
Title: gnome-shell crashed with SIGSEGV in ffi_call_unix64()
UpgradeStatus: Upgraded to artful on 2017-06-13 (84 days ago)
UserGroups: adm dialout libvirt lpadmin lxd sambashare sudo

Jean-Baptiste Lallement (jibel) wrote :
information type: Private → Public
Apport retracing service (apport) wrote :

StacktraceTop:
 _cogl_boxed_value_set_x (bv=0x0, size=1, count=1, type=type@entry=COGL_BOXED_FLOAT, value_size=4, value=0x10080d8e800, transpose=0) at cogl-boxed-value.c:141
 _cogl_boxed_value_set_float (bv=<optimized out>, n_components=<optimized out>, count=<optimized out>, value=<optimized out>) at cogl-boxed-value.c:212
 ffi_call_unix64 () at ../src/x86/unix64.S:76
 ffi_call (cif=cif@entry=0x1007f034eb8, fn=<optimized out>, rvalue=<optimized out>, rvalue@entry=0x7ffed1542b68, avalue=avalue@entry=0x7ffed1542a20) at ../src/x86/ffi64.c:525
 gjs_invoke_c_function (context=context@entry=0x1007d9a7000, function=function@entry=0x1007f034ea0, obj=..., obj@entry=..., args=..., js_rval=..., r_value=r_value@entry=0x0) at gi/function.cpp:1037
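
The retraced frame shows bv=0x0, and the SegvAnalysis above reports a read from address 0x8. That pattern is what a field read through a NULL struct pointer looks like when the field sits 8 bytes into the struct. A minimal sketch, assuming 4-byte int; ExampleBoxedValue and both helper functions are hypothetical stand-ins and are not the real Cogl definitions:

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a boxed value; NOT the real Cogl struct. */
typedef struct {
    int type;   /* offset 0 (assuming 4-byte int) */
    int size;   /* offset 4 */
    int count;  /* offset 8 */
} ExampleBoxedValue;

/* Address that reading base->count would touch, computed without
 * dereferencing anything. With base == 0 this is the fault address. */
uintptr_t example_count_address(uintptr_t base)
{
    return base + offsetof(ExampleBoxedValue, count);
}

/* Defensive variant: report an error instead of faulting on NULL. */
int example_get_count(const ExampleBoxedValue *bv, int *out)
{
    if (bv == NULL || out == NULL)
        return -1;
    *out = bv->count;
    return 0;
}
```

Calling example_count_address(0) gives 8 under the stated assumption, which matches the "0x8(%rdi)" source operand with a NULL %rdi. A NULL check like the one in example_get_count() would only hide the symptom; the later comments deal with why the caller ended up holding an invalid object in the first place.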

Apport retracing service (apport) wrote : Stacktrace.txt
Apport retracing service (apport) wrote : StacktraceSource.txt
Apport retracing service (apport) wrote : ThreadStacktrace.txt
Changed in gnome-shell (Ubuntu):
importance: Undecided → Medium
summary: - gnome-shell crashed with SIGSEGV in ffi_call_unix64()
+ gnome-shell crashed with SIGSEGV in _cogl_boxed_value_set_x()
tags: removed: need-amd64-retrace
Changed in gnome-shell (Ubuntu):
importance: Medium → High
status: New → Confirmed
Adam Williamson (awilliamson) wrote :

We're seeing what's probably the same crash in Fedora 27, though my reproducer is to start a VM in virt-manager:

https://bugzilla.redhat.com/show_bug.cgi?id=1490072

It was suggested to try 3.25.92 and see if that fixes it.

Adam Williamson (awilliamson) wrote :

There's some discussion upstream too at https://bugzilla.gnome.org/show_bug.cgi?id=787240 - but note that it's a confused bug report that starts out being about a different bug, which is reported in Launchpad as https://bugs.launchpad.net/ubuntu/+source/gnome-shell/+bug/1714330 . I will file a new upstream bug for the _cogl_boxed_value_set_x crash.

In , Adamw-c (adamw-c) wrote :

There's some discussion of this in https://bugzilla.gnome.org/show_bug.cgi?id=787240 , but that's really about another crash, so I figure this deserves its own report.

Julian Andres Klode (in #787240), myself and Mikhail Gavrilov (in https://bugzilla.redhat.com/show_bug.cgi?id=1490072 ), and Jean-Baptiste Lallement (in https://bugs.launchpad.net/ubuntu/+source/gnome-shell/+bug/1715330 ) have all seen gnome-shell crash this way (taking down our entire sessions, natch) - a segfault in mutter _cogl_boxed_value_set_x , apparently because somehow it's getting called with NULL as its target. Myself, Julian and Mikhail saw it when opening / starting VMs in virt-manager and Boxes; the Ubuntu reproducer seems to possibly involve some kind of Ubuntu-specific update thing, but these are the cited steps:

Test Case:
Pre-requisites: A package update that downloads a payload, and the download fails (e.g. the flash plugin)
1. Wait until the update-notifier dialog informing the user about the download failure shows up
2. Click on the 'Execute' button
3. Enter your credentials
4. Proceed with the download

Expected result
The package is downloaded

Actual result
This crash happens when the authentication window is displayed

Myself, Mikhail and Jean-Baptiste all reported from 3.25.91, but I think Julian may have tested with something more recent. I don't see any changes between 3.25.91 and 3.25.92 which really look like they ought to fix this in any case.

Adam Williamson (awilliamson) wrote :
affects: gnome-shell (Ubuntu) → mutter (Ubuntu)
In , Jonas Ådahl (jadahl) wrote :

The bug is in gnome-shell, so moving there.

In , Jonas Ådahl (jadahl) wrote :

Created attachment 359569
inhibitShortcuts: Don't destroy actor when hiding

The same dialog can be shown multiple times; to not crash when that
happens we need to avoid destroying the actor when hiding, otherwise
we'll crash when running code that assumes it is still valid.
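
The hazard described in this commit message can be modelled in a few lines. This is only an illustrative C sketch: the real change is in gnome-shell code that is not reproduced in this thread, and ExampleActor, ExampleDialog and every function below are hypothetical names.

```c
#include <stdbool.h>
#include <stdlib.h>

typedef struct { bool visible; } ExampleActor;          /* hypothetical */
typedef struct { ExampleActor *actor; } ExampleDialog;  /* hypothetical */

ExampleDialog *example_dialog_new(void)
{
    ExampleDialog *dialog = malloc(sizeof *dialog);
    if (dialog == NULL)
        return NULL;
    dialog->actor = calloc(1, sizeof *dialog->actor);
    if (dialog->actor == NULL) {
        free(dialog);
        return NULL;
    }
    return dialog;
}

/* Returns false when the actor is gone; code that assumed it was still
 * valid would dereference NULL at this point instead. */
bool example_dialog_show(ExampleDialog *dialog)
{
    if (dialog->actor == NULL)
        return false;
    dialog->actor->visible = true;
    return true;
}

/* Problematic pattern: hiding also destroys the actor. */
void example_dialog_hide_destroying(ExampleDialog *dialog)
{
    free(dialog->actor);
    dialog->actor = NULL;
}

/* Pattern from the commit message: hide, but keep the actor alive. */
void example_dialog_hide(ExampleDialog *dialog)
{
    if (dialog->actor != NULL)
        dialog->actor->visible = false;
}

void example_dialog_free(ExampleDialog *dialog)
{
    free(dialog->actor);
    free(dialog);
}
```

With example_dialog_hide() the same dialog can be shown any number of times; after example_dialog_hide_destroying() a second show has nothing left to operate on.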

In , Jonas Ådahl (jadahl) wrote :

To actually reproduce this reliably, the patch from bug 787570 is needed. On my setup, this exposes a gtk+/virt-manager bug resulting in the inhibit request being created and destroyed repeatedly.

In , Jonas Ådahl (jadahl) wrote :

Uh, must have slipped. Reopening.

Changed in mutter:
importance: Unknown → Critical
status: Unknown → Confirmed
In , Ofourdan (ofourdan) wrote :

The issue is that virt-manager issues a grab() when it gains keyboard focus, and an ungrab() when it loses the keyboard focus.

On Wayland, grab()/ungrab() are wired to the shortcut inhibit mechanism, which in turn shows the dialog. The dialog takes focus away from virt-manager, which then issues an ungrab(), and that cycle causes the continuous flickering.

The solution, or at least a compromise, is to keep the dialog around even if the shortcut inhibit is cancelled by the client, so that the user is forced to make a choice that we can reuse on the next request.

In , Ofourdan (ofourdan) wrote :

Created attachment 359585
[PATCH] wayland: Keep the inhibit shortcut dialog

On Wayland, the grab()/ungrab() in gtk+/gdk are wired to the shortcut
inhibitor mechanism, which in turn shows the dialog, which can take
focus away from the client window when the dialog is shown.

If the client issues an ungrab() when the keyboard focus is lost, we
would hide the dialog, causing the keyboard focus to be returned to the
client surface, which in turn would issue a new grab(), so forth and so
on, causing a continuous show/hide of the shortcut inhibitor dialog.

To avoid this issue, keep the dialog around even if the shortcut inhibit
is canceled by the client, so that the user is forced to make a choice
that we can reuse on the next request without showing the dialog again.

Instead of hiding the dialog when the shortcut inhibitor is destroyed by
the client, we simply mark the request as canceled and do not apply the
user's choice.
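
The behaviour described in this commit message can be sketched as a small state machine. This is a hedged model, not the mutter code: only the request_canceled flag name comes from the review excerpt in this thread, and ExampleInhibitData, ExampleChoice and the example_* functions are hypothetical.

```c
#include <stdbool.h>

typedef enum { CHOICE_NONE, CHOICE_ALLOW, CHOICE_DENY } ExampleChoice;

typedef struct {
    bool dialog_visible;
    bool request_canceled;      /* set when the client drops its request */
    bool inhibited;             /* shortcuts currently inhibited */
    ExampleChoice last_choice;  /* remembered answer, reused later */
    int times_shown;            /* how often the dialog was presented */
} ExampleInhibitData;

/* Client asks for shortcuts to be inhibited (grab). */
void example_client_grab(ExampleInhibitData *data)
{
    data->request_canceled = false;
    if (data->last_choice != CHOICE_NONE) {
        /* Reuse the earlier answer without showing the dialog again. */
        data->inhibited = (data->last_choice == CHOICE_ALLOW);
        return;
    }
    if (!data->dialog_visible) {
        data->dialog_visible = true;
        data->times_shown++;
    }
}

/* Client destroys its request (ungrab): mark it canceled, keep the dialog. */
void example_client_ungrab(ExampleInhibitData *data)
{
    data->request_canceled = true;
    data->inhibited = false;
}

/* User answers the dialog. */
void example_dialog_response(ExampleInhibitData *data, ExampleChoice choice)
{
    data->dialog_visible = false;
    data->last_choice = choice;
    if (data->request_canceled)
        return;                 /* remember the choice, do not apply it */
    data->inhibited = (choice == CHOICE_ALLOW);
}
```

In this model a client that alternates grab and ungrab on every focus change gets the dialog presented once, and the stored answer is applied on the next grab. Checking request_canceled in the response handler follows the placement suggested in the review of the first version of the patch.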

In , Ofourdan (ofourdan) wrote :

Created attachment 359586
[PATCH] wayland: do not leak shortcut inhibit data

We would free the shortcut inhibit data only when the client destroys
its request, which is not the case when the client itself is
destroyed, leading to a leak of the shortcut inhibit data.

Free the data on resource destruction instead, and simply destroy the
resource on destroy request.
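
The ownership change in this commit message can be illustrated with a toy resource type. This is a sketch under stated assumptions: ExampleResource and the example_* functions are hypothetical and do not use the libwayland API; they only model a resource that carries a destructor callback.

```c
#include <stddef.h>
#include <stdlib.h>

typedef struct ExampleResource ExampleResource;
typedef void (*ExampleDestroyFunc)(ExampleResource *resource);

struct ExampleResource {
    void *user_data;                /* stands in for the inhibit data */
    ExampleDestroyFunc destructor;  /* runs on every destruction path */
};

int example_live_data = 0;          /* counts allocated, unfreed data */

/* Resource destructor: the single place where the data is freed. */
void example_free_inhibit_data(ExampleResource *resource)
{
    free(resource->user_data);
    resource->user_data = NULL;
    example_live_data--;
}

ExampleResource *example_resource_new(void)
{
    ExampleResource *resource = malloc(sizeof *resource);
    if (resource == NULL)
        return NULL;
    resource->user_data = malloc(16);
    if (resource->user_data == NULL) {
        free(resource);
        return NULL;
    }
    resource->destructor = example_free_inhibit_data;
    example_live_data++;
    return resource;
}

void example_resource_destroy(ExampleResource *resource)
{
    if (resource == NULL)
        return;
    if (resource->destructor != NULL)
        resource->destructor(resource);
    free(resource);
}

/* Explicit destroy request: only destroys the resource. */
void example_handle_destroy_request(ExampleResource *resource)
{
    example_resource_destroy(resource);
}

/* Client went away: all of its resources are destroyed for it. */
void example_client_disconnected(ExampleResource **resources, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        example_resource_destroy(resources[i]);
        resources[i] = NULL;
    }
}
```

Because the data is freed in the destructor rather than in the request handler, both the explicit destroy request and the client disconnect path release it.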

In , Jonas Ådahl (jadahl) wrote :

Review of attachment 359585:

Looks good; just one nit about readability.

::: src/wayland/meta-wayland-inhibit-shortcuts-dialog.c
@@ +85,3 @@
+ if (data->request_canceled)
+ return;
+

Would make more sense if you put this condition in the dialog-response function. This being processed from the auto-approve path makes no sense.

In , Jonas Ådahl (jadahl) wrote :

Review of attachment 359586:

LGTM.

In , Ofourdan (ofourdan) wrote :

Created attachment 359592
[PATCH v2] wayland: Keep the inhibit shortcut dialog

(In reply to Jonas Ådahl from comment #8)
> Review of attachment 359585 [details] [review]:
>
> Looks good; just one nit about readability.
>
> ::: src/wayland/meta-wayland-inhibit-shortcuts-dialog.c
> @@ +85,3 @@
> + if (data->request_canceled)
> + return;
> +
>
> Would make more sense if you put this condition in the dialog-response
> function. This being processed from the auto-approve path makes no sense.

Yeap, that also simplifies the patch...

tags: added: rls-aa-incoming
In , Jonas Ådahl (jadahl) wrote :

Review of attachment 359592:

lgtm.

In , Ofourdan (ofourdan) wrote :

Comment on attachment 359586
[PATCH] wayland: do not leak shortcut inhibit data

attachment 359586 pushed to git master as commit 2bf7974 - wayland: do not leak shortcut inhibit data

In , Ofourdan (ofourdan) wrote :

Comment on attachment 359592
[PATCH v2] wayland: Keep the inhibit shortcut dialog

attachment 359592 pushed to git master as commit 9c16e4e - wayland: Keep the inhibit shortcut dialog

Changed in mutter:
status: Confirmed → Fix Released
Iain Lane (laney)
Changed in mutter (Ubuntu):
assignee: nobody → Jean-Baptiste Lallement (jibel)
tags: removed: rls-aa-incoming
In , Ofourdan (ofourdan) wrote :

*** Bug 768959 has been marked as a duplicate of this bug. ***

Jean-Baptiste Lallement (jibel) wrote :

I verified in artful with the latest mutter and cannot reproduce this defect. I'm marking it as fix released.

Changed in mutter (Ubuntu Artful):
status: Confirmed → Fix Released
Daniel van Vugt (vanvugt) wrote :

Not fixed?... It's still happening with the latest gnome-shell/mutter -> bug 1725162

Daniel van Vugt (vanvugt) wrote :

This crash continues to reoccur. See the duplicates and also:
https://errors.ubuntu.com/problem/923e1f8ff89aa3ff451c6aec260ec590152cf01a

Maybe we're linked to the wrong upstream bug and need a new one.

Changed in mutter (Ubuntu):
status: Fix Released → Confirmed
assignee: Jean-Baptiste Lallement (jibel) → nobody
Changed in mutter (Ubuntu Artful):
assignee: Jean-Baptiste Lallement (jibel) → nobody
status: Fix Released → Confirmed
description: updated
In , Daniel van Vugt (vanvugt) wrote :

Those commit NNNN links aren't working. Can someone assign this bug to "mutter" to fix that?

tags: added: bionic
Changed in mutter:
importance: Critical → Unknown
status: Fix Released → Unknown
no longer affects: mutter (Ubuntu Artful)
Changed in mutter (Ubuntu):
status: Confirmed → Fix Released
Changed in mutter:
importance: Unknown → Critical
status: Unknown → Fix Released