[SRU] Xorg crashes when performing gesture

Bug #670016 reported by Gustavo Luiz Duarte
22
This bug affects 2 people
Affects Status Importance Assigned to Milestone
xorg-server (Ubuntu)
Invalid
High
Unassigned
Maverick
Fix Released
Undecided
Unassigned
xserver-xorg-input-evdev (Ubuntu)
Invalid
High
Unassigned
Maverick
Fix Released
Undecided
Unassigned

Bug Description

Justification:
=============
Regression from Ubuntu 10.04, causing an Xserver crash after performing a gesture. The X crash risks losing user data from unsaved documents in applications.

Fix:
===
The cause is a call to WriteToClient in signal context during input event processing. The call was added for gesture event handling, but it appears that something in WriteToClient is unsafe in signal context. Patches have been added to xserver-xorg-input-evdev and xorg-server to queue gesture events in the X server and propagate them to clients in normal context.

Note that the fix is not proposed for Natty because the gesture stack will be removed from Natty and moved to the X client side. The Natty gesture code will likely be removed in the next few weeks during the transition to X server 1.10.

Test case:
=========
Perform multitouch gestures in a gesture-enabled environment. The Unity window manager should suffice. However, this bug is indeterministic. Some users may never see the bug, others may see it multiple times per day.

Regression Potential:
====================
Moderate due to the changes to two interrelated packages and the size of the patches. A depends has been added to xserver-xorg-input-evdev to ensure it is installed with the correlated xorg-server packages. An X crash at the beginning of a gesture would occur if the packages are installed improperly. However, A newer X server with an old evdev input module would not pose any regression potential due to interdependencies.

The size of the patches may cause some concern, but they are the minimal amount required to address the bug in a maintainable manner.

Original Report:
===============
Binary package hint: xorg

I have X crashes a couple of times a day.
Could not reproduce it deterministically.

ProblemType: Bug
DistroRelease: Ubuntu 10.10
Package: xorg 1:7.5+6ubuntu3
ProcVersionSignature: Ubuntu 2.6.35-23.36-generic-pae 2.6.35.7
Uname: Linux 2.6.35-23-generic-pae i686
NonfreeKernelModules: nvidia wl
.proc.driver.nvidia.version:
 NVRM version: NVIDIA UNIX x86 Kernel Module 260.19.06 Mon Sep 13 06:35:06 PDT 2010
 GCC version: gcc version 4.4.5 (Ubuntu/Linaro 4.4.4-14ubuntu5)
Architecture: i386
Date: Tue Nov 2 14:25:08 2010
InstallationMedia: Ubuntu 10.04.1 LTS "Lucid Lynx" - Release i386 (20100816.1)
MachineType: Apple Inc. MacBook5,1
ProcCmdLine: BOOT_IMAGE=/boot/vmlinuz-2.6.35-23-generic-pae root=UUID=14caef4c-d30c-4e34-8e4f-94a6ce5fc47e ro quiet splash
ProcEnviron:
 LANGUAGE=en
 LANG=en_US.utf8
 SHELL=/bin/bash
SourcePackage: xorg
dmi.bios.date: 04/27/09
dmi.bios.vendor: Apple Inc.
dmi.bios.version: MB51.88Z.007D.B03.0904271443
dmi.board.asset.tag: Base Board Asset Tag#
dmi.board.name: Mac-F42D89A9
dmi.board.vendor: Apple Inc.
dmi.board.version: Proto
dmi.chassis.asset.tag: Asset Tag#
dmi.chassis.type: 10
dmi.chassis.vendor: Apple Inc.
dmi.chassis.version: Mac-F42D89A9
dmi.modalias: dmi:bvnAppleInc.:bvrMB51.88Z.007D.B03.0904271443:bd04/27/09:svnAppleInc.:pnMacBook5,1:pvr1.0:rvnAppleInc.:rnMac-F42D89A9:rvrProto:cvnAppleInc.:ct10:cvrMac-F42D89A9:
dmi.product.name: MacBook5,1
dmi.product.version: 1.0
dmi.sys.vendor: Apple Inc.
system:
 distro: Ubuntu
 codename: maverick
 architecture: i686
 kernel: 2.6.35-23-generic-pae

Revision history for this message
Gustavo Luiz Duarte (gustavold) wrote :
Revision history for this message
Chase Douglas (chasedouglas) wrote :

For future reference, the core dump is generated from xserver-xorg-core=2:1.9.0-0ubuntu7.

Revision history for this message
Gustavo Luiz Duarte (gustavold) wrote :

This is the correct core dump.
I had some problems trying to make Xorg generate core dump. I had to start Xorg manually, as describe in lp #666601.

Revision history for this message
Chase Douglas (chasedouglas) wrote :

According to the core dump, the gesture code was doing window hit-testing, but the gesture code runs in a signal handler. If a gesture is started while the window tree is being modified the server may crash.

My related branch above includes a new patch to block the SIGIO signal while the window tree is being modified.

Revision history for this message
Chase Douglas (chasedouglas) wrote :
summary: - Xorg crashes undeterministically
+ Xorg crashes when performing gesture
Revision history for this message
Chase Douglas (chasedouglas) wrote : Re: Xorg crashes when performing gesture

Please disregard the above package in comment #6. It has an incorrect version.

description: updated
Revision history for this message
Chase Douglas (chasedouglas) wrote :
Revision history for this message
Chase Douglas (chasedouglas) wrote :
Revision history for this message
Chase Douglas (chasedouglas) wrote :
Revision history for this message
Gustavo Luiz Duarte (gustavold) wrote :

I'm still having X crashes.
Attached is a core dump against xserver-xorg-core 2:1.9.0-0ubuntu8~test1

bugbot (bugbot)
affects: xorg (Ubuntu) → xorg-server (Ubuntu)
Revision history for this message
Chase Douglas (chasedouglas) wrote :

The backtrace from the core dump shows:

#0 0xb75bcb69 in ?? () from /lib/libc.so.6
#1 0xb75be682 in calloc () from /lib/libc.so.6
#2 0x080abb2c in AllocateOutputBuffer (who=0x978ae48, count=32, __buf=0xbfda255c) at ../../os/io.c:1044
#3 WriteToClient (who=0x978ae48, count=32, __buf=0xbfda255c) at ../../os/io.c:758
#4 0x0806515f in WriteEventsToClient (pClient=0x978ae48, count=1, events=0xbfda255c) at ../../dix/events.c:5705
#5 0x08067b72 in TryClientEvents (client=0x978ae48, dev=0xbfda21e8, pEvents=0xbfda255c, count=1, mask=4325376, filter=131072, grab=0x0)
    at ../../dix/events.c:1953
#6 0x0806ab60 in DeliverEventsToWindow (pDev=0xbfda21e8, pWin=0x98cae40, pEvents=0xbfda255c, count=1, filter=131072, grab=0x0) at ../../dix/events.c:2053
#7 0x0806b728 in DeliverEvents (pWin=0x98cae40, xE=0xbfda255c, count=1, otherParent=0x0) at ../../dix/events.c:2530
#8 0x080987ab in DeleteWindow (value=0x98cae40, wid=88080385) at ../../dix/window.c:943
#9 0x080711ad in FreeClientResources (client=0x9c07a60) at ../../dix/resource.c:859
#10 0x08072bc9 in CloseDownClient (client=0x9c07a60) at ../../dix/dispatch.c:3490
#11 0x08077b65 in Dispatch () at ../../dix/dispatch.c:442
#12 0x080625ba in main (argc=4, argv=0xbfda2794, envp=0xbfda27a8) at ../../dix/main.c:291

The odd thing here is that it segfaulted from AllocateOutputBuffer. The function takes no arguments (it's inlined, so it has args in the backtrace, but it doesn't really) and returns a pointer to freshly allocated data. I also added debugging symbols for calloc from libc6-dbg=2.12.1-0ubuntu10 and everything looks sane in terms of the stack. I think it's unlikely this is a stack corruption based on this data. However, it's even more unlikely, imo, that this is a bug in eglibc. Unfortunately, it's hard to tell what is going on through remote debugging.

Gustavo, can you try to reproduce again and see if the backtrace is the same? If so, it would point the finger towards eglibc. If not, then there really may be some weird stack corruption going on. Maybe something odd is occurring due to input handling in a signal handler?

Revision history for this message
Gustavo Luiz Duarte (gustavold) wrote :

I've linked this bug with two branches with a workaround that has been working for me.

The code in these branches are definitely not suitable for production. I did it in a quick and dirty way as a proof of concept. There are many clear improvements still to be done.

I didn't find exactly where is the bug. My suspects are we are doing something that shouldn't be done from a signal handler, although I don't know exactly what. So I moved the code that sends gestures to the client from the signal handler to the xserver main loop, passing the gestures through the event queue. As it has been working, probably the WriteToClient function is not (or uses a function that is not) async-signal-safe.

Revision history for this message
Henrik Rydberg (rydberg) wrote :

Gustavo, thank you very much for this! Looks very sane. We will continue to process this to see what can be done for Maverick. As you are probably aware, this problem will go away automatically in Natty. Thanks again, very nice.

Revision history for this message
Chase Douglas (chasedouglas) wrote :

Gustavo,

Awesome work! How long have you been using this code, presumably without any crashes? If things are working better, I think we'll probably try to use your work. I think the server patch looks good, and the evdev patch just needs to remove the unused code.

Thanks a bunch!

Revision history for this message
Gustavo Luiz Duarte (gustavold) wrote :

I've been using this code for a week without any crashes.
The crash frequency before this code was around 2 crashes a day.

I will remove the unused code as suggested by Chase. Any other suggestion is welcome.
I know this code is not relevant for Natty, but if you guys intend to deliver it as an update for Maverick I'd gladly help with whatever is needed.

Thank you guys for your help on doing this. I'm very happy now using gestures without X crashes :-D

Revision history for this message
Chase Douglas (chasedouglas) wrote :

Gustavo,

I began reviewing your work more closely to package it up for testing. I realized that there are areas that are a little rough, as you noted. I am in the process of cleaning it up.

As such, don't worry about updating your branches. I'll be proposing for Maverick slightly different patches based on your work.

Thanks again!

Revision history for this message
Chase Douglas (chasedouglas) wrote :

I've updated my branches and pushed new packages to ppa:utouch-team/utouch. Gustavo, could you test the packages out to ensure they work?

Thanks

Changed in xorg-server (Ubuntu):
importance: Undecided → High
Changed in xserver-xorg-input-evdev (Ubuntu):
importance: Undecided → High
Changed in xorg-server (Ubuntu):
status: New → In Progress
Changed in xserver-xorg-input-evdev (Ubuntu):
status: New → In Progress
description: updated
Revision history for this message
Chase Douglas (chasedouglas) wrote :

Bug 687662 is preventing me from getting the xorg-server source package uploaded to the ppa :(. I've been told it should be fixed in a day or two, when I will retry the upload. Until then, industrious users could try building from the source branch I've linked to this bug :).

Revision history for this message
Gustavo Luiz Duarte (gustavold) wrote :

Chase,

I've just installed your packages (building from branch) and they are working fine. I'll report immediately if I have any crash.

The code is also great. I would just add that there is no need for including gestureproto.h from xf86Xinput.c and that evdev should be updated to depend of xserver-xorg-core (>> 2:1.9.0-0ubuntu8).

Thank you!

Revision history for this message
Chase Douglas (chasedouglas) wrote :

After discussion with Bryce and others on #ubuntu-devel, I've rebased the xorg-server branch on top of the package in maverick-proposed. I also fixed the xserver-xorg-input-evdev depends for the correct xorg-server version.

description: updated
description: updated
tags: added: regression-release
Revision history for this message
Martin Pitt (pitti) wrote :

Please fix this in natty first. Also, it seems that something went wrong with building the maverick-proposed version:
http://launchpadlibrarian.net/60607373/xorg-server_2%3A1.9.0-0ubuntu7.1_2%3A1.9.0-0ubuntu7.2.diff.gz

Revision history for this message
Martin Pitt (pitti) wrote :

Holding this back for now, as the current maverick-proposed version of xorg-server is v-failed (bug 650539) due to a regression (bug 680811).

Also, I need to reject this again, as the new upload again introduces a huge and unrelated diff to upstream files, as I already pointed out in the previous comment. Please check your package with debdiff before uploading.

Revision history for this message
Chase Douglas (chasedouglas) wrote :

I'll need to follow up with the ubuntu-x team for most of these issues, but this issue will not be fixed in Natty because the code affected will be removed in a few weeks.

Revision history for this message
Bryce Harrington (bryce) wrote :

I've re-uploaded the package after doublechecking the debdiff. I think the stray files were my fault when I checked out cnd's branch into my working tree rather than a clean one.

Revision history for this message
Martin Pitt (pitti) wrote :

Diff looks good now, thanks. Still holding back, see comment 23. Should we reject the current maverick-proposed version if this is urgent?

Revision history for this message
Bryce Harrington (bryce) wrote :

> Diff looks good now, thanks. Still holding back, see comment 23. Should
> we reject the current maverick-proposed version if this is urgent?

Given that this has some regression potential I think it needs to live in -proposed for a bit to get adequately tested.

Regarding the current maverick-proposed version, the only known regression seems to be fairly modest (behavioral change in how maximization works), and the issue that it fixes (x server crashes) seems to be pretty severe, and is getting confirmed by a number of people. The behavior regression has only been reported by one person, who isn't running stock packages (although he says it happens there too, but solid evidence is missing).

I'd be tempted to allow the current proposal through. The benefit seems to far outweigh the risk in this case. However, if you're not comfortable with that, then yeah reject it out of proposed and put this one in instead. We can then sharpen our pencils on seeing if the alleged regression can be independently confirmed.

Revision history for this message
Martin Pitt (pitti) wrote :

There are now two xorg-server uploads in the proposed queue, for bug 575465 and bug 670016. Please either reupload 7.3 with an appropriate -v to include the two proposed versions, or better yet, just upload a 2:1.9.0-0ubuntu7.2 which has both changes? Also, there's still 7.1 in proposed which causes the regression in bug 680811. If we stack two more uploads on top of that, it'll quickly get unmanageable.

Revision history for this message
Bryce Harrington (bryce) wrote : Re: [Bug 670016] Re: Xorg crashes when performing gesture

On Fri, Jan 07, 2011 at 08:57:23AM -0000, Martin Pitt wrote:
> There are now two xorg-server uploads in the proposed queue, for bug
> 575465 and bug 670016. Please either reupload 7.3 with an appropriate -v
> to include the two proposed versions, or better yet, just upload a
> 2:1.9.0-0ubuntu7.2 which has both changes?

Done

> Also, there's still 7.1 in
> proposed which causes the regression in bug 680811. If we stack two more
> uploads on top of that, it'll quickly get unmanageable.

Read my previous comment...
https://bugs.launchpad.net/ubuntu/+source/xorg-server/+bug/670016/comments/27

The 7.1 change looks fine to me despite the alleged regression.
Either push it through or kick it out, but lets not bottleneck xserver
SRUs on it.

Revision history for this message
Martin Pitt (pitti) wrote : Please test proposed package

Accepted xorg-server into maverick-proposed, the package will build now and be available in a few hours. Please test and give feedback here. See https://wiki.ubuntu.com/Testing/EnableProposed for documentation how to enable and use -proposed. Thank you in advance!

Changed in xorg-server (Ubuntu Maverick):
status: New → Fix Committed
tags: added: verification-needed
Revision history for this message
Bryce Harrington (bryce) wrote :

As per the bug description the patch being proposed is for maverick only, there isn't really an associated task for natty. (For natty, once xserver 1.10 goes in, there'll be entirely new code to provide the feature so this patch isn't relevant there.) Closing the natty task, leaving a maverick task.

summary: - Xorg crashes when performing gesture
+ [SRU] Xorg crashes when performing gesture
Changed in xserver-xorg-input-evdev (Ubuntu):
status: In Progress → Invalid
Revision history for this message
Bryce Harrington (bryce) wrote :

Uploaded

Changed in xserver-xorg-input-evdev (Ubuntu Maverick):
status: New → Fix Committed
Revision history for this message
Martin Pitt (pitti) wrote :

Accepted xserver-xorg-input-evdev into maverick-proposed, the package will build now and be available in a few hours. Please test and give feedback here. See https://wiki.ubuntu.com/Testing/EnableProposed for documentation how to enable and use -proposed. Thank you in advance!

Revision history for this message
Mathieu Marquer (slasher-fun) wrote :

Not sure whether this is because of this patch, but since the last version of xserver-xorg-input-evdev in -proposed, left click will sometimes (about 2% of clicks maybe) act as a middle click (meaning a text will be pasted when selecting a textarea, or a tab will be closed in a browser when selecting it)...

Revision history for this message
Mathieu Marquer (slasher-fun) wrote :

Still there after reverting to the previous version, so that's not related to this patch. Still looking for what's causing this bug anyway.

Revision history for this message
Gustavo Luiz Duarte (gustavold) wrote :

I've been using the packages uploaded to maverick-proposed and they work great. I will report here if I have any issues.

tags: added: verification-done
removed: verification-needed
Revision history for this message
Launchpad Janitor (janitor) wrote :

This bug was fixed in the package xorg-server - 2:1.9.0-0ubuntu7.3

---------------
xorg-server (2:1.9.0-0ubuntu7.3) maverick-proposed; urgency=low

  * debian/patches/209_xorg-xi-usb-wireless-mouse-kb.patch:
    - Backport of upstream patch which swaps order of detection of a
      device as pointer or keyboard. Fixes certain mouse models that
      get detected as keyboards.
      (LP: #575465)

xorg-server (2:1.9.0-0ubuntu7.2) maverick-proposed; urgency=low

  * Move gesture event handling to server in non-signal context
    (LP: #670016)
    - Add debian/patches/208_gestures_through_event_queue.patch
 -- Bryce Harrington <email address hidden> Thu, 06 Jan 2011 15:07:12 -0800

Changed in xorg-server (Ubuntu Maverick):
status: Fix Committed → Fix Released
Revision history for this message
Launchpad Janitor (janitor) wrote :

This bug was fixed in the package xserver-xorg-input-evdev - 1:2.3.2-6ubuntu3.1

---------------
xserver-xorg-input-evdev (1:2.3.2-6ubuntu3.1) maverick-proposed; urgency=low

  * Add patch from Gustavo Luiz Duarte to enqueue gesture events for server-side
    processing.
    - Fixes X crash due to sending events to clients in signal context
      (LP: #670016)
 -- Chase Douglas <email address hidden> Mon, 06 Dec 2010 13:34:49 -0800

Changed in xserver-xorg-input-evdev (Ubuntu Maverick):
status: Fix Committed → Fix Released
Changed in xorg-server (Ubuntu):
status: In Progress → Invalid
To post a comment you must log in.