Xorg crashed with SIGABRT in memcpy() via cirRefreshArea() under KVM virtual machine

Bug #1043513 reported by Till Kamppeter
This bug affects 7 people
Affects: xserver-xorg-video-cirrus (Ubuntu)
Status: Fix Released
Importance: Medium
Assigned to: Maarten Lankhorst
Nominated for Quantal by Maarten Lankhorst

Bug Description

[Impact]
 * Fixes an out-of-bounds memcpy in the shadowfb refresh path when the refresh box lies outside the screen, in a similar way to other ddx drivers.

[Test Case]
 * Start virt-manager, create quantal-amd64 vm, make sure cirrus is used as video driver
 * Perform a fresh install of Quantal 64bit (I used an iso image)
   - I used default options within the install
 * log in
 * start a terminal via unity (I did that by searching for terminal in unity)
 * ctrl-alt-down (to switch virtual desktop in the VM)
  <X crashes and returns you to lightdm login>

[Regression Potential]
 * Low; changes are limited to the shadowfb code paths. Since the fix only limits the width/height of the memcpy calls performed and nothing else, I expect that in the worst case the bug is not fixed, but that nothing is made worse. Still, I'll keep watching cirrus bug reports to see whether the fix has introduced any new ones.

[Other Info]
 * I upstreamed the bug fix and did a new release for cirrus. Raring already has the bug fixed, no new bug reports have popped up yet about it.

[Original bug report]
No login possible on KVM-based virtual machine (with virt-manager) with the following network settings:

Source device: Host device eth2 : macvtap
Device model: virtio
Source mode: VEPA

With source mode set to "Default" it works.

ProblemType: Crash
DistroRelease: Ubuntu 12.10
Package: xserver-xorg-core 2:1.12.99.905-0ubuntu3
ProcVersionSignature: Ubuntu 3.5.0-13.13-generic 3.5.3
Uname: Linux 3.5.0-13-generic x86_64
ApportVersion: 2.5.1-0ubuntu3
Architecture: amd64
CrashCounter: 1
CurrentDmesg:
 [ 3.809292] init: plymouth-stop pre-start process (1197) terminated with status 1
 [ 5.314446] hda-intel: Invalid position buffer, using LPIB read method instead.
 [ 9.269441] hda-intel: IRQ timing workaround is activated for card #0. Suggest a bigger bdl_pos_adj.
Date: Wed Aug 29 22:04:57 2012
DistUpgraded: Fresh install
DistroCodename: quantal
DistroVariant: ubuntu
ExecutablePath: /usr/bin/Xorg
GraphicsCard:
 Cirrus Logic GD 5446 [1013:00b8] (prog-if 00 [VGA controller])
   Subsystem: Red Hat, Inc Device [1af4:1100]
InstallationMedia: Ubuntu 11.10 "Oneiric Ocelot" - Release amd64 (20111012)
Lsusb: Bus 001 Device 001: ID 1d6b:0001 Linux Foundation 1.1 root hub
MachineType: Bochs Bochs
ProcCmdline: /usr/bin/X :0 -core -auth /var/run/lightdm/root/:0 -nolisten tcp vt7 -novtswitch
ProcEnviron:

ProcKernelCmdLine: BOOT_IMAGE=/boot/vmlinuz-3.5.0-13-generic root=UUID=2258d315-c1f3-4f2c-b925-01ab5cdf448d ro quiet splash vt.handoff=7
Signal: 6
SourcePackage: xorg-server
StacktraceTop:
 ?? () from /lib/x86_64-linux-gnu/libc.so.6
 cirRefreshArea () from /usr/lib/xorg/modules/drivers/cirrus_drv.so
 ?? () from /usr/lib/xorg/modules/libshadowfb.so
 ?? ()
 ?? ()
Title: Xorg crashed with SIGABRT in cirRefreshArea()
UpgradeStatus: No upgrade log present (probably fresh install)
UserGroups:

dmi.bios.date: 01/01/2007
dmi.bios.vendor: Bochs
dmi.bios.version: Bochs
dmi.chassis.type: 1
dmi.chassis.vendor: Bochs
dmi.modalias: dmi:bvnBochs:bvrBochs:bd01/01/2007:svnBochs:pnBochs:pvr:cvnBochs:ct1:cvr:
dmi.product.name: Bochs
dmi.sys.vendor: Bochs
version.compiz: compiz 1:0.9.8+bzr3319-0ubuntu3
version.ia32-libs: ia32-libs N/A
version.libdrm2: libdrm2 2.4.38-0ubuntu2
version.libgl1-mesa-dri: libgl1-mesa-dri 9.0~git20120821.c1114c61-0ubuntu2
version.libgl1-mesa-dri-experimental: libgl1-mesa-dri-experimental N/A
version.libgl1-mesa-glx: libgl1-mesa-glx 9.0~git20120821.c1114c61-0ubuntu2
version.xserver-xorg-core: xserver-xorg-core 2:1.12.99.905-0ubuntu3
version.xserver-xorg-input-evdev: xserver-xorg-input-evdev 1:2.7.3-0ubuntu1
version.xserver-xorg-video-ati: xserver-xorg-video-ati 1:6.99.99~git20120713.6ef1ad6a-0ubuntu1
version.xserver-xorg-video-intel: xserver-xorg-video-intel 2:2.20.3-0ubuntu1
version.xserver-xorg-video-nouveau: xserver-xorg-video-nouveau 1:1.0.1-4~ubuntu1

Revision history for this message
Till Kamppeter (till-kamppeter) wrote :
visibility: private → public
Apport retracing service (apport) wrote :

StacktraceTop:
 memcpy (__len=3, __src=0x7fa49158eee4, __dest=0x7fa490152ed4) at /usr/include/x86_64-linux-gnu/bits/string3.h:52
 cirRefreshArea (pScrn=<optimized out>, num=<optimized out>, pbox=0x7fff66a939a0) at ../../src/cir_shadow.c:36
 ShadowCopyArea (pSrc=0x7fa496d29c20, pDst=0x7fa496cc0740, pGC=0x7fa496c1a9a0, srcx=<optimized out>, srcy=<optimized out>, width=<optimized out>, height=1, dstx=0, dsty=0) at ../../../../hw/xfree86/shadowfb/shadow.c:618
 ProcCopyArea (client=0x7fa4965680f0) at ../../dix/dispatch.c:1622
 Dispatch () at ../../dix/dispatch.c:428

Apport retracing service (apport) wrote : Stacktrace.txt
Apport retracing service (apport) wrote : StacktraceSource.txt
Apport retracing service (apport) wrote : ThreadStacktrace.txt
Changed in xorg-server (Ubuntu):
importance: Undecided → Medium
summary: - Xorg crashed with SIGABRT in cirRefreshArea()
+ Xorg crashed with SIGABRT in memcpy()
tags: removed: need-amd64-retrace
Bryce Harrington (bryce)
summary: - Xorg crashed with SIGABRT in memcpy()
+ Xorg crashed with SIGABRT in memcpy() via cirRefreshArea()
summary: - Xorg crashed with SIGABRT in memcpy() via cirRefreshArea()
+ Xorg crashed with SIGABRT in memcpy() via cirRefreshArea() under KVM
+ virtual machine
Launchpad Janitor (janitor) wrote :

Status changed to 'Confirmed' because the bug affects multiple users.

Changed in xorg-server (Ubuntu):
status: New → Confirmed
Paul Larson (pwlars) wrote :

Booting freshly installed quantal beta1 amd64 desktop under kvm (20120903.4)
As soon as I login, it kicks me back out to the graphical login. This was seen on the boot right after installation. If I manually reboot the system after that (i.e. shutdown kvm, properly start it back up without the -cdrom image specified), I am able to login.

Ubuntu QA Website (ubuntuqa) wrote :

This bug has been reported on the Ubuntu ISO testing tracker.

A list of all reports related to this bug can be found here:
http://iso.qa.ubuntu.com/qatracker/reports/bugs/1043513

tags: added: iso-testing
Stefan Bader (smb) wrote :

The X crash happens when using the cirrus X driver. For KVM there is a new modeset driver in Quantal which should avoid this, but there are currently races which prevent it from loading properly (see bug #1038055). However, the same problem hits Xen HVM guests, and for those (and likely all real hardware using that chip, if any still exists) it would be good to fix the cirrus X driver to work with the software-emulated acceleration. Previously we would run unity2d and be OK.

Dave Gilbert (ubuntu-treblig) wrote :

Looks like 1048304 is another dupe of this?
1056511 looks like a very similar backtrace to me as well, even though the driver at the last stage is nouveau rather than cirrus, the rest of it looks the same.

As of today's Quantal update it's stable enough to allow me to log in and get a few minutes of usage before failing.

Dave

Stefan Bader (smb) wrote :

Things also became better with the Cirrus driver and VMs. After being able to log in I noticed that my KVM VM (which runs on a rather power-efficient, i.e. not too fast, machine) still has occasional crashes of this type. On that machine you can see (and have to wait for) every graphical "goodness" like fading in and out or docking. The host on which I run the Xen VM (which uses the same cirrus emulation), by contrast, runs rather smoothly (because it has more CPU power), and I don't think I have seen any crash there yet.
That somehow leads me to believe this could be a problem (in the cirrus case) of the communication between the X driver, llvmpipe and the emulated card being overrun in some way.
Not sure how this relates to the nouveau case. I would not think the X driver could be too fast for the 3D pipe in hardware. But maybe this gives X driver people a hint.

Timo Aaltonen (tjaalton)
affects: xorg-server (Ubuntu) → xserver-xorg-video-cirrus (Ubuntu)
Dave Gilbert (ubuntu-treblig) wrote :

Timo: As mentioned in #10, it's worth checking bug 1056511 as well, which is nouveau based; the backtrace looks similar enough to me to make me wonder if it's not driver specific.

Bryce Harrington (bryce) wrote :

Bug 1056511 has a proposed fix posted to it; if you suspect this might be a dupe of that bug then it would be worthwhile to test that patch.

Bug 1053702 (and bug 1045845) is another recent cirrus crash that was fixed about a month ago. If you've not reproduced the crash in October it is possible that was the cause of the problems.

In either case, please set this as a dupe accordingly. If the crash still occurs with latest quantal please post a fresh backtrace - see http://wiki.ubuntu.com/X/Backtracing for guidance.

Changed in xserver-xorg-video-cirrus (Ubuntu):
status: Confirmed → Incomplete
Dave Gilbert (ubuntu-treblig) wrote :

Hi Bryce,
  See backtrace attached; just generated, on an up to date Quantal:

ii xserver-xorg-video-cirrus 1:1.5.1-0ubuntu2 amd64 X.Org X server -- Cirrus display driver

Looks the same one to me; it's a lot harder to hit this than it was a few weeks ago - this survived an hour of web browsing (mostly LP triaging) last night, but managed to trigger in a few minutes this morning - so it's rather difficult to know when we have actually killed this bug. (kvm guest on quantal host)

Dave

Dave Gilbert (ubuntu-treblig) wrote :

actually, I seem to have found a reasonably repeatable sequence:
  1) Login
  2) open a gnome-terminal
  3) Walk through the 4 workspaces with ctrl-alt-arrows, within 2 or 3 times of doing that reasonably quickly it blows up.

Dave Gilbert (ubuntu-treblig) wrote :

Some more diags;

(gdb) p *pbox
$2 = {x1 = -958, y1 = -716, x2 = -236, y2 = -282}

I'm not sure what coordinate space these values are supposed to be in; if they're supposed to be -ve then they're in a sensible range as far as I can tell - but are they supposed to be -ve?

(gdb) p pPriv->pScrn->virtualX
$7 = 1024
(gdb) p pPriv->pScrn->virtualY
$8 = 768

OK - seems right

(gdb) p src
$33 = (unsigned char *) 0x7f0a78505cd6 ""

7f0a784e0000-7f0a78506000 r-xp 00000000 fd:01 660179 /lib/x86_64-linux-gnu/libexpat.so.1.6.0
7f0a78506000-7f0a78706000 ---p 00026000 fd:01 660179 /lib/x86_64-linux-gnu/libexpat.so.1.6.0

(gdb) p/x width
$37 = 0x876
(gdb) p/x src+width
$38 = 0x7f0a7850654c

Well that's why it's crashed - the src pointer is in the middle of expat and ends up running into the unreadable bit

(gdb) p dst
$34 = (unsigned char *) 0x7f0a770c8cc6 ""
Map entry: 7f0a76feb000-7f0a772cc000 rw-p 00000000 00:00 0

(gdb) p/x pCir->ShadowPtr
$30 = 0x7f0a78709010
(gdb) p/x pCir->FbBase
$31 = 0x7f0a772cc000
(gdb) p pCir->ShadowPtr-src
$24 = 2110266
(gdb) p pCir->FbBase-dst
$25 = 2110266
(gdb) p FBPitch
$26 = 3072

2110266/3072
686.9355468750
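
The figures above fit the usual shadow-framebuffer address arithmetic, offset = y * pitch + x * Bpp. The following is only a sketch of that arithmetic, not code from the driver: it assumes 3 bytes per pixel (inferred from FBPitch 3072 on a 1024-pixel-wide screen), and box_offset and row_bytes are hypothetical helper names.

```c
#include <assert.h>

/* Hypothetical helper: byte offset of pixel (x, y) from the start of a
 * buffer, given its pitch (bytes per row) and bytes per pixel. */
static long box_offset(int x, int y, int pitch, int bpp)
{
    assert(pitch > 0 && bpp > 0);
    return (long)y * pitch + (long)x * bpp;
}

/* Hypothetical helper: bytes copied per row for a box spanning x1..x2. */
static long row_bytes(int x1, int x2, int bpp)
{
    assert(bpp > 0);
    return (long)(x2 - x1) * bpp;
}
```

With these, box_offset(-958, -686, 3072, 3) is -2110266, the same distance as ShadowPtr-src, which suggests the copy started at y1 = -716 and had advanced 30 rows before running into the unreadable page. row_bytes(-958, -236, 3) is 2166, i.e. the 0x876 width printed by gdb.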

(P.S. as per previous instructions, try working through the 4 workspaces both clockwise and anti-clockwise)

Changed in xserver-xorg-video-cirrus (Ubuntu):
status: Incomplete → Confirmed
Dave Gilbert (ubuntu-treblig) wrote :

Some random extra debug (from a separate run):
(gdb) up
#3 0x00007f0485ef1af3 in ShadowCopyArea (pSrc=0x7f048ba948f0, pDst=0x7f048c4ca910, pGC=0x7f048c3def70,
    srcx=<optimised out>, srcy=<optimised out>, width=<optimised out>, height=434, dstx=0, dsty=0)
    at ../../../../hw/xfree86/shadowfb/shadow.c:618
618 ../../../../hw/xfree86/shadowfb/shadow.c: No such file or directory.

(gdb) p pDst
$1 = (DrawablePtr) 0x7f048c4ca910
(gdb) p *pDst
$2 = {type = 0 '\000', class = 1 '\001', depth = 32 ' ', bitsPerPixel = 32 ' ', id = 56623110, x = -958,
  y = -716, width = 722, height = 434, pScreen = 0x7f048b738810, serialNumber = 2538}
(gdb) p/x *pDst
$3 = {type = 0x0, class = 0x1, depth = 0x20, bitsPerPixel = 0x20, id = 0x3600006, x = 0xfc42, y = 0xfd34,
  width = 0x2d2, height = 0x1b2, pScreen = 0x7f048b738810, serialNumber = 0x9ea}
(gdb) p pGC
$4 = (GC *) 0x7f048c3def70
(gdb) p pGC->pCompositeClip
$5 = (RegionPtr) 0x7f048c4ca960
(gdb) p *pGC->pCompositeClip
$6 = {extents = {x1 = -958, y1 = -716, x2 = -236, y2 = -282}, data = 0x0}
(gdb) p *(WindowPtr)pDst
$7 = {drawable = {type = 0 '\000', class = 1 '\001', depth = 32 ' ', bitsPerPixel = 32 ' ', id = 56623110,
    x = -958, y = -716, width = 722, height = 434, pScreen = 0x7f048b738810, serialNumber = 2538},
  devPrivates = 0x7f048c4ca9e0, parent = 0x7f048c4c9b80, nextSib = 0x0, prevSib = 0x0,
  firstChild = 0x7f048c4cade0, lastChild = 0x7f048c4cade0, clipList = {extents = {x1 = -958, y1 = -716,
      x2 = -236, y2 = -282}, data = 0x0}, borderClip = {extents = {x1 = -958, y1 = -716, x2 = -236,
      y2 = -282}, data = 0x0}, valdata = 0x0, winSize = {extents = {x1 = -958, y1 = -716, x2 = -236,
      y2 = -282}, data = 0x0}, borderSize = {extents = {x1 = -958, y1 = -716, x2 = -236, y2 = -282},
    data = 0x0}, origin = {x = 0, y = 0}, borderWidth = 0, deliverableEvents = 32799, eventMask = 4423680,
  background = {pixmap = 0xfff2f1f0, pixel = 4294111728}, border = {pixmap = 0x0, pixel = 0},
  backStorage = 0x0, optional = 0x7f048c4caa20, backgroundState = 2, borderIsPixel = 1, cursorIsNone = 1,
  backingStore = 0, saveUnder = 0, DIXsaveUnder = 0, bitGravity = 1, winGravity = 1, overrideRedirect = 0,
  visibility = 0, mapped = 1, realized = 1, viewable = 1, dontPropagate = 0, forcedBS = 0, redirectDraw = 0,
  forcedBG = 0, damagedDescendants = 0, inhibitBGPaint = 0}

(Is the depth/bitsPerPixel correct? I thought this was running at 24bpp - maybe this is some intermediate).
Note how almost every x/y value there is -ve.
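
As a cross-check of the two dumps above, the hex fields can be read back as signed 16-bit coordinates. This is a small sketch with a hypothetical helper name (as_signed16), assuming the x/y fields are 16-bit two's complement values as the p/x output suggests.

```c
#include <assert.h>

/* Hypothetical helper: reinterpret a 16-bit field, as printed by p/x,
 * as the signed coordinate it encodes (two's complement). */
static int as_signed16(unsigned raw)
{
    assert(raw <= 0xFFFFu);
    return raw >= 0x8000u ? (int)raw - 0x10000 : (int)raw;
}
```

Here 0xfc42 reads back as -958 and 0xfd34 as -716, and adding the width 0x2d2 (722) and height 0x1b2 (434) gives -236 and -282, the x2/y2 of the clip extents.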

Dave Gilbert (ubuntu-treblig) wrote :

To simplify my instructions from comment #15:
  1) Open a terminal (sits in top left)
  2) Ctrl-alt-down arrow to move down one virtual desktop

It blows up at (2).

(pScrn->virtual X/y in cirRefreshArea is 1024/768)

Dave Gilbert (ubuntu-treblig) wrote :

Confirmed still happening in Raring (That's a raring guest with Quantal host) as of today's install.

Dave Gilbert (ubuntu-treblig) wrote :

Hi,
  Attached is a patch that seems to fix this; however it needs looking at by someone who understands the code; in particular I have some questions:

  1) Does this also need to go in cirRefreshArea8/16/24/32 ?
  2) Why is the generic one being called if there are those specialised ones anyway?
  3) Is there something that will clip a BoxRec rather than having to use my func?

This was tested on 1.5.1-0ubuntu2 on raring.

Ubuntu Foundations Team Bug Bot (crichton) wrote :

The attachment "Clip the range in cirRefreshArea" of this bug report has been identified as being a patch. The ubuntu-reviewers team has been subscribed to the bug report so that they can review the patch. In the event that this is in fact not a patch you can resolve this situation by removing the tag 'patch' from the bug report and editing the attachment so that it is not flagged as a patch. Additionally, if you are member of the ubuntu-reviewers team please also unsubscribe the team from this bug report.

[This is an automated message performed by a Launchpad user owned by Brian Murray. Please contact him regarding any issues with the action taken in this bug report.]

tags: added: patch
Timo Aaltonen (tjaalton)
tags: added: raring
Maarten Lankhorst (mlankhorst) wrote :

Well, since nobody was forthcoming I've taken a look. The reason those specialized versions exist is that they're used for rotations.

I think your patch is slightly overdesigned; could you check whether this patch works for you?

I also fixed up the other rotated versions

Changed in xserver-xorg-video-cirrus (Ubuntu):
status: Confirmed → Incomplete
Dave Gilbert (ubuntu-treblig) wrote :
Dave Gilbert (ubuntu-treblig) wrote :
Dave Gilbert (ubuntu-treblig) wrote :

Hi Maarten,
  That patch isn't happy; but I'm not sure why yet; I've attached three screen captures:
    1) The version with the ubuntu built package - which shows the strange crosshatching from bug 1080674
    2) myversion.png - the version with my patch, looks the same as (1) but doesn't crash when I move around my virtual desktops
    3) lankhostversion.png (sorry - I typo'd your name !) - the version with your patch applied to a clean 1:1.5.1-0ubuntu2 built on raring.

So erm; that's odd; your version is very broken somehow (and also doesn't update as I type properly I think).
(Although it does fix bug 1080674 which I couldn't complain about!)

I can't immediately see a problem in your code; maybe the specialised versions need the min/max/x/y swapping
  if pCir->rotate is set?

As for mine being over-engineered, obviously there is some personal preference in there; but I was working on the basis that if it was needed in all 4 versions then a function was a better bet. As for shortClip vs. MIN/MAX, I tend to prefer functions, and anyway the compiler should do what it thinks is best; but as I say, that's personal preference.
Dave

Changed in xserver-xorg-video-cirrus (Ubuntu):
status: Incomplete → Triaged
Dave Gilbert (ubuntu-treblig) wrote :

Hang on, scrap that analysis - I noticed the logs show it's not loading the rebuilt cirrus module with that change; It's now showing an EABI mismatch error. I'll get back to you.

Dave

Changed in xserver-xorg-video-cirrus (Ubuntu):
status: Triaged → Incomplete
Dave Gilbert (ubuntu-treblig) wrote :

and to compound things, the kernel driver is now working on raring, so I can't load this one to test it; let me get back to you next weekend.

Dave

Maarten Lankhorst (mlankhorst) wrote :

You can disable the cirrus kernel driver in raring by blacklisting it in /etc/modprobe.d/blacklist.conf or temporarily removing it from /lib/modules

Bryce Harrington (bryce)
Changed in xserver-xorg-video-cirrus (Ubuntu):
status: Incomplete → New
Bryce Harrington (bryce)
Changed in xserver-xorg-video-cirrus (Ubuntu):
status: New → Incomplete
Dave Gilbert (ubuntu-treblig) wrote :

Maarten:
  OK, it still crashes, and the reason is that your MIN/MAX aren't sufficient:

cirRefreshArea: pbox: (-958,52 / -236,486) clipped: (0,52 / -236,486) pScrn->vX/Y 1024,768 rotate=0

you need to use both MIN and MAX on each coordinate to cope with the box being completely off one side of the screen.
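
The difference can be sketched as follows. This is an illustration only, not either of the attached patches: Box is a stand-in for BoxRec, and clamp, clip_one_sided and clip_two_sided are hypothetical names.

```c
#include <assert.h>

/* Stand-in for the X server's BoxRec; not the real definition. */
typedef struct { int x1, y1, x2, y2; } Box;

static int clamp(int v, int lo, int hi)
{
    assert(lo <= hi);
    return v < lo ? lo : (v > hi ? hi : v);
}

/* One-sided clip: lower bound on the top-left corner, upper bound on the
 * bottom-right corner only. */
static Box clip_one_sided(Box b, int w, int h)
{
    if (b.x1 < 0) b.x1 = 0;
    if (b.y1 < 0) b.y1 = 0;
    if (b.x2 > w) b.x2 = w;
    if (b.y2 > h) b.y2 = h;
    return b;
}

/* Two-sided clip: every coordinate is forced into the screen range, so a
 * box entirely off one side collapses to zero width or height. */
static Box clip_two_sided(Box b, int w, int h)
{
    b.x1 = clamp(b.x1, 0, w);
    b.x2 = clamp(b.x2, 0, w);
    b.y1 = clamp(b.y1, 0, h);
    b.y2 = clamp(b.y2, 0, h);
    return b;
}
```

For the box in the log line, (-958,52 / -236,486) on a 1024x768 screen, the one-sided clip yields (0,52 / -236,486), whose width is still negative, while the two-sided clip yields (0,52 / 0,486), an empty box that results in a zero-length copy.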

Still convinced my version is over-engineered?

Dave

Changed in xserver-xorg-video-cirrus (Ubuntu):
status: Incomplete → Triaged
Maarten Lankhorst (mlankhorst) wrote :

OK, I added an if (width <= 0 || height <= 0) continue; to cope with this, similar to other drivers. Can you retest?
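
A self-contained sketch of that approach, for illustration only: it is not the patch attached to this bug, and refresh_area, Box, imax and imin are hypothetical names. It assumes shadow and framebuffer share the same pitch, which the real driver does not guarantee.

```c
#include <assert.h>
#include <string.h>

/* Stand-in for the X server's BoxRec; not the real definition. */
typedef struct { int x1, y1, x2, y2; } Box;

static int imax(int a, int b) { return a > b ? a : b; }
static int imin(int a, int b) { return a < b ? a : b; }

/* Copy each box from shadow to fb; both buffers hold screen_w * screen_h
 * pixels of bpp bytes. Returns the number of boxes actually copied. */
static int refresh_area(unsigned char *fb, const unsigned char *shadow,
                        int screen_w, int screen_h, int bpp,
                        int num, const Box *pbox)
{
    int pitch = screen_w * bpp;
    int copied = 0;

    assert(screen_w > 0 && screen_h > 0 && bpp > 0);

    for (int i = 0; i < num; i++) {
        /* Clip top-left up to 0 and bottom-right down to the screen size. */
        int x1 = imax(pbox[i].x1, 0);
        int y1 = imax(pbox[i].y1, 0);
        int x2 = imin(pbox[i].x2, screen_w);
        int y2 = imin(pbox[i].y2, screen_h);
        int width = (x2 - x1) * bpp;
        int height = y2 - y1;

        /* A box entirely off screen ends up empty or inverted: skip it. */
        if (width <= 0 || height <= 0)
            continue;

        const unsigned char *src = shadow + y1 * pitch + x1 * bpp;
        unsigned char *dst = fb + y1 * pitch + x1 * bpp;
        while (height--) {
            memcpy(dst, src, (size_t)width);
            dst += pitch;
            src += pitch;
        }
        copied++;
    }
    return copied;
}
```

Writing the loop with an index means the continue cannot skip advancing to the next box, which is easy to get wrong in a while (num--) loop that increments pbox at the bottom.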

Changed in xserver-xorg-video-cirrus (Ubuntu):
assignee: nobody → Maarten Lankhorst (mlankhorst)
status: Triaged → In Progress
Dave Gilbert (ubuntu-treblig) wrote :

Thanks Maarten, that looks good; I guess with cirrus module loaded it's probably not actually that important on a raring release; but it's good to fix anyway, and I'd say backport to Quantal.

Maarten Lankhorst (mlankhorst) wrote :

should be fixed in xf86-video-cirrus 1.5.2 then. I just pushed this upstream, and will do a release to raring shortly.

Launchpad Janitor (janitor) wrote :

This bug was fixed in the package xserver-xorg-video-cirrus - 1:1.5.2-0ubuntu1

---------------
xserver-xorg-video-cirrus (1:1.5.2-0ubuntu1) raring; urgency=low

  * Sync from unreleased debian git.
    - Upstream release fixes cirRefreshArea SEGV (LP: #1043513)
  * Drop fix-fallback.diff, upstreamed.
 -- Maarten Lankhorst <email address hidden> Tue, 08 Jan 2013 11:21:32 +0100

Changed in xserver-xorg-video-cirrus (Ubuntu):
status: In Progress → Fix Released
Dave Gilbert (ubuntu-treblig) wrote :

Thanks Maarten; that does seem to have nailed it in Raring; SRU for Quantal?

Maarten Lankhorst (mlankhorst) wrote :

Can you reproduce this on precise too? Might be worth it to have it there as well..

description: updated
Maarten Lankhorst (mlankhorst) wrote :

Anyway if you can fill out the testcase part, I should be able to get it sru'd for quantal, and if that works maybe precise as well.

Dave Gilbert (ubuntu-treblig) wrote :

Maarten: Filled in the test case stuff (what did you mean about eth2?)
I wonder if this should be marked security; we know it's scribbling over bits of the X server, which is running as root, although I don't know how to control what is scribbled where.

(I'll get precise tested soon; just as soon as I convince LVM to allocate the disk space...)

description: updated
Dave Gilbert (ubuntu-treblig) wrote :

Precise won't get Unity 3D up (even with some forcing); so I'll say not repeatable.
[If as per previous question it's a security issue though that may need some more looking at]
