Mir

Flickering showing stale buffers on Krillin

Bug #1444047 reported by Alexandros Frantzis
Affects        Status         Importance   Assigned to    Milestone
Mir            Fix Released   High         Kevin DuBois
mir (Ubuntu)   Fix Released   Undecided    Unassigned

Bug Description

Occasionally after a restart I get flickering showing stale buffers on Krillin with the latest vivid-proposed image (mir 0.12.1).

A video showing the problem: http://people.ubuntu.com/~afrantzis/krillin-strange-flicker.mkv

The HWC log indicates that we somehow end up with an unexpected buffer when compositing:

set list():
 # | handle
 0 | 0x3f4770 <====== This buffer is the odd one out
 1 | 0x3f4188

...

set list():
 # | handle
 0 | 0xac202978
 1 | 0x3f4188

...

set list():
 # | handle
 0 | 0xac203f60
 1 | 0x3f4188

Daniel van Vugt (vanvugt) wrote :

Could be related to stale frame bug 1270245 that also affects krillin.

Daniel van Vugt (vanvugt) wrote :

Is this with standard images or built yourself? If standard (Mir 0.12) then I'd say this is probably related to bug 1270245.

Changed in mir:
status: New → Incomplete
Alexandros Frantzis (afrantzis) wrote :

> Is this with standard images or built yourself? If standard (Mir 0.12) then I'd say this is probably related to bug 1270245.

This is with 0.12.1 (latest vivid-proposed image).

Changed in mir:
status: Incomplete → New
description: updated
Changed in mir:
status: New → In Progress
Changed in mir:
milestone: none → 0.13.0
Alexandros Frantzis (afrantzis) wrote :

A much easier way to reproduce the problem:

https://code.launchpad.net/~afrantzis/mir/reproduce-1444047/

Run a client (e.g. mir_demo_client_egltriangle) against a demo server built with the branch above.

Kevin DuBois (kdub) wrote :

Interestingly, this happens with --disable-overlays=true, which takes some of the moving parts out of the system.

Changed in mir:
assignee: Alexandros Frantzis (afrantzis) → Kevin DuBois (kdub)
Kevin DuBois (kdub) wrote :

At about the same time that the re-allocation sequence happens on the server, the client starts to produce black/blank frames.

Kevin DuBois (kdub) wrote :

Setting NATIVE_WINDOW_MIN_UNDEQUEUED_BUFFERS to 0 in egl_native_surface_interpreter.cpp does seem to avert the problem; however, this doesn't seem quite right to do in general.

What is happening is that the first time the client sees the newly-allocated frame, it registerBuffers() the new buffer without error, and then the driver dequeue()s the buffer and queues it again, but doesn't write to it.

Kevin DuBois (kdub) wrote :

Best guess at this point is that the driver has a pipeline of buffers it's working on, and the introduction of a new, same-sized buffer mid-flight causes the driver not to write to the buffer.

tags: added: android krillin
Kevin DuBois (kdub) wrote :

Seems the root cause is that the driver is requesting and remembering that a specific number of buffers are available (in this case, 5 buffers). If one of the buffers the driver has remembered gets unregistered in gralloc, then the incoming buffer that takes the slot of the remembered-but-unregistered buffer causes the driver to pop the new buffer out without filling it. This sometimes gives us a flickering scenario (where we're toggling between affected and unaffected buffers), and sometimes can cause the client to not appear on the screen at all (when enough framedrop-allocations have happened to affect all the buffers the driver is using). Fix in progress...

I don't think at first glance that this is related to any lingering out-of-order bugs.
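
The mechanism described in this comment can be illustrated with a toy model. This is only a sketch and is not Mir or driver code: the class ToyDriver, the functions simulate() and classify(), the handle names, and the round-robin buffer order are all assumptions made for illustration. The only details taken from the report are that the driver remembers 5 buffers and that a replacement buffer landing in a remembered slot gets queued back without being filled.

```python
class ToyDriver:
    """Hypothetical driver model: remembers the first handle seen in each slot."""

    def __init__(self):
        self.remembered = {}  # slot index -> handle the driver first saw there

    def fill(self, slot, handle):
        """Return True if the buffer is rendered into, False if it is queued
        back untouched because the slot now holds an unfamiliar handle."""
        known = self.remembered.setdefault(slot, handle)
        return known == handle


def simulate(frames, reallocated_slots, slots=5):
    """Render `frames` frames round-robin (an assumption) over `slots` buffers
    after the server has replaced the buffers in `reallocated_slots`."""
    driver = ToyDriver()
    handles = {s: f"old-{s}" for s in range(slots)}
    for s in range(slots):            # warm-up: the driver learns every handle
        driver.fill(s, handles[s])
    for s in reallocated_slots:       # framedrop: the server swaps in new buffers
        handles[s] = f"new-{s}"
    return [driver.fill(i % slots, handles[i % slots]) for i in range(frames)]


def classify(results):
    """Summarise what the user would see for a list of per-frame fill results."""
    if all(results):
        return "ok"
    if not any(results):
        return "invisible"
    return "flicker"


print(classify(simulate(10, {2})))              # one slot affected
print(classify(simulate(10, {0, 1, 2, 3, 4})))  # every slot affected
```

With one replaced slot the model alternates between filled and unfilled frames, which corresponds to the flicker; with all five slots replaced no frame is ever filled, which corresponds to the client not appearing at all.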

Kevin DuBois (kdub) wrote :

"I don't think at first glance that this is related to any lingering out-of-order bugs."
eh, thinking a bit more, it seems something that could be related... will check if #1270245 is still happening with the patch

Daniel van Vugt (vanvugt) wrote :

Tested. Sadly bug 1270245 still happens with the patch.

PS Jenkins bot (ps-jenkins) wrote :

Fix committed into lp:mir at revision None, scheduled for release in mir, milestone 0.13.0

Changed in mir:
status: In Progress → Fix Committed
Changed in mir:
status: Fix Committed → Fix Released
Launchpad Janitor (janitor) wrote :

This bug was fixed in the package mir - 0.13.1+15.10.20150520-0ubuntu1

---------------
mir (0.13.1+15.10.20150520-0ubuntu1) wily; urgency=medium

  [ Cemil Azizoglu ]
  * New upstream release 0.13.1 (https://launchpad.net/mir/+milestone/0.13.1)
    - ABI summary: No ABI break. Servers and clients do not need rebuilding.
      . Mirclient ABI unchanged at 8
      . Mircommon ABI unchanged at 4
      . Mirplatform ABI unchanged at 7
      . Mirserver ABI unchanged at 31
    - Bug fixes:
      . Can't load app purchase UI without a U1 account (LP: #1450377)
      . Crash because uncaught exception in mir::events::add_touch (LP: #1437357)

 -- CI Train Bot <email address hidden> Wed, 20 May 2015 21:20:15 +0000

Changed in mir (Ubuntu):
status: New → Fix Released