Mir

Client buffers eventually display one frame behind client swap requests

Bug #1216337 reported by Sam Spilsbury
This bug affects 3 people

Affects  Status   Importance  Assigned to  Milestone
Mir      Expired  Medium      Unassigned

Bug Description

I haven't yet been able to figure out exactly what causes this problem, but there appears to be some condition that causes client buffers to be displayed one frame behind their swap count, e.g.:

swapN -> swapN + 1 -> swapN + 2
drawN - 1 -> drawN -> drawN + 1
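
To make the symptom concrete, here is a toy model of the sequence above. It is only an illustration: ToyServer, toy_server_init and toy_swap are hypothetical names, and nothing here reflects how Mir's server actually queues buffers. In the "lagging" mode each swap puts the previously submitted frame on screen and holds the new one back, which gives exactly the swapN / drawN - 1 pattern.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical toy server: NOT Mir's real buffer-swapping code. */
typedef struct {
    int displayed;   /* frame id currently on screen, -1 = nothing yet */
    int held_back;   /* frame id submitted but not yet shown, -1 = none */
    bool lagging;    /* true once the toy server is one frame behind */
} ToyServer;

static void toy_server_init(ToyServer *s, bool lagging)
{
    s->displayed = -1;
    s->held_back = -1;
    s->lagging = lagging;
}

/* The client submits frame `id`; returns the frame id now on screen. */
static int toy_swap(ToyServer *s, int id)
{
    if (s->lagging) {
        /* Show the previously submitted frame and hold the new one back. */
        s->displayed = s->held_back;
        s->held_back = id;
    } else {
        /* Healthy case: the frame just swapped is the frame displayed. */
        s->displayed = id;
    }
    return s->displayed;
}
```

In the healthy mode toy_swap(&s, n) returns n; in the lagging mode it returns the id passed to the previous call, so the newest frame only appears once something triggers another swap.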

I haven't been able to figure out a programmatic way to reproduce it, but gtk+ seems to be a good testcase so far:

http://github.com/smspillaz/gtk [wip/mir]

./autogen.sh --enable-mir-backend
gdb ./demos/widget-factory/.libs/gtk3-widget-factory

Click on the "Page 2" tab and click on various widgets until there is visible input lag. Once there is input lag, you can break in mir_surface_swap_buffers, switch to frame #1 (f 1) and set frame_debugnum to a nonzero value to save frame images to your home directory. Each dumped frame should have the right contents, but after the call to swap_buffers completes, that will not be the case on-screen.

Revision history for this message
Daniel van Vugt (vanvugt) wrote :

Sam, the existing manual test case for this sort of thing is to run demo_client_fingerpaint and:
  1. Drag/draw for a while.
  2. Click single clicks in various locations.
Expect: Each single click (which is a single swap) results in a new colour being painted in that spot. Hence, no lag.

How does that go for you?

Of course, I thought the automated tests covered this too...

Revision history for this message
Daniel van Vugt (vanvugt) wrote :

Oh, except you can't run fingerpaint since r990 due to bug 1215754. :(

Revision history for this message
Daniel van Vugt (vanvugt) wrote :

This sounds like bug 1216472 I noticed today.

Sam, are you using trunk lp:mir?

Revision history for this message
Sam Spilsbury (smspillaz) wrote :

Hi Daniel,

Just confirmed it on trunk lp:mir today. Sounds similar to 1216472.
Unfortunately I don't have the time to really dig into it any further.

If you're unable to reproduce the bug using fingerpaint then it might be
worth testing using gtk+ itself - I had to roll my own damage-and-swap
algorithm there using GdkFrameClock and it could well be a timing issue or
race condition that only gtk+ is able to trigger at the moment. Notably,
we're not using mir_surface_swap_buffers_sync - instead I'm using the
callback along with a GAsyncQueue to push each new MirGraphicsRegion to the
main thread.

As I mentioned, I'm pretty sure the issue is in the server itself -
verified by the fact that doing an image dump of the buffer contents before
calling mir_surface_swap_buffers shows that the contents are indeed what is
expected before the swap.

The only thing I could think of which might be broken on the client side
would be if we had called mir_surface_swap_buffers twice and operated on an
older MirGraphicsRegion. However, I believe that the graphics region gets
unmapped from the process as soon as you call mir_surface_swap_buffers.
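
A minimal sketch of how a client could rule out the "swapped twice and drew into an older region" case mentioned above. This is an assumption-laden illustration, not code from gtk+ or Mir: BackbufferGuard and the guard_* functions are hypothetical, and the only thing taken from the comment is the flow (a swap is requested, and a new region later arrives on the main thread via the callback and queue).

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical client-side guard; none of these names come from Mir or gtk+. */
typedef enum {
    REGION_NONE,         /* no back buffer delivered yet */
    REGION_READY,        /* region delivered by the swap callback, safe to draw */
    REGION_SWAP_PENDING  /* swap requested, the old region must not be touched */
} RegionState;

typedef struct {
    RegionState state;
    unsigned generation; /* increments each time a new region arrives */
} BackbufferGuard;

static void guard_init(BackbufferGuard *g)
{
    g->state = REGION_NONE;
    g->generation = 0;
}

/* Call on the main thread when a new region is popped from the queue. */
static void guard_region_arrived(BackbufferGuard *g)
{
    g->state = REGION_READY;
    g->generation++;
}

/* True only while the most recently delivered region is still current. */
static bool guard_can_draw(const BackbufferGuard *g)
{
    return g->state == REGION_READY;
}

/* Call immediately before requesting a swap; false means "do not swap". */
static bool guard_begin_swap(BackbufferGuard *g)
{
    if (g->state != REGION_READY)
        return false;    /* a swap is already in flight, or nothing to swap */
    g->state = REGION_SWAP_PENDING;
    return true;
}
```

If guard_begin_swap or guard_can_draw ever returns false in the gtk+ backend's draw path, the client has tried to reuse a stale region, which would point at the client rather than the server.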

On Sun, Aug 25, 2013 at 12:05 PM, Daniel van Vugt <
<email address hidden>> wrote:

> This sounds like bug 1216472 I noticed today.
>
> Sam, are you using trunk lp:mir ?

--
Sam Spilsbury

Revision history for this message
Daniel van Vugt (vanvugt) wrote :

Sam,

Can you double check that your getting of the backbuffer and swapbuffers are properly synchronized? If not as simplistically as fingerpaint does, then by other means?...

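    /* Map the current back buffer for software rendering. */
    mir_surface_get_graphics_region(surface, &backbuffer);
    /* Copy the canvas into it, then block until the swap completes. */
    copy_region(&backbuffer, canvas);
    mir_surface_swap_buffers_sync(surface);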

Revision history for this message
Sam Spilsbury (smspillaz) wrote :

Hi Daniel. I've already verified this is the case - see the description on
how to reproduce it and verify it.

I don't think I can reproduce it with that kind of example - it appears to
be a race condition.

On 25/08/2013 2:25 PM, "Daniel van Vugt" <email address hidden>
wrote:

> Sam,
>
> Can you double check that your getting of the backbuffer and swapbuffers
> are properly synchronized? If not as simplistically as fingerpaint does,
> then by other means?...
>
> mir_surface_get_graphics_region(surface, &backbuffer);
> copy_region(&backbuffer, canvas);
> mir_surface_swap_buffers_sync(surface);

Revision history for this message
Daniel van Vugt (vanvugt) wrote :

I'm not yet able to find a solid theoretical reason as to how this could happen, though I think SwitchingBundle::force_requests_to_complete() could be a candidate for breaking the sync...

1. Does VT switching trigger it or make it worse?
2. Using multiple monitors?
3. Do you find that like bug 1216472, the frames are disordered, or just lag?

Revision history for this message
Daniel van Vugt (vanvugt) wrote :

Also,

4. Are you framedropping? I mean, have you changed the surface swapinterval?

Revision history for this message
Sam Spilsbury (smspillaz) wrote :

Hi Daniel.

Yes, I'm using multiple monitors, but I'm not frame dropping. It's just a
one-frame lag.

If you know where to start debugging, maybe I can have a look myself?

Were you able to reproduce the problem with gtk+?

On 25/08/2013 4:35 PM, "Daniel van Vugt" <email address hidden>
wrote:

> Also,
>
> 4. Are you framedropping? I mean, have you changed the surface
> swapinterval?

Revision history for this message
Daniel van Vugt (vanvugt) wrote :

Alright, can you please try without extra monitors? Either unplug it or:
    mir_demo_server_shell --display-config single

Sorry, I can't think of suggestions for what/how to debug right now.

Revision history for this message
Daniel van Vugt (vanvugt) wrote :

Sorry, I have not got to playing with GTK yet.

Revision history for this message
Daniel van Vugt (vanvugt) wrote :

I suspect we might be reporting incorrect buffer age in some cases, which is a feature only used in XMir and your GTK. Maybe try ignoring buffer age?
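
For context on why a wrong buffer age would matter, here is a sketch of how a client typically uses it. It follows the convention of the EGL_EXT_buffer_age extension (age 0 means the contents are undefined, age n means the buffer holds the frame from n swaps ago); whether Mir's reported age matches that convention in every case is exactly what is in question here. Rect, DamageHistory and the functions below are hypothetical names, not Mir or gtk+ API.

```c
#include <assert.h>
#include <stdbool.h>

/* Half-open rectangle; empty when x2 <= x1 or y2 <= y1. */
typedef struct { int x1, y1, x2, y2; } Rect;

#define HISTORY 4

/* Damage of previous frames, newest first (damage[0] = last frame). */
typedef struct { Rect damage[HISTORY]; int count; } DamageHistory;

static bool rect_empty(Rect r) { return r.x2 <= r.x1 || r.y2 <= r.y1; }

/* Bounding box of two rectangles. */
static Rect rect_union(Rect a, Rect b)
{
    if (rect_empty(a)) return b;
    if (rect_empty(b)) return a;
    Rect r = { a.x1 < b.x1 ? a.x1 : b.x1, a.y1 < b.y1 ? a.y1 : b.y1,
               a.x2 > b.x2 ? a.x2 : b.x2, a.y2 > b.y2 ? a.y2 : b.y2 };
    return r;
}

/* Record the damage of the frame that was just swapped. */
static void history_push(DamageHistory *h, Rect d)
{
    int n = h->count < HISTORY ? h->count : HISTORY - 1;
    for (int i = n; i > 0; --i)
        h->damage[i] = h->damage[i - 1];
    h->damage[0] = d;
    if (h->count < HISTORY)
        h->count++;
}

/* Region that must be repainted into a back buffer of the given age. */
static Rect repaint_region(const DamageHistory *h, Rect current, int age, Rect full)
{
    /* Unknown contents, or older than our history: redraw everything. */
    if (age <= 0 || age - 1 > h->count)
        return full;
    /* The buffer is missing the last (age - 1) frames of damage. */
    Rect r = current;
    for (int i = 0; i < age - 1; ++i)
        r = rect_union(r, h->damage[i]);
    return r;
}
```

If the server reported an age smaller than the true one, repaint_region would return too small an area and parts of the buffer would keep stale content. Passing age 0 unconditionally forces a full redraw, which is the "ignore buffer age" experiment suggested here.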

Revision history for this message
Sam Spilsbury (smspillaz) wrote :

Yeah, if I always redraw the whole frame it does seem to work.

On 25/08/2013 5:45 PM, "Daniel van Vugt" <email address hidden>
wrote:

> I suspect we might be reporting incorrect buffer age in some cases,
> which is a feature only used in XMir and your GTK. Maybe try ignoring
> buffer age?

Revision history for this message
Daniel van Vugt (vanvugt) wrote :

Yeah it's starting to look like buffer age + multi-monitor = bugs.

Revision history for this message
Sam Spilsbury (smspillaz) wrote :

Hmm. I did just try it now with a single monitor and I'm still seeing the
lag. Also I'm not so sure about buffer age being the problem - if I dump
the frame just before calling mir_surface_swap_buffers the contents of the
frame are entirely correct. There'd be parts of old frames if it were not.

On Sun, Aug 25, 2013 at 6:14 PM, Daniel van Vugt <
<email address hidden>> wrote:

> Yeah it's starting to look like buffer age + multi-monitor = bugs.

--
Sam Spilsbury

Revision history for this message
Chris Halse Rogers (raof) wrote :

Yeah, I don't think it's the buffer_age that's the problem; we've done some experimental redraw-every-framing in XMir and it doesn't help.

Revision history for this message
Daniel van Vugt (vanvugt) wrote :

The above comment was intended for bug 1216472. :)

Changed in mir:
importance: Undecided → High
Revision history for this message
kevin gunn (kgunn72) wrote :

so is this xmir only ?
seems from the comments it is. marking as such, please correct if wrong.

summary: - Client buffers eventually display one frame behind client swap requests
+ [xmir] Client buffers eventually display one frame behind client swap
+ requests
tags: added: xmir
Changed in mir:
importance: High → Medium
Revision history for this message
Daniel van Vugt (vanvugt) wrote :

Unrelated to xmir. This bug is about the native GTK port to Mir.

summary: - [xmir] Client buffers eventually display one frame behind client swap
- requests
+ Client buffers eventually display one frame behind client swap requests
tags: removed: xmir
Revision history for this message
Daniel van Vugt (vanvugt) wrote :

I think this would have been resolved by the fix for bug 1199450. Even if the root cause was different, that fix should make the problem described here impossible. Marking as Incomplete, though, since I can't prove it.

Changed in mir:
status: New → Incomplete
Revision history for this message
Launchpad Janitor (janitor) wrote :

[Expired for Mir because there has been no activity for 60 days.]

Changed in mir:
status: Incomplete → Expired