Mir

[regression] Test failure holding up all merge proposals: "File already exists in database: mir_protobuf_wire.proto"

Bug #1358698 reported by Alan Griffiths
This bug affects 1 person
Affects        Status         Importance   Assigned to      Milestone
Mir            Fix Released   Critical     Alan Griffiths
  0.6          Invalid        Undecided    Unassigned
  0.7          Fix Released   Critical     Alan Griffiths
mir (Ubuntu)   Fix Released   Critical     Unassigned

Bug Description

We've seen a failure on two CI branches and briefly locally (by Kevin DuBois).

> [libprotobuf ERROR google/protobuf/descriptor_database.cc:57] File already exists in database: mir_protobuf_wire.proto
> [libprotobuf FATAL google/protobuf/descriptor.cc:954] CHECK failed: generated_database_->Add(encoded_file_descriptor, size):
> terminate called after throwing an instance of 'google::protobuf::FatalException'
> what(): CHECK failed: generated_database_->Add(encoded_file_descriptor, size):
> Aborted (core dumped)

https://code.launchpad.net/~raof/mir/more-tiny-improvements/+merge/231304/comments/562598
https://code.launchpad.net/~alan-griffiths/mir/integration-and-unit-tests-link-against-internals/+merge/230810/comments/562173

Revision history for this message
Alan Griffiths (alan-griffiths) wrote :

This seems more frequent on mako, but isn't exclusive to it. E.g.

https://jenkins.qa.ubuntu.com/job/mir-team-mir-development-branch-utopic-amd64-ci/949/console

[ RUN ] SharedLibrary.load_valid_library_works
[libprotobuf ERROR google/protobuf/descriptor_database.cc:57] File already exists in database: mir_protobuf_wire.proto
[libprotobuf FATAL google/protobuf/descriptor.cc:954] CHECK failed: generated_database_->Add(encoded_file_descriptor, size):
terminate called after throwing an instance of 'google::protobuf::FatalException'
  what(): CHECK failed: generated_database_->Add(encoded_file_descriptor, size):

Changed in mir:
status: New → In Progress
Daniel van Vugt (vanvugt) wrote :

Predictably, the error happens if you have multiple copies of your protobuf-compiled objects linked into the same process. So somewhere either by static linkage or an OBJECT library we have introduced a redundant copy of the Mir protobuf objects into our binaries...

[https://code.google.com/p/protobuf/issues/detail?id=370]
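
As a rough illustration of the mechanism described above, here is a toy model of a process-wide registry that refuses a second registration of the same .proto file name. It is not libprotobuf's API: `ToyDescriptorDatabase`, `DuplicateFileError` and `load_libraries` are hypothetical names, and the real library aborts the process rather than raising a catchable Python exception.

```python
class DuplicateFileError(RuntimeError):
    """Raised when the same .proto file name is registered twice."""


class ToyDescriptorDatabase:
    """Toy stand-in for a process-wide registry keyed by .proto file name."""

    def __init__(self):
        self._files = {}

    def add(self, proto_name, owner):
        # A second registration of the same file name is treated as fatal,
        # mirroring the "File already exists in database" symptom.
        if proto_name in self._files:
            raise DuplicateFileError(
                f"File already exists in database: {proto_name} "
                f"(first registered by {self._files[proto_name]}, "
                f"again by {owner})"
            )
        self._files[proto_name] = owner


def load_libraries(database, libraries):
    """Simulate static initialisers running as each library is loaded."""
    for library, proto_files in libraries:
        for proto_name in proto_files:
            database.add(proto_name, library)
```

Loading two libraries that both carry `mir_protobuf_wire.proto` into the same `ToyDescriptorDatabase` raises on the second one, regardless of whether the second copy arrived through static linkage or a later load.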

Daniel van Vugt (vanvugt) wrote :

Or by dlopen'ing...

summary: - Test failure: File already exists in database: mir_protobuf_wire.proto
+ Test failure holding up all merge proposals: File already exists in
+ database: mir_protobuf_wire.proto
Changed in mir:
importance: High → Critical
Daniel van Vugt (vanvugt) wrote : Re: Test failure holding up all merge proposals: File already exists in database: mir_protobuf_wire.proto

Judging by the fact CI still works with the 0.6 branch, it appears this bug was only introduced recently in development-branch.

summary: - Test failure holding up all merge proposals: File already exists in
- database: mir_protobuf_wire.proto
+ [regression] Test failure holding up all merge proposals: File already
+ exists in database: mir_protobuf_wire.proto
tags: added: regression
Alan Griffiths (alan-griffiths) wrote : Re: [regression] Test failure holding up all merge proposals: File already exists in database: mir_protobuf_wire.proto

The bug appears to have been triggered by the reorganisation of binaries and symbol hiding in 0.7.

Possibly rolling libmirprotobuf into libmircommon, possibly forcing binaries into libmircommon & libmirplatform, possibly just the hiding of some symbol. But I've not been able to identify what causes it to manifest nor the mechanism by which it happens.

It is really frustrating that I can't reproduce locally as that makes for a very slow test cycle through submitting trial MPs.

In the hope of unblocking other work I'll put together an MP that reverts all this reorganisation and continue investigations into the bug separately.

Alan Griffiths (alan-griffiths) wrote :

> In the hope of unblocking other work I'll put together an MP that reverts all this reorganisation and continue investigations into the bug separately.

Actually, https://code.launchpad.net/~vanvugt/mir/revert-1848/+merge/231520 seems to cover everything recent enough to be implicated. Awaiting results...

Daniel van Vugt (vanvugt) wrote :

Verified again the 0.6 branch is unaffected by this bug (https://code.launchpad.net/~vanvugt/mir/dummy-0.6/+merge/231505)

So we possibly introduced it after r1831 (the final merge of devel into 0.6). Although if it was a regression in the development-branch, it's unclear how it passed CI then but does not now.

Alan Griffiths (alan-griffiths) wrote :

I've worked out a way to replicate the symptoms. Just not sure yet how it can be affecting CI:

I start by removing the libmirplatformgraphics.so in lib
Then when running e.g. mir_integration_tests the installed version of libmirplatformgraphics.so is picked up.
mir_integration_tests links to a local libmircommon.so.2 (as that's what is being built since -r1846)
But the installed version of libmirplatformgraphics.so links to the installed libmircommon.so.1
And the two different libmircommon.so.* both try to register with libprotobuf
Which gives us "libprotobuf ERROR..."
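
The replication steps above come down to one process holding two versions of the same library. A minimal sketch of a check for that condition, given a list of loaded library paths; the function name and the example paths are hypothetical, and the file-name pattern is a simplifying assumption about how versioned shared objects are named:

```python
import re
from collections import defaultdict

# Matches names such as "libmircommon.so.2" and captures the unversioned stem.
_SONAME = re.compile(r"^(?P<stem>lib[^/]+?\.so)(?:\.(?P<version>[\d.]+))?$")


def conflicting_libraries(paths):
    """Return {stem: sorted file names} for stems present in more than one version."""
    seen = defaultdict(set)
    for path in paths:
        name = path.rsplit("/", 1)[-1]
        match = _SONAME.match(name)
        if match:
            seen[match.group("stem")].add(name)
    return {stem: sorted(names) for stem, names in seen.items() if len(names) > 1}


# Hypothetical paths, chosen only to mirror the scenario in the steps above.
EXAMPLE_PATHS = [
    "/build/lib/libmircommon.so.2",
    "/usr/lib/libmircommon.so.1",
    "/usr/lib/libmirplatformgraphics.so",
]
```

With `EXAMPLE_PATHS`, `conflicting_libraries` reports `libmircommon.so` with both `.1` and `.2`. On Linux the input list could be taken from the path column of `/proc/<pid>/maps` for the running test binary.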

The question therefore is: could we pick up the wrong libmirplatformgraphics.so in CI?

Bugs in CI are not unknown.

Alan Griffiths (alan-griffiths) wrote :

OK. There are two failure scenarios to this bug.

1. A failure during test discovery - this is a race condition where discovery happens before libmirplatformgraphics.so is built. (This is comment #1)

2. The tests on mako dlopen the libmirplatformgraphics.so installed in the device image, not the built one (bug 1359760)

In both cases the libmirplatformgraphics.so installed on the system is loaded instead of the one being built. This has only caused a noticeable problem because libmircommon.so is now versioned and the version in the archive was different to that in the build.
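
Both scenarios can be pictured as an ordered directory lookup in which the build directory does not (yet) contain the library, so the lookup falls through to the system copy. The sketch below is a simplified model under that assumption, not the dynamic loader's actual algorithm; `resolve_library`, `demo` and the directory layout are hypothetical.

```python
import os
import tempfile


def resolve_library(name, search_dirs):
    """Return the first existing path for `name` in `search_dirs`, or None."""
    for directory in search_dirs:
        candidate = os.path.join(directory, name)
        if os.path.exists(candidate):
            return candidate
    return None


def demo():
    """Show the fallback using throwaway directories for 'build' and 'system'."""
    with tempfile.TemporaryDirectory() as root:
        build_lib = os.path.join(root, "build", "lib")
        system_lib = os.path.join(root, "system", "lib")
        os.makedirs(build_lib)
        os.makedirs(system_lib)
        name = "libmirplatformgraphics.so"
        open(os.path.join(system_lib, name), "w").close()

        # Before the build has produced the library: the system copy wins.
        before = resolve_library(name, [build_lib, system_lib])

        # Once the built copy exists, it shadows the system one.
        open(os.path.join(build_lib, name), "w").close()
        after = resolve_library(name, [build_lib, system_lib])

        return os.path.relpath(before, root), os.path.relpath(after, root)
```

`demo()` returns the resolved path before and after the built library appears, which is the window in which test discovery could pick up the installed copy.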

lp:~alan-griffiths/mir/fix-1358698 will unblock CI and lp:1359760 should track the remaining issue.

summary: - [regression] Test failure holding up all merge proposals: File already
- exists in database: mir_protobuf_wire.proto
+ [regression] Test failure holding up all merge proposals: "File already
+ exists in database: mir_protobuf_wire.proto"
PS Jenkins bot (ps-jenkins) wrote :

Fix committed into lp:mir/devel at revision None, scheduled for release in mir, milestone Unknown

Changed in mir:
status: In Progress → Fix Committed
Daniel van Vugt (vanvugt) wrote :

Fix committed to lp:mir/0.7 at revision 1870, scheduled for release in Mir 0.7.0

Changed in mir:
milestone: 0.7.0 → 0.8.0
Changed in mir (Ubuntu):
importance: Undecided → Critical
status: New → Triaged
Changed in mir:
milestone: 0.8.0 → 0.7.0
Changed in mir:
milestone: 0.7.0 → 0.8.0
Launchpad Janitor (janitor) wrote :

This bug was fixed in the package mir - 0.7.0+14.10.20140829-0ubuntu1

---------------
mir (0.7.0+14.10.20140829-0ubuntu1) utopic; urgency=medium

  [ Daniel van Vugt ]
  * New upstream release 0.7.0 (https://launchpad.net/mir/+milestone/0.7.0)
    - Enhancements:
      . Test suite: Reworked mechanism to override Mir client functions
      . Demo shell: Detect custom rendering (decorations) to make it
        compatible with overlay optimizations
      . Make sure to preserve fd resources until the end of the sending
        of the message
      . Add test cases and script for tracking changes to the new ABIs:
        libmircommon, libmirplatform
      . Symbols file for libmirplatform
      . Symbols file for libmircommon
      . Symbols file for libmirserver
      . Various improvements to the SessionMediator test
      . Various build related improvements
      . Print testcase output during package build
      . Abort test when InProcessServer startup fails
      . Link the integration and unit tests against the server objects
      . Add a document detailing the useful tests to run and the useful
        logs to collect when troubleshooting a new android chipset
      . Enable motion event resampling and prediction for a more responsive
        touch experience.
    - ABI summary: Servers need rebuilding, but clients do not
      . Mirclient ABI unchanged at 8
      . Mircommon ABI bumped to 1
      . Mirplatform ABI bumped to 2
      . Mirserver ABI bumped to 25
    - API changes
      . Deleted function - frontend::Shell::create_surface_for(). If you have
        the std::shared_ptr<frontend::Session> session, you can just do
        session->create_surface(params) instead to get a SurfaceId
    - Bug fixes:
      . Ensure we process lifecycle events before the nested server is torn
        down (LP: #1353465)
      . Fix race in InputTestingServerConfiguration (LP: #1354446)
      . Fix fd leaks in prompt session frontend code and tests (LP: #1353461)
      . Detect the additional things the demo shell draws on the renderable
        list and avoid calling the optimized post function if they are being
        drawn (LP: #1348330)
      . Client: Fix SIGTERM dispatch in our default lifecycle event handler
        (LP: #1353867)
      . DemoRenderer: Don't try to create a texture of width zero.
        (LP: #1358210)
      . Fix CI failures (LP: #1358698)
      . Fix build failure: "variable ‘rc’ set but not used" which happens in
        release mode when NDEBUG is set (LP: #1358625)
      . Only enumerate exposed input surfaces to avoid delivering events to
        occluded surfaces (LP: #1359264)
      . Android: do not post driver cancelled buffers (LP: #1359406)
      . Client: Ensure our platform library stays loaded for as long as it is
        needed by other objects (LP: #1358191)
      . Examples: Register the DemoCompositor with the Scene to properly
        process visibility events (LP: #1359487)
      . Mir_demo_client_basic: Don't assert on user errors like failing to
        connect to a Mir server (LP: #1331958)
      . Tests: Explicitly depend on GMock target to avoid build races
        (LP: #1362646)

  [ Ubuntu dai...

Changed in mir (Ubuntu):
status: Triaged → Fix Released
Changed in mir:
milestone: 0.8.0 → none
status: Fix Committed → Fix Released