[regression] [testsfail] failure in CI on ThreadedDispatcherSignalTest.keeps_dispatching_after_signal_interruption under Valgrind

Bug #1499229 reported by Alan Griffiths
This bug affects 2 people
Affects       Status        Importance  Assigned to      Milestone
Mir           Fix Released  High        Alberto Aguirre
mir (Ubuntu)  Fix Released  Undecided   Unassigned

Bug Description

Opening a new bug, rather than recycling lp:1441620

9: [ RUN ] ThreadedDispatcherSignalTest.keeps_dispatching_after_signal_interruption
9: ==26614==
9: ==26614== FILE DESCRIPTORS: 2 open at exit.
9: ==26614== Open file descriptor 2:
9: ==26614== <inherited from parent>
9: ==26614==
9: ==26614== Open file descriptor 1:
9: ==26614== <inherited from parent>
9: ==26614==
9: ==26614==
9: ==26614== HEAP SUMMARY:
9: ==26614== in use at exit: 95,523 bytes in 757 blocks
9: ==26614== total heap usage: 843,952 allocs, 843,195 frees, 135,376,735 bytes allocated
9: ==26614==
9: ==26614== LEAK SUMMARY:
9: ==26614== definitely lost: 0 bytes in 0 blocks
9: ==26614== indirectly lost: 0 bytes in 0 blocks
9: ==26614== possibly lost: 852 bytes in 22 blocks
9: ==26614== still reachable: 94,179 bytes in 729 blocks
9: ==26614== of which reachable via heuristic:
9: ==26614== newarray : 1,244 bytes in 39 blocks
9: ==26614== suppressed: 0 bytes in 0 blocks
9: ==26614== Reachable blocks (those to which a pointer was found) are not shown.
9: ==26614== To see them, rerun with: --leak-check=full --show-leak-kinds=all
9: ==26614==
9: ==26614== For counts of detected and suppressed errors, rerun with: -v
9: ==26614== Use --track-origins=yes to see where uninitialised values come from
9: ==26614== ERROR SUMMARY: 2443 errors from 8 contexts (suppressed: 129883 from 41)
9: /tmp/buildd/mir-0.18.0bzr3187pkg0xenial129+autopilot0/tests/unit-tests/dispatch/test_threaded_dispatcher.cpp:382: Failure
9: Value of: result.succeeded()
9: Actual: false
9: Expected: true
9: [ FAILED ] ThreadedDispatcherSignalTest.keeps_dispatching_after_signal_interruption (2710 ms)

Tags: testsfail

Alan Griffiths (alan-griffiths) wrote :

Once again the problem appears to lie in valgrind errors from an earlier test: BasicThreadPool.*

Changed in mir:
status: New → In Progress
importance: Undecided → Critical
assignee: nobody → Alan Griffiths (alan-griffiths)
milestone: none → 0.17.0
Alan Griffiths (alan-griffiths) wrote :

Also worth noting: the mir-wily-i386-ci and mir-mediumtests-builder-wily-armhf jobs are new and have *never* passed.

Alan Griffiths (alan-griffiths) wrote :

OK, these builds are flagged "non-fatal" in CI, so they're non-blocking.

Changed in mir:
assignee: Alan Griffiths (alan-griffiths) → nobody
importance: Critical → Medium
status: In Progress → Confirmed
status: Confirmed → Triaged
milestone: 0.17.0 → none
summary: - [regression] [testsfail] failure in CI on
- SimpleDispatchThreadTest.keeps_dispatching_after_signal_interruption
+ [regression] [testsfail] failure in CI on
+ ThreadedDispatcherSignalTest.keeps_dispatching_after_signal_interruption
Cemil Azizoglu (cemil-azizoglu) wrote :

Started seeing this fail consistently in 'mandatory' build configurations (e.g. https://jenkins.qa.ubuntu.com/job/mir-wily-amd64-ci/1455/console). Marked the test as DISABLED for the moment.

Changed in mir:
importance: Medium → High
Cemil Azizoglu (cemil-azizoglu) wrote :

I cannot repro the failure on my laptop - even under heavy load.

tags: added: testsfail
Daniel van Vugt (vanvugt) wrote :
Changed in mir:
milestone: none → 0.19.0
Daniel van Vugt (vanvugt) wrote :
Changed in mir:
assignee: nobody → Chris Halse Rogers (raof)
Changed in mir:
assignee: Chris Halse Rogers (raof) → nobody
status: Triaged → Confirmed
Daniel van Vugt (vanvugt) wrote :

I've been testing empty merge proposals against Jenkins these past couple of days and this bug is the only CI failure that seems to happen fairly consistently.

Although only on xenial touch (armhf) with valgrind (???)

Changed in mir:
status: Confirmed → Triaged
Changed in mir:
assignee: nobody → Daniel van Vugt (vanvugt)
status: Triaged → In Progress
summary: [regression] [testsfail] failure in CI on
ThreadedDispatcherSignalTest.keeps_dispatching_after_signal_interruption
+ under Valgrind
description: updated
Daniel van Vugt (vanvugt) wrote :

It seems the test in question may not be the problem. Disabling that test with a memcheck filter results in the same number of valgrind failures at the end of mir_unit_tests, so the failures are happening earlier in the run, before that test.

Now that I look back through them, it seems most of the issues were what Cemil just fixed and landed. So wait and see how that fix helps...

The confusion about this bug seems to stem from the errors occurring before ThreadedDispatcherSignalTest.keeps_dispatching_after_signal_interruption starts, forks and returns. The child process is therefore always seen as failing, even though the errors occurred before the test started rather than in it. So I'm assuming forked processes inherit the valgrind error list from their parents.
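
To make the suspected mechanism concrete, here is a minimal sketch of a fork-based check of the kind described above. It is not Mir's actual test code: `child_succeeded` is a hypothetical helper, and it assumes the test judges its forked child purely by exit status. Under Valgrind's `--error-exitcode=N` option (whether the CI job used it is an assumption), a process that has errors on record exits with N. If the error list really is copied across `fork()`, as assumed above, a child that did nothing wrong would still report failure.

```cpp
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

#include <cassert>
#include <functional>

// Hypothetical helper, not taken from the Mir tree: run `body` in a forked
// child and report whether the child exited normally with status 0.
bool child_succeeded(std::function<int()> const& body)
{
    pid_t const pid = fork();
    if (pid < 0)
        return false;   // fork itself failed

    if (pid == 0)
    {
        // Child. When run under Valgrind with --error-exitcode=N, the exit
        // status seen by the parent becomes N if the tool has errors on
        // record for this process, regardless of what body() returns.
        _exit(body());
    }

    // Parent: the only information it gets back is the exit status.
    int status = 0;
    if (waitpid(pid, &status, 0) != pid)
        return false;

    return WIFEXITED(status) && WEXITSTATUS(status) == 0;
}
```

Run natively, `child_succeeded([] { return 0; })` is true. The point of the sketch is that the parent cannot distinguish "the child's own work failed" from "the child exited non-zero for some other reason", which is consistent with errors from an earlier test being blamed on this one.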

Changed in mir:
status: In Progress → Incomplete
Alan Griffiths (alan-griffiths) wrote :

Daniel, see comment #1

Daniel van Vugt (vanvugt) wrote :

Seems to be fixed. Let it expire just to be sure...

Changed in mir:
milestone: 0.19.0 → none
assignee: Daniel van Vugt (vanvugt) → nobody
assignee: nobody → Daniel van Vugt (vanvugt)
assignee: Daniel van Vugt (vanvugt) → nobody
Kevin DuBois (kdub) wrote :
Changed in mir:
status: Incomplete → Triaged
Changed in mir:
assignee: nobody → Alberto Aguirre (albaguirre)
milestone: none → 0.22.0
status: Triaged → In Progress
Mir CI Bot (mir-ci-bot) wrote :

Fix committed into lp:mir at revision None, scheduled for release in mir, milestone 0.22.0

Changed in mir:
status: In Progress → Fix Committed
Changed in mir:
status: Fix Committed → Fix Released
Launchpad Janitor (janitor) wrote :

This bug was fixed in the package mir - 0.22.1+16.04.20160516.2-0ubuntu2

---------------
mir (0.22.1+16.04.20160516.2-0ubuntu2) yakkety; urgency=medium

  [ Dimitri John Ledkov ]
  * Fix FTBFS error: call of overloaded ‘abs(float)’ is ambiguous, by
    including cmath c++ header.

 -- Łukasz 'sil2100' Zemczak <email address hidden> Thu, 19 May 2016 21:58:43 +0200

Changed in mir (Ubuntu):
status: New → Fix Released