selftest --parallel test case timings incorrect

Bug #625594 reported by Martin Packman
10
This bug affects 2 people
Affects Status Importance Assigned to Milestone
Bazaar
Fix Released
Medium
Martin Packman
testtools
Fix Released
Undecided
Martin Packman

Bug Description

Individual test timings appear to be completely off when selftest is run across multiple processes.

Running two reasonably slow tests:

    blackbox.test_log.TestLogTags.test_log_with_tags OK 483ms
    blackbox.test_log.TestLogTags.test_merged_log_with_tags OK 468ms
    ----------------------------------------------------------------------
    Ran 2 tests in 1.000s

But with --parallel=subprocess (and BZR_CONCURRENCY=2):

    blackbox.test_log.TestLogTags.test_log_with_tags OK 0ms
    blackbox.test_log.TestLogTags.test_merged_log_with_tags OK 0ms
    ----------------------------------------------------------------------
    Ran 2 tests in 3.531s

The timer is doing *something* as occasionally it thinks a test takes 15ms rather than 0ms, but clearly this is not actually timing how long the test is taking.

Does --parallel=fork have a similar issue? It appears to be using the same subunit.test_results.AutoTimingTestResultDecorator which I presume is responsible.

Tags: selftest

Related branches

Revision history for this message
Vincent Ladeuil (vila) wrote :

This is what I get:

./bzr selftest -s blackbox.test_log.TestLogTags.test_log_with_tags -s blackbox.test_log.TestLogTags.test_merged_log_with_tags -v
bzr: ERROR: blackbox is not a known test prefix alias
vila:~/src/bzr/trunk :( $ ./bzr selftest -s bb.test_log.TestLogTags.test_log_with_tags -s bb.test_log.TestLogTags.test_merged_log_with_tags -v
running 0 tests...
bzr selftest: /home/vila/src/bzr/trunk/bzr
   /home/vila/src/bzr/trunk/bzrlib
   bzr-2.3.0dev1 python-2.6.5 Linux-2.6.32-24-generic-x86_64-with-Ubuntu-10.04-lucid

blackbox.test_log.TestLogTags.test_log_with_tags OK 103ms
blackbox.test_log.TestLogTags.test_merged_log_with_tags OK 113ms
----------------------------------------------------------------------
Ran 2 tests in 0.221s

OK
vila:~/src/bzr/trunk :) $ ./bzr selftest -s bb.test_log.TestLogTags.test_log_with_tags -s bb.test_log.TestLogTags.test_merged_log_with_tags -v --parallel=fork
running 0 tests...
bzr selftest: /home/vila/src/bzr/trunk/bzr
   /home/vila/src/bzr/trunk/bzrlib
   bzr-2.3.0dev1 python-2.6.5 Linux-2.6.32-24-generic-x86_64-with-Ubuntu-10.04-lucid

blackbox.test_log.TestLogTags.test_log_with_tags OK 0ms
blackbox.test_log.TestLogTags.test_merged_log_with_tags OK 0ms
----------------------------------------------------------------------
Ran 2 tests in 0.168s

OK

Changed in bzr:
status: New → Confirmed
Revision history for this message
John A Meinel (jameinel) wrote : Re: [Bug 625594] [NEW] selftest --parallel test case timings incorrect

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

On 8/27/2010 6:40 PM, Martin [gz] wrote:
> Public bug reported:
>
> Individual test timings appear to be completely off when selftest is run
> across multiple processes.
>
> Running two reasonably slow tests:
>
> blackbox.test_log.TestLogTags.test_log_with_tags OK 483ms
> blackbox.test_log.TestLogTags.test_merged_log_with_tags OK 468ms
> ----------------------------------------------------------------------
> Ran 2 tests in 1.000s
>
> But with --parallel=subprocess (and BZR_CONCURRENCY=2):
>
> blackbox.test_log.TestLogTags.test_log_with_tags OK 0ms
> blackbox.test_log.TestLogTags.test_merged_log_with_tags OK 0ms
> ----------------------------------------------------------------------
> Ran 2 tests in 3.531s
>
> The timer is doing *something* as occasionally it thinks a test takes
> 15ms rather than 0ms, but clearly this is not actually timing how long
> the test is taking.

On many Windows platforms the 'time.time()' resolution is 15ms. Which is
why you could see it fluctuate from 0 to 15ms, but not something small
like 1-2ms.

That said, there certainly seems to be something odd going on for it to
be succeeding that quickly.

Offhand... subunit seems to take the last 'time' statement as the start
of this test. Maybe when it is joining up the child streams, it
intermingles the 'time' commands?

For example, imagine in a regular run:

time 1
test a
time 2

time 3
test b
time 4

however, when running with forking or parallel you end up with:
time 1a
time 1b
test a
time 2a
test b
time 2b

And 2b - 2a is very fast.

Note also that --subunit only streams clock times. It doesn't send out
delta times, etc.
John
=:->
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.9 (Cygwin)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org/

iEYEARECAAYFAkx7298ACgkQJdeBCYSNAANNLwCgxKoggqz6sjIsqtMKC+wjQnSY
2NYAnRZXYAumzbIUIoHsGjUGoX4QRpAx
=DnRI
-----END PGP SIGNATURE-----

Revision history for this message
Robert Collins (lifeless) wrote :

I think its the testtools interleaving code at fault - mea culpa.

Revision history for this message
Martin Packman (gz) wrote :

The most immediate cause of this issue is code in testtools.testresult.real.ThreadsafeForwardingResult like:

    def addError(self, test, err=None, details=None):
        self.semaphore.acquire()
        try:
            self.result.startTest(test)
            self.result.addError(test, err, details=details)
            self.result.stopTest(test)
        finally:
            self.semaphore.release()

It's written like that to prevent different test results being interleaved in the output, but as a side effect means start/add/stop happen all at the same time from the perspective of the test result that's actually doing the timing.

Changed in testtools:
assignee: nobody → Martin [gz] (gz)
status: New → Confirmed
Revision history for this message
Martin Packman (gz) wrote :

However bzr also needs to start deferring to testtools for the timings in order to pick up the funky way subunit timestamps are handled.

Changed in bzr:
assignee: nobody → Martin [gz] (gz)
importance: Undecided → Medium
Vincent Ladeuil (vila)
Changed in bzr:
status: Confirmed → In Progress
Changed in testtools:
status: Confirmed → In Progress
Revision history for this message
Martin Packman (gz) wrote :

Change needed for bzr has landed, and babune is using the testtools branch with the fix for now so it showing timings:
<http://babune.ladeuil.net:24842/job/selftest-lucid/230/testReport/>

Changed in bzr:
milestone: none → 2.3b2
status: In Progress → Fix Released
Martin Packman (gz)
Changed in testtools:
status: In Progress → Fix Committed
Jonathan Lange (jml)
Changed in testtools:
milestone: none → 0.9.7
Jonathan Lange (jml)
Changed in testtools:
status: Fix Committed → Fix Released
To post a comment you must log in.
This report contains Public information  
Everyone can see this information.

Other bug subscribers

Remote bug watches

Bug watches keep track of this bug in other bug trackers.