Make daemons spawned by test suite exit more reliably

Bug #37837 reported by James Henstridge
Affects: Launchpad itself
Status: Won't Fix
Importance: Medium
Assigned to: Unassigned

Bug Description

A recurring problem with the test suite is that a Twisted daemon does not exit correctly at the end of the test suite run.

I have a branch almost ready that makes the daemons run in the same process group as the tests, which allows PQM to kill the daemons with the rest of the test suite if it detects a test suite hang.
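As a rough illustration of the process-group idea (not the actual branch), the sketch below starts a stand-in test runner as the leader of a new process group, lets it spawn a stand-in daemon, and then kills the whole group with one signal. The names `RUNNER_SOURCE` and `run_and_kill_group` are hypothetical, and the supervising function only plays the role that PQM would play.

```python
import os
import signal
import subprocess
import sys

# Stand-in for the test runner (hypothetical): it spawns a long-lived
# "daemon" child, reports the child's process group id, then hangs.
RUNNER_SOURCE = r"""
import os, subprocess, sys, time
child = subprocess.Popen([sys.executable, "-c", "import time; time.sleep(60)"])
print(os.getpgid(child.pid), flush=True)
time.sleep(60)
"""


def run_and_kill_group():
    """Start the runner in its own process group, then kill the group."""
    # start_new_session=True calls setsid() in the child, so the runner
    # becomes the leader of a new process group whose id equals its pid.
    runner = subprocess.Popen(
        [sys.executable, "-c", RUNNER_SOURCE],
        stdout=subprocess.PIPE,
        start_new_session=True,
    )
    # The daemon inherits the runner's process group by default.
    daemon_pgid = int(runner.stdout.readline())
    same_group = daemon_pgid == runner.pid
    # One signal reaches the runner and every daemon in its group.
    os.killpg(runner.pid, signal.SIGKILL)
    runner.wait()
    runner.stdout.close()
    return same_group
```

This only works as long as the daemons do not move themselves into a different process group or session, for example by daemonizing in the traditional double-fork way.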

It is still possible for bugs to cause the daemons to remain running past the end of the suite though. The following is an idea for how to make things a little more robust:

 1. Create a pipe whose read end is held by the daemon and whose write end is held by the test suite. It may be necessary to pass the file descriptor number via the environment rather than command line arguments. It is important that the daemon does not keep the write end open, otherwise it will never see the pipe close.

 2. The daemon listens on that file descriptor (Andrew suggested using twisted.internet.stdio). When it detects that the write end has been closed, the daemon should exit gracefully.

 3. TacTestSetup.tearDown() is modified to close the write end of the pipe and wait for the daemon to exit, only calling os.kill() if the daemon doesn't exit by itself.
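The three steps above could look roughly like the following sketch. It uses only the standard library and a blocking read in place of a Twisted reactor, so it shows the mechanism rather than the real daemon code. The environment variable name `TEST_SUITE_PIPE_FD` and the helpers `start_daemon` and `stop_daemon` are assumptions made for illustration; they are not part of TacTestSetup.

```python
import os
import subprocess
import sys
import time

# Hypothetical environment variable name; the report does not specify one.
FD_ENV_VAR = "TEST_SUITE_PIPE_FD"

# Stand-in for the daemon: block on the read end and exit at end-of-file.
DAEMON_SOURCE = r"""
import os, sys
fd = int(os.environ["TEST_SUITE_PIPE_FD"])
# os.read() returns b"" once every copy of the write end is closed.
while os.read(fd, 1024):
    pass
sys.exit(0)
"""


def start_daemon():
    """Step 1: create the pipe and hand only the read end to the daemon."""
    read_fd, write_fd = os.pipe()
    env = dict(os.environ)
    env[FD_ENV_VAR] = str(read_fd)
    # pass_fds keeps read_fd open in the child under the same number.
    # write_fd is non-inheritable by default, so the daemon never holds it.
    proc = subprocess.Popen(
        [sys.executable, "-c", DAEMON_SOURCE], env=env, pass_fds=(read_fd,)
    )
    os.close(read_fd)  # the test suite only needs the write end
    return proc, write_fd


def stop_daemon(proc, write_fd, timeout=5.0):
    """Step 3: close the write end, wait, and kill only as a fallback."""
    os.close(write_fd)  # the daemon sees end-of-file on its read end
    try:
        return proc.wait(timeout=timeout)
    except subprocess.TimeoutExpired:
        proc.kill()
        return proc.wait()
```

A tearDown() along these lines would call `stop_daemon(proc, write_fd)` and treat a non-zero return code as a sign that the daemon had to be killed rather than exiting on its own.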

In the case that the test suite is killed or exits, the OS will close all of its file descriptors automatically. This will cause the read ends of those pipes to hang up, telling those daemons to exit.

Changed in launchpad:
assignee: nobody → spiv
status: Unconfirmed → Confirmed
Changed in launchpad:
assignee: spiv → nobody
Jonathan Lange (jml) wrote :

Maybe fix this at the same time as bug 1307.

tags: added: build-infrastructure
removed: infrastructure test-system
Robert Collins (lifeless) wrote :

This might still be good to do, but I found that the main reason we leaked things was depending on atexit (rather than cleaning up as needed). That let us run two daemons at once (because zope.testing reinvoked), and then the pid for the first daemon was gone, so it couldn't be read back and killed when the first process actually finished.

This is now fixed, so the complex approach described in this patch shouldn't be needed.

Changed in launchpad-foundations:
status: Triaged → Won't Fix