Init script causing unclean DB shutdown

Bug #92139 reported by Mark McDonald
Affects: openldap2.3 (Ubuntu)
Status: Fix Released
Importance: Low
Assigned to: Unassigned

Bug Description

slapd 2.3.30-2

To shut down slapd, the init script (/etc/init.d/slapd) calls start-stop-daemon with "--retry 10". This sends the slapd process a SIGTERM, waits 10 seconds and then sends a SIGKILL, forcibly killing the process.

The problem is that under certain configurations (mine used back-hdb and a long checkpoint interval of 60 minutes), slapd takes longer than 10 seconds to shut down cleanly. Terminating it with SIGKILL while it is flushing data to disk causes massive data corruption in the backend. In my case, after the next startup, searches for specific DNs returned completely different entries, and multiple entries were duplicated under the same DN. Examples of the database corruption can be provided, but I don't feel they are necessary for this bug report.

This can be worked around by using more frequent checkpoints and/or faster disk hardware, but it would obviously be better to provide a safer shutdown mechanism. I have provided the following patch, which changes the behaviour of start-stop-daemon to send the SIGTERM, wait 10 seconds and then exit with status 2 if slapd is still running. Note that the default --quiet suppresses any error messages, despite the script's intentions.

--- openldap2.3-2.3.30/debian/slapd.init 2007-03-14 13:19:14.757418000 +0900
+++ openldap2.3-2.3.30/debian/slapd.init.new 2007-03-14 14:11:15.349299000 +0900
@@ -155,7 +155,7 @@
 # $reason.
 stop_slapd() {
        echo -n " slapd"
- reason="`start-stop-daemon --stop --quiet --oknodo --retry 10 \
+ reason="`start-stop-daemon --stop --quiet --oknodo --retry TERM/10 \
                --pidfile "$SLAPD_PIDFILE" \
                --exec /usr/sbin/slapd 2>&1`"
 }
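
For readers unfamiliar with retry schedules, the behaviour the patch aims for can be approximated in plain shell. This is only a rough sketch, not start-stop-daemon's actual implementation: term_and_wait is a hypothetical helper name, and the one-second polling interval is an assumption chosen for simplicity.

```shell
#!/bin/sh
# Rough sketch of what "--retry TERM/10" asks for: send SIGTERM, poll
# for up to a given number of seconds, and report failure instead of
# escalating to SIGKILL.
# term_and_wait is a hypothetical helper, not part of start-stop-daemon.
term_and_wait() {
    pid=$1
    timeout=$2

    # Nothing to do if the process is already gone (similar to --oknodo).
    kill -0 "$pid" 2>/dev/null || return 0

    kill -TERM "$pid" 2>/dev/null

    waited=0
    while [ "$waited" -lt "$timeout" ]; do
        # kill -0 sends no signal; it only tests whether the PID exists.
        kill -0 "$pid" 2>/dev/null || return 0
        sleep 1
        waited=$((waited + 1))
    done

    # One last check after the final sleep.
    kill -0 "$pid" 2>/dev/null || return 0

    # Still running: leave it alone and return 2, never SIGKILL.
    return 2
}
```

A caller could then test for a return value of 2 and print a warning that slapd is still shutting down, rather than silently killing it part-way through a flush.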

If I have omitted any detail, please respond and I shall provide whatever is needed.

Mark McDonald (mmcdonald-staff) wrote :

Attaching the patch as a file

Mathias Gug (mathiaz)
Changed in openldap2.3:
importance: Undecided → Low
status: New → Triaged
Steve Langasek (vorlon)
Changed in openldap2.3:
status: Triaged → Fix Committed
Steve Langasek (vorlon) wrote :

This bug has been fixed in hardy. Changelog:

 openldap2.3 (2.4.7-5) unstable; urgency=low
 .
   [ Updated debconf translations ]
   * Finnish, thanks to Esko Arajärvi <email address hidden>. Closes: #462688.
   * Galician, thanks to Jacobo Tarrio <email address hidden>. Closes: #462987.
   * French, thanks to Christian Perrier <email address hidden>.
     Closes: #463149.
   * Russian, thanks to Yuri Kozlov <email address hidden>. Closes: #463442.
   * Czech, thanks to Miroslav Kure <email address hidden>. Closes: #463472.
   * German, thanks to Helge Kreutzmann <email address hidden>.
     Closes: #464718.
 .
   [ Steve Langasek ]
   * Fix various regressions related to the introduction of GnuTLS:
     - Add new patch, gnutls-ciphers, to fix support for specifying multiple
       ciphers with TLSCipherSuite option in slapd.conf. Thanks to Kyle
       Moffett <email address hidden> for the patch. Closes LP: #188200.
     - Add new patch, slapd-tlsverifyclient-default, to set the intended
       default value of "TLSVerifyClient never" in the right place.
     - Add new patch, gnutls-altname-nulterminated, to account for differences
       in how the "length" is returned for commonName vs. subjectAltName.
     - Comment out TLSCipherSuite settings on upgrade from all versions prior
       to 2.4.7-5, and throw a debconf error to the user notifying them of
       this, since all OpenSSL cipher suite values are incompatible with
       GnuTLS.
     Closes: #462588.
   * Add new patch from upstream, entryCSN-backwards-compatibility, to support
     auto-converting entryCSN attributes in a previously supported old format,
     fixing an upgrade failure. Closes: #462099.
   * Use --retry TERM/10 instead of --retry 10 when stopping slapd, since the
     latter resorts to a SIGKILL and may corrupt backend data; whereas the
     former will exit non-zero if slapd is still running but won't directly
     cause data-loss. Thanks to Mark McDonald for the patch. LP: #92139.
   * Fix manpage symlinks in libldap2-dev; thanks to Reuben Thomas for
     reporting. Closes: #463971.
   * Fix a superfluous space in the debconf templates, due to a trailing space
     in the templates. Closes: #464719.

Changed in openldap2.3:
status: Fix Committed → Fix Released