race conditions in cron

Bug #891170 reported by Andrew Nicols
6
This bug affects 1 person
Affects Status Importance Assigned to Milestone
Mahara
Fix Released
Medium
Richard Mansfield

Bug Description

Thanks to tim hunt at the OU for pointing the first issue out:

It's possible for cron_lock to fail to get the $started value if the row has been deleted between attempting to insert the record, and attempting to retrieve the start date
http://gitorious.org/mahara/mahara/blobs/master/htdocs/lib/cron.php#line488

Additionally, when we try to restart a job which has been running for > 24 hours, we should probably call cron_lock() rather than insert_record()
http://gitorious.org/mahara/mahara/blobs/master/htdocs/lib/cron.php#line498

That said, we're very unlikely to ever hit this given the use of cron.

Tags: cron
Revision history for this message
François Marier (fmarier) wrote :

Thanks for that Andrew and Tim.

I reckon we should target this for 1.5 just because it's the kind of bug that's really painful to debug when it happens in the wild.

Changed in mahara:
status: New → Triaged
milestone: none → 1.5.0
importance: Low → Medium
Revision history for this message
Richard Mansfield (richard-mansfield) wrote :

Andrew, can you have a look at https://reviews.mahara.org/#change,881 - I think it's enough to just remove the stale lock and exit there instead of trying to restart the function.

Changed in mahara:
status: Triaged → In Progress
assignee: nobody → Richard Mansfield (richard-mansfield)
Revision history for this message
Mahara Bot (dev-mahara) wrote : A change has been merged

Reviewed: https://reviews.mahara.org/881
Committed: http://gitorious.org/mahara/mahara/commit/93639aacf1423a9eabdaf06216912160a8f07494
Submitter: Andrew Robert Nicols (<email address hidden>)
Branch: master

commit 93639aacf1423a9eabdaf06216912160a8f07494
Author: Richard Mansfield <email address hidden>
Date: Thu Nov 24 16:41:59 2011 +1300

    Fix race condition in cron (bug #891170)

    When cron finds a function with a stale (day-old) lock, just remove
    the lock and give up rather than trying to execute the function again.
    It's already pretty late, and waiting another minute for the next cron
    to start won't make a difference.

    This should avoid the case where two copies of cron both find the stale
    lock at the same time, and restart simultaneously. It won't matter if
    cron_lock fails to get a $started value because the row has been
    deleted: the next instance of cron to run will be able to insert the
    row.

    Change-Id: I025aaf89d64f47466f1ba4c5bb8178317277ec2c
    Signed-off-by: Richard Mansfield <email address hidden>

Changed in mahara:
status: In Progress → Fix Committed
Melissa Draper (melissa)
Changed in mahara:
status: Fix Committed → Fix Released
To post a comment you must log in.
This report contains Public information  
Everyone can see this information.

Other bug subscribers

Remote bug watches

Bug watches keep track of this bug in other bug trackers.