start: Job is already running: anacron

Bug #606491 reported by François Marier
This bug affects 74 people
Affects           Status     Importance  Assigned to                 Milestone
anacron (Ubuntu)  Triaged    Medium      Canonical Foundations Team
    Nominated for Lucid by Alex
apt (Ubuntu)      Confirmed  Undecided   Unassigned
    Nominated for Lucid by Alex

Bug Description

Binary package hint: anacron

Every day cron sends me this email:

  Date: Sat, 17 Jul 2010 07:30:01 +1200
  From: Cron Daemon <root@hostname>
  To: root@hostname
  Subject: Cron <root@hostname> start -q anacron || :

  start: Job is already running: anacron

I've tried to find out why it was running twice, but I could only find one copy of anacron in the cron directories...

Tags: patch
click (daniel-zhelev) wrote :

Just to add some info, since the original report is very plain. My theory is that something stalls the scripts so that the weekly and daily cron runs overlap, which would at least explain why I'm getting this only on a weekly basis. Output:

root@wolfdale:~# uname -a
Linux wolfdale 2.6.32-24-server #42-Ubuntu SMP Fri Aug 20 15:38:55 UTC 2010 x86_64 GNU/Linux
root@wolfdale:~# date
Sun Sep 12 22:22:48 EEST 2010
root@wolfdale:~# cat /etc/crontab
# /etc/crontab: system-wide crontab
# Unlike any other crontab you don't have to run the `crontab'
# command to install the new version when you edit this file
# and files in /etc/cron.d. These files also have username fields,
# that none of the other crontabs do.

SHELL=/bin/sh
PATH=/usr/local/sbin:/usr/local/bin:/sbin:/bin:/usr/sbin:/usr/bin

# m h dom mon dow user command
17 * * * * root cd / && run-parts --report /etc/cron.hourly
25 6 * * * root test -x /usr/sbin/anacron || ( cd / && run-parts --report /etc/cron.daily )
47 6 * * 7 root test -x /usr/sbin/anacron || ( cd / && run-parts --report /etc/cron.weekly )
52 6 1 * * root test -x /usr/sbin/anacron || ( cd / && run-parts --report /etc/cron.monthly )
#

root@wolfdale:~# test -x /usr/sbin/anacron || ( cd / && time run-parts --report /etc/cron.daily )

##### MEANWHILE ######
root 19549 14357 0 22:25 pts/0 00:00:00 bash
root 19550 19549 0 22:25 pts/0 00:00:00 run-parts --report /etc/cron.daily
root 21795 19550 0 22:27 pts/0 00:00:00 /bin/sh /etc/cron.daily/apt
root 21816 21795 0 22:27 pts/0 00:00:00 sleep 1458

/etc/cron.daily/apt doesn't contain any "sleep 1458" command, so it must be coming from somewhere else. Unfortunately I'm unable
to track it down, since I'm not very familiar with this freak of nature called anacron. Could someone explain whether it has a debug mode or something? A suggested workaround is to move the weekly cron entry by one hour, to say:

47 7 * * 7 root test -x /usr/sbin/anacron || ( cd / && run-parts --report /etc/cron.weekly )

and the monthly one by another hour:

52 8 1 * * root test -x /usr/sbin/anacron || ( cd / && run-parts --report /etc/cron.monthly )

This isn't tested yet, but simple math shows that if the daily cron stalls for 24.3 minutes (1458 seconds), the weekly cron, which starts 22 minutes after the daily one,
will collide with it. I'll try that tonight or tomorrow.
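
The arithmetic can be checked with a few lines of shell. This is only a sketch: the start times are taken from the /etc/crontab shown above, 1458 is the "sleep" value seen in the process list, and the helper name minutes_since_midnight is made up for illustration.

```sh
#!/bin/sh
# Convert an hour/minute pair to minutes since midnight (hypothetical helper).
# Pass plain integers: a leading zero would be read as octal.
minutes_since_midnight() {
    echo $(( $1 * 60 + $2 ))
}

daily=$(minutes_since_midnight 6 25)     # cron.daily entry:  25 6 * * *
weekly=$(minutes_since_midnight 6 47)    # cron.weekly entry: 47 6 * * 7
gap_seconds=$(( (weekly - daily) * 60 )) # time between the two entries
sleep_seconds=1458                       # value seen in "sleep 1458"

if [ "$sleep_seconds" -gt "$gap_seconds" ]; then
    echo "overlap: daily is still sleeping when weekly starts"
else
    echo "no overlap"
fi
```

Note that both crontab entries are guarded by "test -x /usr/sbin/anacron ||", so this comparison only matters on a system where cron, not anacron, runs these directories.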

click (daniel-zhelev) wrote :

Just to add, tried to collide them manually - no luck

root@wolfdale:~# test -x /usr/sbin/anacron || ( cd / && time run-parts --report /etc/cron.daily )
real 30m35.078s
user 4m44.720s
sys 0m49.480s

root@wolfdale:~# test -x /usr/sbin/anacron || ( cd / && time run-parts -v --report /etc/cron.daily )
run-parts: executing /etc/cron.daily/00logwatch
run-parts: executing /etc/cron.daily/5snort
run-parts: executing /etc/cron.daily/aide
run-parts: executing /etc/cron.daily/apache2
run-parts: executing /etc/cron.daily/apport
run-parts: executing /etc/cron.daily/apt

Here the script stalls for a long delay.

One more piece of info:

root@wolfdale:/# cat /etc/cron.daily/* | grep 1458

Funny :)

Frank Schubert (f-schubert) wrote :

Hi click,

that *sleep* you see is from /etc/cron.daily/apt, I think:

# sleep for a random interval of time (default 30min)
# (some code taken from cron-apt, thanks)
random_sleep()
{
# [...]
    sleep $TIME
}
EOF
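
The excerpt elides how $TIME is chosen. As a rough illustration of how a random delay such as 1458 seconds can come out of a 30-minute ceiling, here is a minimal sketch; it is not the code from /etc/cron.daily/apt, and the function name pick_delay is an assumption.

```sh
#!/bin/sh
# Print a pseudo-random integer in [0, max) using /dev/urandom.
# Sketch only; the real apt cron script may do this differently.
pick_delay() {
    max=$1
    # Read two random bytes as one unsigned decimal integer (0..65535).
    n=$(od -An -N2 -tu2 /dev/urandom | tr -d ' ')
    echo $(( n % max ))
}

delay=$(pick_delay 1800)   # 1800 s = the 30 min default mentioned in the comment
echo "would sleep $delay seconds"
```

Because the value is picked at run time, grepping the scripts for the literal number finds nothing.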

When I get these mails there is an anacron process (ps aux |grep anacron) that is waiting:
*pstree* shows
     |-anacron---sh---run-parts---apt

So apt, run from the cron job, is the problem, *not* anacron:
root 23642 0.0 0.0 0 0 ? ZN Sep15 0:00 [apt] <defunct>

process state codes:
      Z Defunct ("zombie") process, terminated but not reaped by its parent.
      N low-priority (nice to other users)

I could not kill that zombie process directly, but after killing all the processes shown in that pstree output, the zombie went away.
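
A zombie has already exited, so signals have no effect on it; it disappears only when its parent reaps it or goes away. A small filter can list zombies together with their parent PIDs, which shows which process to look at. This is a sketch: the function name list_zombies is made up, and the parent PID in the sample data is invented (only 23642 comes from the ps line above).

```sh
#!/bin/sh
# Read "PID PPID STAT COMMAND" lines on stdin and print the zombies
# (STAT starting with Z) together with the parent that has not reaped them.
list_zombies() {
    awk '$3 ~ /^Z/ { print "zombie pid=" $1 " parent=" $2 " cmd=" $4 }'
}

# Sample data; on a live system, feed it something like
#   ps -eo pid=,ppid=,stat=,comm=
sample='23642 23640 ZN apt
23640 23630 SN run-parts'
printf '%s\n' "$sample" | list_zombies
```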

Ralph Corderoy (ralph-inputplus) wrote :

I've also just received a similar email from cron.

    Return-Path: <email address hidden>
    Delivery-Date: Tue Mar 29 07:30:01 2011
    Return-Path: <email address hidden>
    X-Original-To: root
    Delivered-To: <email address hidden>
    Received: by orac.inputplus.co.uk (Postfix, from userid 0)
            id 2BCF1327A6; Tue, 29 Mar 2011 07:30:01 +0100 (BST)
    From: <email address hidden> (Cron Daemon)
    To: <email address hidden>
    Subject: Cron <root@orac> start -q anacron || :
    Content-Type: text/plain; charset=ANSI_X3.4-1968
    X-Cron-Env: <SHELL=/bin/sh>
    X-Cron-Env: <PATH=/usr/local/sbin:/usr/local/bin:/sbin:/bin:/usr/sbin:/usr/bin>
    X-Cron-Env: <HOME=/root>
    X-Cron-Env: <LOGNAME=root>
    Message-Id: <email address hidden>
    Date: Tue, 29 Mar 2011 07:30:01 +0100 (BST)

    start: Job is already running: anacron

Note how my local time is 07:30:01, exactly the same as François
Marier's email when he opened the bug.

BTW, to the 24 people that said this bug was also affecting them, it
helps to change the status from New to Confirmed. :-) Until a bug
moves off of New there's often no work done on fixing it since it has
just been reported by one person.

Changed in anacron (Ubuntu):
status: New → Confirmed
Chris Osgood (ubuntu-functionalfuture) wrote :

Been seeing the same problem in Maverick lately. I believe it may have something to do with automatic upgrades. Do the other people experiencing this issue have automatic updates turned on?

When manually running apt-get update/upgrade, I have seen it hang indefinitely before, possibly due to network issues, and this may be what is happening when the system does the automatic updates. Maybe apt needs better network error handling or something (just a guess, without actually looking into this problem very closely).

François Marier (fmarier) wrote :

I also have automatic security updates turned on.

Ralph Corderoy (ralph-inputplus) wrote :

Just had another occurrence. Again, local time is 07:30:01.

    Return-Path: <email address hidden>
    Delivery-Date: Sat Apr 9 07:30:02 2011
    Return-Path: <email address hidden>
    X-Original-To: root
    Delivered-To: <email address hidden>
    Received: by orac.inputplus.co.uk (Postfix, from userid 0)
            id 0606532DE2; Sat, 9 Apr 2011 07:30:01 +0100 (BST)
    From: <email address hidden> (Cron Daemon)
    To: <email address hidden>
    Subject: Cron <root@orac> start -q anacron || :
    Content-Type: text/plain; charset=ANSI_X3.4-1968
    X-Cron-Env: <SHELL=/bin/sh>
    X-Cron-Env: <PATH=/usr/local/sbin:/usr/local/bin:/sbin:/bin:/usr/sbin:/usr/bin>
    X-Cron-Env: <HOME=/root>
    X-Cron-Env: <LOGNAME=root>
    Message-Id: <email address hidden>
    Date: Sat, 9 Apr 2011 07:30:01 +0100 (BST)

    start: Job is already running: anacron

And it's caused by cron running /etc/cron.d/anacron.

    # /etc/cron.d/anacron: crontab entries for the anacron package

    SHELL=/bin/sh
    PATH=/usr/local/sbin:/usr/local/bin:/sbin:/bin:/usr/sbin:/usr/bin

    #30 7 * * * root test -x /etc/init.d/anacron && /usr/sbin/invoke-rc.d anacron start >/dev/null
    30 7 * * * root start -q anacron || :

Could it be that in the past `start -q' did not generate output with -q
if anacron was already running?

Also, has anyone else with this problem increased the delay in
/etc/anacrontab so things don't run so soon after boot-up? The delays
are normally 5, 10, and 15, but mine are now

    # /etc/anacrontab: configuration file for anacron

    # See anacron(8) and anacrontab(5) for details.

    SHELL=/bin/sh
    PATH=/usr/local/sbin:/usr/local/bin:/sbin:/bin:/usr/sbin:/usr/bin

    # These replace cron's entries
    1 125 cron.daily nice run-parts --report /etc/cron.daily
    7 130 cron.weekly nice run-parts --report /etc/cron.weekly
    @monthly 135 cron.monthly nice run-parts --report /etc/cron.monthly

Ralph Corderoy (ralph-inputplus) wrote :

I think I get this email if I boot within roughly a couple of hours (due to my longer /etc/anacrontab delays) of 07:30 local time. Perhaps an anacron from boot time is sleeping for those delays and meanwhile, at 07:30, cron attempts to `start -q anacron', causing noise and the email. Others could test this by either booting shortly before 07:30 or, perhaps more easily :-), altering the 07:30 to now plus five minutes and re-booting.

Ralph Corderoy (ralph-inputplus) wrote :

Got another email this morning. Here are the relevant lines from
/var/log/syslog showing the overlap between the first anacron and the
second started by cron.

    Apr 16 07:29:40 orac cron[1097]: (CRON) INFO (pidfile fd = 3)
    Apr 16 07:29:40 orac anacron[1111]: Anacron 2.3 started on 2011-04-16
    Apr 16 07:29:40 orac cron[1159]: (CRON) STARTUP (fork ok)
    Apr 16 07:29:40 orac anacron[1111]: Will run job `cron.daily' in 125 min.
    Apr 16 07:29:40 orac anacron[1111]: Will run job `cron.weekly' in 130 min.
    Apr 16 07:29:40 orac anacron[1111]: Jobs will be executed sequentially
    Apr 16 07:29:41 orac cron[1159]: (CRON) INFO (Running @reboot jobs)
    Apr 16 07:30:01 orac CRON[1554]: (root) CMD (start -q anacron || :)
    Apr 16 08:17:01 orac CRON[2917]: (root) CMD ( cd / && run-parts --report /etc/cron.hourly)
    Apr 16 09:17:01 orac CRON[6188]: (root) CMD ( cd / && run-parts --report /etc/cron.hourly)
    Apr 16 09:34:40 orac anacron[1111]: Job `cron.daily' started
    Apr 16 09:34:40 orac anacron[7716]: Updated timestamp for job `cron.daily' to 2011-04-16

My longer initial delays for anacron show up the issue more by
increasing the window that it can occur in, but it's there with the
default delays too.
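
The window can be worked out from the log. The following is a minimal sketch, using minute resolution and ignoring runs that cross midnight; the function name is invented for illustration.

```sh
#!/bin/sh
# Succeed if the cron trigger time falls while anacron is still waiting
# out its start-up delay. Arguments are plain integers (no leading zeros,
# which shell arithmetic would read as octal):
#   start_hour start_minute delay_minutes trigger_hour trigger_minute
trigger_hits_waiting_anacron() {
    boot=$(( $1 * 60 + $2 ))       # minute of day when anacron started
    delay=$3                       # delay from /etc/anacrontab, in minutes
    trigger=$(( $4 * 60 + $5 ))    # minute of day of the cron.d entry
    [ "$trigger" -ge "$boot" ] && [ "$trigger" -lt $(( boot + delay )) ]
}

# Values from the syslog excerpt: anacron started at 07:29 and scheduled
# cron.daily 125 minutes later; the cron.d entry fires at 07:30.
if trigger_hits_waiting_anacron 7 29 125 7 30; then
    echo "07:30 falls inside the wait: expect the email"
fi
```

With the default delay of 5 minutes the same check only succeeds for a boot between 07:25 and 07:30, which matches the reports of the email appearing less often there.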

Chris Osgood (ubuntu-functionalfuture) wrote :

Had this happen again this morning and I checked to see if apt was stuck. Sure enough:

root 947 0.0 0.0 0 0 ? ZN Apr26 0:00 [apt] <defunct>

That process can not be killed either, not even with -9.

At least in my case I'm pretty sure this has to do with the automatic updates. For some reason apt gets wedged occasionally. Still running 10.10.

tdn (spam-thomasdamgaard) wrote :

I get this same error occasionally. I just tried running:
test -x /usr/sbin/anacron || ( cd / && time run-parts -v --report /etc/cron.daily )
as root.
It terminated with return code 0 and without any output.

mcguire (jonathand131-gmail) wrote :

Hi everybody,

I'm also affected by this bug.

I confirm that in my case the message is produced by the command found in /etc/cron.d/anacron:
in this file, there is a rule to start the anacron service every day at 7:30.

I don't understand what it is for, since this service keeps running in the background, so it doesn't need to be started again and again, and I see nothing stopping it.

Alden (jason-alden-benoit) wrote :

I don't have the bug anymore, but I have reinstalled and am now running Trisquel, which is also Ubuntu based. I think the issue first happened to me after I installed an email server, which of course you must have I guess. But I think it could be related to that, although I am far from an expert. Out of curiosity, has this been the same for you all as well, with the issue not happening until you installed an email server? Could it be doing something or adding to anacron?

Ralph Corderoy (ralph-inputplus) wrote :

Alden, no, it's nothing to do with an email server as such, although that may have made it more likely to happen on your system but then an FTP server or web server may have done so too. My comments #9 and #10 above try to explain why it's happening.

Sven Goldt (goldt) wrote :

I have had this problem for a week now as well and found the reason: the apt server in /etc/apt/sources.list was no longer working, so "apt update" hung forever.
Just find a new source server and it should work. On the other hand, apt should not block forever, which is a bug in "apt" and not in "anacron".

Simon Oosthoek (simon-margo) wrote :

I got this too, and after reading the comments, I think it's both (ana)cron and apt doing something wrong.

This may also be a security bug, since apt will keep a lock, and automatic updates (which are configured on my system) may not run while an apt process is stuck. The apt process is really dead, but the run-parts daily process is stuck. Killing that took care of the stuck apt child process.

Johannes Martin (jmartin-notamusica) wrote :

In my case, the problem happens because /etc/cron.daily/apt starts apt-key net-update, which ignores the apt proxy settings.

See https://bugs.launchpad.net/ubuntu/+source/apt/+bug/226780

Daniel Richard G. (skunk) wrote :

The following change should address this bug:

--- /etc/cron.d/anacron.orig 2010-06-20 04:11:29.000000000 -0400
+++ /etc/cron.d/anacron 2012-01-06 18:03:48.000000000 -0500
@@ -4,4 +4,4 @@
 PATH=/usr/local/sbin:/usr/local/bin:/sbin:/bin:/usr/sbin:/usr/bin

 #30 7 * * * root test -x /etc/init.d/anacron && /usr/sbin/invoke-rc.d anacron start >/dev/null
-30 7 * * * root start -q anacron || :
+30 7 * * * root start --quiet anacron 2>/dev/null || :

The cron-job command is written to disregard failure of the start(8) command, but it was not absorbing the error message that would be produced in such an instance.

(I also changed -q to --quiet, as the former is not documented in the initctl(8) man page.)

Of course, this does nothing for the apt bug that appears to be associated with this one.
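
The difference between the two cron lines can be reproduced without upstart. In this sketch, fake_start is a hypothetical stand-in for start(8) in the "already running" case; it only mimics the message and the non-zero exit status.

```sh
#!/bin/sh
# Stand-in for start(8) when the job is already running: message on
# stderr, non-zero exit status. Hypothetical stub, not the real command.
fake_start() {
    echo "start: Job is already running: anacron" >&2
    return 1
}

# Original form: "|| :" masks the exit status, but the message still
# reaches stderr, and cron mails any output a job produces.
fake_start || :
echo "exit status after '|| :' = $?"

# Patched form: stderr is discarded as well, so cron sees no output.
fake_start 2>/dev/null || :
```

The trade-off is that the redirection drops every error from the command, not only the "already running" one.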

Johannes Martin (jmartin-notamusica) wrote :

This patch will only cure the symptom (the email being sent), but not the real problem (anacron hanging). I don't think that's what we want.

Daniel Richard G. (skunk) wrote :

Anacron is hanging only because apt is hanging, however---that's a bug in the latter, not the former. All anacron is doing wrong is causing the e-mail to be sent.

Ralph Corderoy (ralph-inputplus) wrote :

Daniel, I disagree. If you read back through all the bug's comments you'll see that some of us are getting this email and it has nothing to do with apt, that's just one instance that can trigger overlapping anacrons. The solution isn't to discard the error messages. :-)

Daniel Richard G. (skunk) wrote :

The logic is already written with the intent of failing silently if anacron is already running; that's what the "|| :" bit is for. Anacron can be started from multiple places (various cron entries and pm-utils), so it has to address the possibility of collisions from two instances starting at roughly the same time. (I'm only getting the e-mail infrequently, so this is what is probably happening in my case.)

If you find the e-mail messages useful for diagnosing hung jobs, then you'll need to file a separate feature request. For one, anacron would need better logic to determine when collisions are a result of those rather than two instances starting in close succession.

Mark Fraser (launchpad-mfraz) wrote :

I only ever receive an email if I turn the computer on before 7.30AM.

latimerio (fomember) wrote :

I don't see why there is so much discussion about a workaround.
In my view, if there is a flag -q or --quiet, there should be no need to redirect output to /dev/null; anacron should really be quiet. Full stop.
So I think it is a bug if anacron -q prints a message about not starting a second instance, which is not even really an error but more of a warning.
And if a --quiet option makes any sense at all, it should be quiet about warnings.

Daniel Richard G. (skunk) wrote :

That's a point, but AFAICS --quiet means that start(8) doesn't print the "blah start/running" message that it would normally give. Error messages are unaffected.

Johannes Martin (jmartin-notamusica) wrote :

The description of the --quiet option says: "Reduces output of all commands to errors only."

Anacron not starting a second instance because another anacron is already running IS an error, as we expect the start command to successfully start anacron. We expect anacron to do the periodic jobs reliably. And if some job is hung (as in the case of apt-key in my case) we need some notification about the error.

I see a couple of solutions:
- Add an option to anacron that makes it check when it was last run and only run again if a configurable amount of time has passed since the last run. If the time has not passed, silently quit.
- Make anacron execute all jobs in background and quit once all jobs have been started. Would need some helper process to record and report output of the jobs via mail.

Daniel Richard G. (skunk) wrote :

Johannes, --quiet is only applicable to start(8), not to anacron. The solutions you're proposing are feature requests; please post a separate bug report for those.

I agree that having some way of being notified about hung jobs would be nice, but that's beyond the scope of this bug report.

Ralph Corderoy (ralph-inputplus) wrote : Re: [Bug 606491] Re: start: Job is already running: anacron

To discard all errors with redirection seems too severe. As I wondered
above, has start's behaviour changed WRT reporting foo is already
running when it's been given --quiet? Perhaps what's needed is a start
option that tells it to consider an already running foo to be OK.
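
Short of a new start option, a middle ground between the two positions would be to drop only that one message and let every other error through. This is a sketch, not a tested replacement for the cron.d entry; drop_already_running is a made-up name and the second sample line is an invented error for illustration.

```sh
#!/bin/sh
# Remove only the "already running" line from stdin; pass everything else.
# grep -v exits non-zero when it prints nothing, hence the "|| :".
drop_already_running() {
    grep -v 'Job is already running' || :
}

# Intended use in the cron entry (not run here, needs upstart):
#   start -q anacron 2>&1 | drop_already_running

# Demonstration with canned input:
printf '%s\n' 'start: Job is already running: anacron' | drop_already_running
printf '%s\n' 'start: some other error' | drop_already_running
```

Matching on the message text is fragile if the wording ever changes, and the pipe merges stderr into stdout, so this remains a workaround rather than a fix.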

Johannes Martin (jmartin-notamusica) wrote :

Daniel: I'm perfectly happy with the situation as it is now. I got an email notifying me about a problem which eventually pointed me to the cause of the problem. If we changed anything in the way start or anacron respond by default to duplicate instances of processes, we may break other things and make things worse.

As to my proposed solution alternatives: I don't really see the sense of creating separate bug reports for those. They should be discussed to see which one makes the most sense in the context of this bug report. Once the discussion is complete, then we might create a feature request.

Ralph Corderoy (ralph-inputplus) wrote :

Johannes, I agree some of the emails point to an underlying problem that
needs fixing, but some are due to permissible changes, e.g. lengthening
the amount of time after boot before /etc/cron.daily, etc., are kicked
off; see /etc/anacrontab at the end of comment #7. To get a daily
email because of that seems wrong.

Malcolm Scott (malcscott) wrote :

This bug is more serious than I thought. It appears that because of the aforementioned apt bug (it sometimes never exits when invoked from one of the cron.daily scripts), anacron has not run any jobs on my system for the past month.

This is not a good failure mode. It would be much safer for anacron to be allowed to start on subsequent days (perhaps killing it if it is still going 24 hours later) to minimise the number of jobs missed. For example, this seems to be an improvement:

--- cron.d_anacron.orig 2012-03-01 01:31:33.627985854 +0000
+++ /etc/cron.d/anacron 2012-03-01 01:31:42.903921823 +0000
@@ -4,4 +4,4 @@
 PATH=/usr/local/sbin:/usr/local/bin:/sbin:/bin:/usr/sbin:/usr/bin

 #30 7 * * * root test -x /etc/init.d/anacron && /usr/sbin/invoke-rc.d anacron start >/dev/null
-30 7 * * * root start -q anacron || :
+30 7 * * * root start -q anacron || restart -q anacron || :

That doesn't prevent the "job is already running" email being sent out after a job has frozen, but it will be sent once per stuck job and then will allow anacron to recover, rather than being sent daily until the stuck job is manually killed.

Of course, the underlying apt bug should be fixed too, but this will allow anacron to recover from this and other bugs.

Changed in anacron (Ubuntu):
status: Confirmed → Triaged
importance: Undecided → Medium
Steve Langasek (vorlon)
Changed in anacron (Ubuntu):
assignee: nobody → Canonical Foundations Team (canonical-foundations)
Anil Gangolli (anil-busybuddha) wrote :

Since this can affect automatic security update checks when it strikes, it is more than just a nuisance.

Steffen Röcker (sroecker) wrote :

I get these emails from precise machines, most of the time directly after the installation.
It seems that upstart has trouble sending to dbus and 'start anacron' hangs forever; see the attached strace.

Kevin McCormick (kmccormick) wrote :

It seems like I may be in the minority here, but I am not experiencing any problems with APT or other things hanging.

My problem is simply that I boot my machine slightly before 7:30 local time, and anacron is still running as expected when cron tries to start it. I think the proper way to fix this is to have cron not directly start anacron, but rather emit an event. That way upstart automatically handles starting anacron if it is not already running, and ignores the event if it is. I've attached a patch to do so.

Kevin McCormick (kmccormick) wrote :

This patch addresses the needs of people having problems with anacron hanging.

For the issues with hanging, and causing a notification if anacron is still running long after it should be, I think a separate process is in order. I could not find any way within upstart to do this, so I wrote a short script to launch from cron daily and squawk if anacron has been running >12 hours.

I've tried to make my script as portable as possible, but I'm not sure how portable my ps(1) and date(1) syntax is, or on how many platforms anacron is used.
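
The attached script is not reproduced in this thread. As a rough idea of what such a check involves, the sketch below converts the elapsed-time format printed by ps (the "etime" column, [[dd-]hh:]mm:ss) into seconds and compares it with a 12-hour limit. The function name and the sample value are assumptions, not taken from the attachment.

```sh
#!/bin/sh
# Convert a ps "etime" value ([[dd-]hh:]mm:ss) to seconds.
etime_to_seconds() {
    t=$1
    days=0
    case $t in
        *-*) days=${t%%-*}; t=${t#*-} ;;
    esac
    # Split the remaining [hh:]mm:ss on colons.
    old_ifs=$IFS; IFS=:
    set -- $t
    IFS=$old_ifs
    if [ $# -eq 3 ]; then h=$1; m=$2; s=$3; else h=0; m=$1; s=$2; fi
    # Strip one leading zero so that "08" is not read as octal.
    days=${days#0}; h=${h#0}; m=${m#0}; s=${s#0}
    echo $(( ${days:-0} * 86400 + ${h:-0} * 3600 + ${m:-0} * 60 + ${s:-0} ))
}

# On a live system the value would come from something like
#   ps -o etime= -p "$(pgrep -o -x anacron)"
age=$(etime_to_seconds 13:05:09)   # sample value for illustration
limit=$(( 12 * 3600 ))
if [ "$age" -gt "$limit" ]; then
    echo "anacron has been running for $age s, more than 12 hours"
fi
```

Parsing etime avoids relying on newer ps columns, which is one way to address the portability concern raised above.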

Ubuntu Foundations Team Bug Bot (crichton) wrote :

The attachment "Fix only the issue of warnings from cron when anacron is already running" of this bug report has been identified as being a patch. The ubuntu-reviewers team has been subscribed to the bug report so that they can review the patch. In the event that this is in fact not a patch you can resolve this situation by removing the tag 'patch' from the bug report and editing the attachment so that it is not flagged as a patch. Additionally, if you are member of the ubuntu-reviewers team please also unsubscribe the team from this bug report.

[This is an automated message performed by a Launchpad user owned by Brian Murray. Please contact him regarding any issues with the action taken in this bug report.]

tags: added: patch
Stéphane Gourichon (stephane-gourichon-lpad) wrote :

It's June 2014 and this bug still hits the latest LTS release, Trusty Tahr.

# Why we are here

Some people say "fix the noise" and others say "it's not noise".
What's needed is to fix the noise and keep the signal.

Has anyone used Kevin's patches (see comments #34 and #35) ?

My guess is that many people have made local changes to their systems and not bothered afterwards, so the bug still occurs in the latest LTS release.

I'm patching my system now but it's important to fix this.

# Why this is important

Reason 1: Noise in mailboxes is a more serious issue than it appears. Each uninformative message received from a system makes it more probable that important messages will get missed.

Reason 2: The current situation produces the same message in normal operation (anacron instances overlap) and in a dangerous situation (security updates not applied for extended periods of time, see comment #31).

Please raise this bug's "importance".

To have the bug fixed once and for all, let's clear up the situation.

# Why this is a bug in anacron and not only apt or whatever.

Normal operations:

* (1) Long job duration in normal operation.
* (2) Asynchronous uncontrollable job start times making overlaps happen in normal operation (e.g. daily).
* (3) Overlaps are reported by e-mail.

Any issue arising from (1), (2) or (3) only is a bug in anacron.

Abnormal operations:

* (4) Some jobs get stuck forever (abnormal operation).
* (5) Stuck jobs block other jobs for a possibly unlimited time.

Any issue arising from (4) or (5) may be anacron bugs or wishes to make anacron more robust, just like we generally expect our system to robustly stop buggy programs without crashing the computer.

## Normal operations

(1) Some jobs are designed to wait for a long time (up to half an hour) even when everything is fine, for example /etc/cron.daily/apt. But some people don't see it because of their configuration. My fresh 14.04 Trusty has the package update-notifier-common installed, which seems sufficient to trigger a sleep of up to half an hour on every invocation of that job. This is normal operation.
(2) anacron is set up to run on several occasions to minimize delay. For example, on top of running daily at 07:30, it also runs at boot and on resume from suspend. Nothing prevents a boot or resume minutes before 07:30, an interval shorter than the duration of normal jobs.
(3) (nothing more to say)

(1)+(2) makes overlaps part of normal operation.
(1)+(2)+(3) makes *noise* in mailbox about overlaps.

(1)+(2)+(3) makes this bug an *actual anacron bug*.

## Abnormal operations

(4) Some jobs break and get stuck forever. This should not happen on ordinary users' machines. Sysadmins who care a little wish to be notified. Heavyweight sysadmins already have other ways to get notified and/or get jobs killed automatically.
(5) Prevented jobs may include security updates, which make it a serious issue (see comment #31).

(1)+(3)+(4) makes *signal* in mailbox about stuck jobs, but which looks like noise
(5) prevented jobs make the whole issue serious.

# What to do ?

Now we know where's the anacron bug and where's the feature wish.

## Raise bug importance

Noise in mailboxes is a more s...


Steve Langasek (vorlon) wrote :

gouri,

On Thu, Jun 05, 2014 at 09:02:28AM -0000, gouri wrote:
> It's June 2014 and this bug still hits the latest LTS release Trusty
> Tahr.

Under what circumstances are you seeing this problem? Is it a regular occurrence for you due to something like comment #34 (booting just before 7:30 every day), or an issue with a previous instance of anacron being hung?

I don't think that we want to "solve" this by suppressing the mails from anacron. anacron itself should be made more resilient to hung jobs, and we should fix the bugs currently causing the apt cron job to hang. If we address both of these points, the symptom of receiving mails about already-running anacron jobs should solve itself.

Daniel Richard G. (skunk) wrote :

Steve,

If anacron sending out e-mail under such circumstances is a desired behavior, could you at least make the message more intelligible, with a mention of the likelihood that a previously-started cron job has hung?

Even nicer would be some shell magic that greps the process table for children of the existing anacron process, so that the message can actually name what specific job is at fault---and users can file bugs against the appropriate package(s) instead of coming here.
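
As a rough sketch of that idea, the filter below takes "PID PPID COMMAND" lines and prints every descendant of a given PID. The function name is made up, and the sample PIDs are invented apart from 1111, which is the anacron PID from the syslog excerpt earlier in this report; the process chain mirrors the pstree output quoted above.

```sh
#!/bin/sh
# Print "PID COMMAND" for every descendant of the given root PID.
# Input on stdin: one "PID PPID COMMAND" line per process
# (command names containing spaces are not handled).
descendants_of() {
    root=$1
    awk -v root="$root" '
        { ppid[$1] = $2; cmd[$1] = $3; order[NR] = $1 }
        END {
            for (i = 1; i <= NR; i++) {
                pid = order[i]
                # Walk up the parent chain; report the process if we reach root.
                p = ppid[pid]; hops = 0
                while (p != "" && p != root && hops++ < NR) p = ppid[p]
                if (p == root) print pid, cmd[pid]
            }
        }'
}

# Sample process table; on a live system use something like
#   ps -eo pid=,ppid=,comm= | descendants_of "$(pgrep -o -x anacron)"
sample='1111 1 anacron
2001 1111 sh
2002 2001 run-parts
2003 2002 apt
3000 1 cron'
printf '%s\n' "$sample" | descendants_of 1111
```

The last line of the output names the job that anacron is currently waiting on, which is the package a bug would need to be filed against.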

Steve Langasek (vorlon) wrote :

On Tue, Jun 10, 2014 at 02:42:23AM -0000, Daniel Richard G. wrote:
> If anacron sending out e-mail under such circumstances is a desired
> behavior, could you at least make the message more intelligible, with a
> mention of the likelihood that a previously-started cron job has hung?

> Even nicer would be some shell magic that greps the process table for
> children of the existing anacron process, so that the message can
> actually name what specific job is at fault---and users can file bugs
> against the appropriate package(s) instead of coming here.

I wouldn't block someone from making such a change, but again, this is
fixing secondary symptoms instead of the real bug. anacron should not
block indefinitely and fail to run again due to misbehaving cron jobs; and
the misbehaving cronjob that we know about should be fixed so that it
doesn't misbehave. Making the error mails clearer is, to put it bluntly,
polishing a turd.

Daniel Richard G. (skunk) wrote :

Steve,

Anacron sending out a cryptic e-mail due to a hung cron job is a primary bug in and of itself. Either don't send out an e-mail at all, or send an e-mail that doesn't confuse people while leaving them no avenue for follow-up. You have enough deep knowledge of the system to intuit the underlying problem from that message. Most users don't---and to put it bluntly, you're not Ubuntu's target audience.

I agree that the cron job should be fixed so that it doesn't misbehave. But since that's not going to happen overnight, and hung cron jobs are a thing, and anacron is part of a default Ubuntu install, and Ubuntu purports to be user-friendly, we need a fix here as well.

Stéphane Gourichon (stephane-gourichon-lpad) wrote :

@Steve (comment #38):

> Under what circumstances are you seeing this problem? Is it a regular occurrence for you due to something like comment #34 (booting just before 7:30 every day), or an issue with a previous instance of anacron being hung?

It's all written in my comment #37.

I had it every time I booted or resumed my system before 07:30.

I don't have any hung cron jobs, plus this is a newly installed system, no hacks or random rogue cron jobs. I do have a long-"running" (actually waiting) cron job which is just Ubuntu's standard apt job waiting randomly for up to half an hour to level out mirrors' bandwidth. Nothing special here.

After using Kevin's patch the noise in mailbox is solved.

@Daniel and @Steve,

Reading what you write, I think we all agree and are just saying the same things with our own words until they are clear enough for the bug to be fixed.

Please read my comment #37 thoroughly from beginning to end and tell me what you think. Here are the sections names:

# Why we are here
# Why this is important
# Why this is a bug in anacron and not only apt or whatever.
## Normal operations
## Abnormal operations
# What to do ?
## Raise bug importance
## Fix anacron bug: disable noise mail that reports overlap because it's really noise.
## Grant anacron's wish: get mail about *stuck jobs* (not overlap) because that's what sysadmins really need.
## Grant another anacron's wish: be more resilient to stuck jobs

Perhaps we should actually make different launchpad entries for the actual bug and for the two wishes.
That way, every entry(bug/wish) will have a clear focus and not call for discussions like "don't suppress important mail" vs. "but this is just noise".

What do you think ?

Stéphane Gourichon (stephane-gourichon-lpad) wrote :

I said I observed the bug on 14.04; it also happens on 12.04.
By "the bug" I mean: receiving the e-mail merely because you start or resume the system before 07:30.

Can someone formally mark that the bug also happens on "Precise" and "Trusty"?
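
To make the 07:30 behaviour concrete: cron mails whatever the command prints, and "|| :" only hides the exit status, not the message from "start". Below is a minimal sketch of a guard that stays silent when the job is already started. It is NOT Kevin's patch and not anything shipped by Ubuntu; the function name and the STATUS_CMD/START_CMD variables are hypothetical, introduced only so the logic can be tried without Upstart. It assumes "status" prints "<job> start/running, process <pid>" for a live job.

```shell
#!/bin/sh
# Hypothetical guard for the 07:30 cron entry (illustration only).
# STATUS_CMD and START_CMD are made-up override points, not Upstart features.
: "${STATUS_CMD:=status anacron}"
: "${START_CMD:=start -q anacron}"

start_anacron_quietly() {
    # Assumption: a started job is reported with the goal "start/",
    # e.g. "anacron start/running, process 1234".
    case "$($STATUS_CMD 2>/dev/null)" in
        *start/*)
            # Already started: print nothing, so cron has nothing to mail.
            return 0
            ;;
    esac
    # Not started: start it. Genuine errors still reach stderr and the mailbox.
    $START_CMD
}
```

The check and the start are two separate steps, so a small race remains; this only removes the overlap noise and does nothing about stuck jobs.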

Revision history for this message
Launchpad Janitor (janitor) wrote :

Status changed to 'Confirmed' because the bug affects multiple users.

Changed in apt (Ubuntu):
status: New → Confirmed
Revision history for this message
Randy Skretka (rskret) wrote :

This is a long-lived bug and it is quite disruptive. It is affecting me on:
$ . /etc/os-release
$ echo $VERSION
14.04, Trusty Tahr

Revision history for this message
gaara (yoggic) wrote :

This bug affects me on Kubuntu 14.04 x64.
I woke my machine from suspend at 7:30.

from "root" <email address hidden>
to root@unspecified-domain
date 15/08/14 07:30
subject Cron <root@my-host> start -q anacron || :
start: Job is already running: anacron

Revision history for this message
Stefan Pappalardo (sjuk) wrote :

It affects me too (Mythbuntu trusty x64).

Description: Ubuntu 14.04.2 LTS
Release: 14.04

anacron:
  Installed: 2.3-20ubuntu1
  Candidate: 2.3-20ubuntu1
  Version table:
 *** 2.3-20ubuntu1 0
        500 http://archive.ubuntu.com/ubuntu/ trusty/main amd64 Packages
        100 /var/lib/dpkg/status

subject: Cron <root@obelix> start -q anacron || :
body: start: Job is already running: anacron

Revision history for this message
Paul Tomblin (ptomblin) wrote :

I got the email last time I rebooted:

Delivered-To: <email address hidden>
Received: by 10.70.54.225 with SMTP id m1csp509715pdp;
        Thu, 21 May 2015 04:30:03 -0700 (PDT)
X-Received: by 10.140.19.108 with SMTP id 99mr2819447qgg.56.1432207802871;
        Thu, 21 May 2015 04:30:02 -0700 (PDT)
Return-Path: <email address hidden>
Received: from linode.xcski.com (linode.xcski.com. [69.164.214.240])
        by mx.google.com with ESMTP id z9si20613180qcn.27.2015.05.21.04.30.02
        for <email address hidden>;
        Thu, 21 May 2015 04:30:02 -0700 (PDT)
Received-SPF: pass (google.com: best guess record for domain of <email address hidden> designates 69.164.214.240 as permitted sender) client-ip=69.164.214.240;
Authentication-Results: mx.google.com;
       spf=pass (google.com: best guess record for domain of <email address hidden> designates 69.164.214.240 as permitted sender) <email address hidden>
Received: from allhats2.xcski.com (localhost [127.0.0.1])
 by linode.xcski.com (Postfix) with ESMTP id 2030257A002
 for <email address hidden>; Thu, 21 May 2015 07:30:02 -0400 (EDT)
Received: by allhats2.xcski.com (Golgafrincham B Ark, from userid 1000)
 id E71D12588; Thu, 21 May 2015 07:30:01 -0400 (EDT)
X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on allhats2.xcski.com
X-Spam-Level:
X-Spam-Status: No, score=-1.9 required=5.0 tests=BAYES_00,NO_RELAYS
 autolearn=ham autolearn_force=no version=3.4.0
X-Original-To: root
Delivered-To: <email address hidden>
Received: by allhats2.xcski.com (Golgafrincham B Ark, from userid 0)
 id BD1072588; Thu, 21 May 2015 07:30:01 -0400 (EDT)
From: <email address hidden> (Cron Daemon)
To: <email address hidden>
Subject: Cron <root@allhats2> start -q anacron || :
Content-Type: text/plain; charset=ANSI_X3.4-1968
X-Cron-Env: <SHELL=/bin/sh>
X-Cron-Env: <PATH=/usr/local/sbin:/usr/local/bin:/sbin:/bin:/usr/sbin:/usr/bin>
X-Cron-Env: <HOME=/root>
X-Cron-Env: <LOGNAME=root>
Message-Id: <email address hidden>
Date: Thu, 21 May 2015 07:30:01 -0400 (EDT)

start: Job is already running: anacron

Also, anacron doesn't appear to be running:

root@allhats2:/etc/init# status anacron
anacron stop/waiting
root@allhats2:/etc/init# ps auwwx |grep anacron
root 21700 0.0 0.0 11740 916 pts/20 S+ 08:31 0:00 grep anacron
root@allhats2:/etc/init# start anacron
anacron stop/waiting

Should it be?

Revision history for this message
Stuart Rackham (srackham) wrote :

The default /etc/anacrontab only executes cron jobs; doesn't this make /etc/cron.d/anacron unnecessary?

Boot time execution of anacron (/etc/init.d/anacron) takes care of jobs missed while the PC was off and cron takes care of things while the PC is on.

Revision history for this message
Daniel Richard G. (skunk) wrote :

Hi Stuart,

Note that Anacron is not a daemon; it needs to be executed at boot time and intermittently thereafter (via that cron.d script).

It doesn't work to have Anacron run only at boot time and Cron thereafter, because Anacron maintains state in /var/spool/anacron/ that needs to be updated each time it runs. If you look at /etc/crontab, you'll see that Cron does relatively little when Anacron is installed.
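
If you want to see that state for yourself, here is a rough sketch that reports how many days ago a job's timestamp was written. It assumes each file under /var/spool/anacron/ is named after the job identifier (e.g. cron.daily) and holds a single YYYYMMDD date, and that GNU date is available. The function name and the SPOOL_DIR variable are mine, purely for illustration.

```shell
#!/bin/sh
# Sketch: how many days since anacron last recorded a run of a job.
# Assumes timestamp files contain one YYYYMMDD date and GNU date is installed.
SPOOL_DIR="${SPOOL_DIR:-/var/spool/anacron}"

days_since_last_run() {
    job="$1"                        # job identifier, e.g. cron.daily
    today="${2:-$(date +%Y%m%d)}"   # second argument overrides "today"
    stamp_file="$SPOOL_DIR/$job"

    # No readable timestamp means anacron has no record of this job.
    [ -r "$stamp_file" ] || { echo "never"; return 0; }

    last="$(cat "$stamp_file")"
    # Convert both dates to epoch seconds at midnight UTC.
    last_s="$(date -u -d "$last" +%s)" || return 1
    today_s="$(date -u -d "$today" +%s)" || return 1
    echo $(( (today_s - last_s) / 86400 ))
}
```

For example, "days_since_last_run cron.daily" prints the age in days of the daily timestamp. Anacron compares that age against the period in the first column of /etc/anacrontab to decide whether the job is due.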
