stale lock prevents apport runs

Bug #137567 reported by C de-Avillez on 2007-09-05

Affects          Status   Importance  Assigned to  Milestone
apport (Ubuntu)  Invalid  Undecided   Unassigned

Bug Description

Binary package hint: apport

Today I succeeded in crashing Evolution-Data-Server, and a crash report was created in /var/crash. Nevertheless, apport-retrace refused to retrace this crash, stating that a required keyword was missing from the report.

Looking at /var/log/apport I see an entry for the crash with this message: "another apport instance is already running, aborting", so this is the reason for the missing keyword, I guess.

The problems:

1. there was *no* other instance of apport running at the time of the crash;
2. the /var/crash/.lock lock file was dated Sep 5th 2007. Since then this machine has been rebooted several times.

So... stale lock. Since the .lock file was owned by root, it might well be that my apport run (being run as myself) did not have the necessary privilege to delete a file owned by root. I do not know, have not had time to look at the code.

Although I do understand the need to throttle simultaneous apport runs, I also think some sort of cleanup should be implemented; this cleanup would have to take into consideration that the lock file may be owned by another user.

C de-Avillez (hggdh2) wrote on 2007-09-07:

#1

bah, the lock file was not dated Sep 5th... it was Aug 5th. Sorry.

Daniel Hahler (blueyed) wrote on 2007-10-15:

#2

I have the same issue. It seems that an X server crash left the stale lock file around. (I'm using Option "NoTrapSignals" "true" in xorg.conf, so apport gets to know about X server crashes.)

Changed in apport:
status:	New → Confirmed

Daniel Hahler (blueyed) wrote on 2007-10-15:

#3

debdiff for Gutsy (1.7 KiB, text/plain)

This debdiff makes apport use /var/lock/apport as its lockfile instead.
It also makes apport exit with code 1 if the lockfile cannot be created (bug 147237).

I've also added a "/bin/rm -f /var/lock/apport" to the init script, in case apport gets restarted.

In fact, the lockfile would not have to be moved to /var/lock: deleting it in the init script's "start" action would be enough. Still, keeping lockfiles in /var/lock seems reasonable.

Martin Pitt (pitti) wrote on 2007-10-27:

#4

The .lock file is not stale at all. It does not matter whether it exists, since the lock is taken using flock(2), not by merely testing for the existence of that file. So the patch would not help in any way, and the reason for this must be entirely different. Can you please attach your /var/log/apport.log?
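
To illustrate the point about existence versus locking, here is a minimal sketch (not apport's actual code). It assumes the lock is taken with fcntl.lockf() using the same flags as the one-liner quoted in comments #6 and #7; the helper names try_lock, child_can_lock and demo, and the temporary directory standing in for /var/crash, are hypothetical. A child process is used because these locks are held per process, so a second attempt from the same process would not conflict.

```python
import fcntl
import os
import subprocess
import sys
import tempfile


def try_lock(path):
    """Open path (creating it if needed) and try a non-blocking exclusive lock.

    Returns the open fd on success, or None if the lock is held elsewhere.
    """
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_NOFOLLOW, 0o600)
    try:
        fcntl.lockf(fd, fcntl.LOCK_EX | fcntl.LOCK_NB)
    except OSError:
        os.close(fd)
        return None
    return fd


# Code run in a separate process: exit 0 if the lock was obtained, 1 if busy.
CHILD = (
    "import fcntl, os, sys\n"
    "fd = os.open(sys.argv[1], os.O_WRONLY | os.O_CREAT | os.O_NOFOLLOW)\n"
    "try:\n"
    "    fcntl.lockf(fd, fcntl.LOCK_EX | fcntl.LOCK_NB)\n"
    "except OSError:\n"
    "    sys.exit(1)\n"
)


def child_can_lock(path):
    """Return True if another process can currently take the lock on path."""
    result = subprocess.run([sys.executable, "-c", CHILD, path])
    return result.returncode == 0


def demo():
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, ".lock")  # hypothetical stand-in for /var/crash/.lock

        # 1. A leftover file that nobody holds a lock on does not block anyone.
        open(path, "w").close()
        leftover_ok = child_can_lock(path)

        # 2. While this process holds the lock, another process is refused.
        fd = try_lock(path)
        held_blocks = not child_can_lock(path)

        # 3. Closing the fd (or exiting) releases the lock; the file remains.
        os.close(fd)
        released_ok = child_can_lock(path)

        return leftover_ok, held_blocks, released_ok
```

Under these assumptions, demo() returns True for all three checks: the file left on disk is harmless, and only a live process holding the lock causes a refusal.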

Changed in apport:
status:	Confirmed → Incomplete

Daniel Hahler (blueyed) wrote on 2007-10-27:

#5

I guess, then, that the cause is that the lockfile owned by "root" was left behind, so that the call
fd = os.open(lockfile, os.O_WRONLY|os.O_CREAT|os.O_NOFOLLOW)
already failed.

X has just crashed again, and the lock file was left behind, owned by root.

Martin Pitt (pitti) wrote on 2008-02-25:

#6

Daniel, no, that's not it. First, apport is always run as root. Second, if the os.open() fails, apport writes "cannot create lock file" into apport.log and exits. If you got "another apport instance is already running, aborting", then the flock() call failed. If it usually works, then I guess there really was another apport instance running at that time. If it generally fails, it should be reproducible with

python -c "import os, fcntl; fd = os.open('/var/lock/.crash', os.O_WRONLY|os.O_CREAT|os.O_NOFOLLOW); fcntl.lockf(fd, fcntl.LOCK_EX | fcntl.LOCK_NB)"

Does that work for you?

Martin Pitt (pitti) wrote on 2008-02-25:

#7

Whoops, sorry. Of course I meant

sudo python -c "import os, fcntl; fd = os.open('/var/crash/.lock', os.O_WRONLY|os.O_CREAT|os.O_NOFOLLOW); fcntl.lockf(fd, fcntl.LOCK_EX | fcntl.LOCK_NB)"
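
The two failure paths described in comment #6 can be told apart with a sketch like the following. This is only an illustration, not apport's source: the function names classify_lock_attempt and demo are hypothetical, the two messages are the ones quoted in this report, and "lock acquired" is an invented label for the success case.

```python
import fcntl
import os
import tempfile


def classify_lock_attempt(path):
    """Try to take the lock on path and say which step, if any, failed.

    Returns (fd, message); fd is None unless the lock was acquired.
    """
    try:
        fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_NOFOLLOW, 0o600)
    except OSError:
        # Opening or creating the file failed (permissions, missing
        # directory, symlink refused by O_NOFOLLOW, ...).
        return None, "cannot create lock file"
    try:
        fcntl.lockf(fd, fcntl.LOCK_EX | fcntl.LOCK_NB)
    except OSError:
        # The file could be opened, but another process holds the lock.
        os.close(fd)
        return None, "another apport instance is already running, aborting"
    return fd, "lock acquired"  # hypothetical label for the success case


def demo():
    with tempfile.TemporaryDirectory() as tmp:
        # Parent directory does not exist, so os.open() itself fails.
        _, open_failure = classify_lock_attempt(
            os.path.join(tmp, "missing", ".lock"))
        # Normal case: the file is created and locked.
        fd, success = classify_lock_attempt(os.path.join(tmp, ".lock"))
        os.close(fd)
        return open_failure, success
```

Read this way, seeing "another apport instance is already running, aborting" in the log means the open succeeded and only the locking step was refused.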

Daniel Hahler (blueyed) wrote on 2008-02-25:

#8

Yes, the code snippet works for me and I believe I've created confusion here - not knowing about flock(2) and that apport is always being run as root, sorry.

hggdh, can you give more information? Has the problem happened to you again?

C de-Avillez (hggdh2) wrote on 2008-02-26:

#9

Since then I have been monitoring the logs and /var/crash. I never saw it happening again.

So, sorry, no new data. I guess we could close it as invalid then.

Martin Pitt (pitti) wrote on 2008-02-26:

#10

Thanks, hggdh, for reporting back. Let's close this then until it happens again.

Changed in apport:
status:	Incomplete → Invalid
