Race condition between status and backup

Bug #1266763 reported by Kurt Huwig
22
This bug affects 2 people
Affects Status Importance Assigned to Milestone
Duplicity
Fix Released
Medium
Unassigned
duplicity (Debian)
Fix Released
Unknown
duplicity (Ubuntu)
Fix Released
Medium
Unassigned
Precise
Fix Released
Medium
Unassigned
Quantal
Won't Fix
Medium
Unassigned
Raring
Won't Fix
Medium
Unassigned
Saucy
Fix Released
Medium
Unassigned
Trusty
Fix Released
Medium
Unassigned

Bug Description

N.B. This should not be released until after deja-dup bug 1281066.

SRU justification : Race condition exist when two instances of duplicity run in the same
cache directory (.cache/duplicity)

Impact : Potential corruption of the local cache

Fix : Add a lockfile in the local cache & prevent execution of a second instance in the
case of the presence of the lockfile

Test Case :
1) Run one instance of duplicity :
 $ sudo mkdir /tmp/backup
 $ sudo duplicity --exclude=/proc --exclude=/sys --exclude=/tmp / file:///tmp/backup

While this command is running execute the following in a separate console :
 $ sudo duplicity collection-status file:///tmp/backup

With the new locking mechanism you will see the following :
$ sudo duplicity collection-status file:///tmp/backup
Another instance is already running with this archive directory
If you are sure that this is the only instance running you may delete
the following lockfile and run the command again :
   /home/ubuntu/.cache/duplicity/3fe07cc0f71075f95f411fb55ec60120/lockfile.lock

Regression : In the case of spurrious interruption of duplicity, the lockfile
will remain in .cache/duplicity which can prevent future use of duplicity. The cache
directory will have to be cleaned as outlined in the error message

Original description of the problem :

When runnining "duply X status" while running "duply X backup" (sorry, I don't know which duplicity commands are created by duply) due to a race-condition the code of 'sync_archive' might happend to append newly created meta-data files to 'local_spurious' and subsequently delete them. This delete seems to have been the reason that triggered bug 1216921.

The race condition is that the backup command constantly creates meta-data files while the status command queries the list of local and remote files independently over a larger time span. This means that a local file might already been remote but the status command did not see it a few seconds ago.

Revision history for this message
Matthias Witte (matthias-witte-7) wrote :

"duply foo status" corresponds to "duplicity collection-status ..."

Revision history for this message
Kurt Huwig (k-huwig-f) wrote :
Revision history for this message
Kurt Huwig (k-huwig-f) wrote :

My patch has a dependency on "python-lockfile".

Revision history for this message
Kenneth Loafman (kenneth-loafman) wrote : Re: [Bug 1266763] Re: Race condition between status and backup

I'm assuming you mean to allow the lockfile to keep duplicity from
operating on the same backup, or do you intend to keep duplicity from
running two or more backups at the same time. This solution solves the
problem of the same backup only if the user issues subsequent commands with
the same lockfile name. I think we should probably just put the lockfile
in the local cache, under a fixed name. That would take some of the burden
off the user, which is the weak link in the chain.

On Mon, Jan 13, 2014 at 8:56 AM, Kurt Huwig <email address hidden> wrote:

> My patch has a dependency on "python-lockfile".
>
> --
> You received this bug notification because you are subscribed to
> Duplicity.
> https://bugs.launchpad.net/bugs/1266763
>
> Title:
> Race condition between status and backup
>
> Status in Duplicity - Bandwidth Efficient Encrypted Backup:
> New
>
> Bug description:
> When runnining "duply X status" while running "duply X backup" (sorry,
> I don't know which duplicity commands are created by duply) due to a
> race-condition the code of 'sync_archive' might happend to append
> newly created meta-data files to 'local_spurious' and subsequently
> delete them. This delete seems to have been the reason that triggered
> bug 1216921.
>
> The race condition is that the backup command constantly creates meta-
> data files while the status command queries the list of local and
> remote files independently over a larger time span. This means that a
> local file might already been remote but the status command did not
> see it a few seconds ago.
>
> To manage notifications about this bug go to:
> https://bugs.launchpad.net/duplicity/+bug/1266763/+subscriptions
>

Revision history for this message
Kurt Huwig (k-huwig-f) wrote :

I agree with you Ken. The problem arises from two instances changing the same local cache, so it makes sense to put the lockfile there to protect the cache. I'm off now for the night. If you need me to do the change, just tell me and I'll do it tomorrow.

Revision history for this message
Kurt Huwig (k-huwig-f) wrote :
Revision history for this message
Kurt Huwig (k-huwig-f) wrote :
Revision history for this message
Kurt Huwig (k-huwig-f) wrote :

I've created a patch that will add an option

    --single-instance

when used, concurrent runs with the same archive directory will immediately stop with the message

Another instance is already running with this archive directory

and a non-zero error code. This last version of the patch inserts the option at the correct alphabetical position. Please ignore the previous versions.

Revision history for this message
Reik Keutterling (r-keutterling) wrote :

I would enable locking by default, and add a option to disable it on request "--no-locking", otherwise the user still has to append "--single-instance" to each command.

Revision history for this message
Kurt Huwig (k-huwig-f) wrote :

Reik is right: There is a race-condition that happened to delete important data in the past breaking backups. I opted for the conservative approach to make the fix optional, as prohibiting concurrent execution is not a minor change. Still it is the most straighforward fix for the race-condition.

So I suggest an option "--allow-concurrency" (I hate negative boolean options ... think of "NOT NULL = false" .. just confusing) that would disable the lock (IMHO options should be named what they do, not how they do it).

Revision history for this message
Louis Bouchard (louis) wrote :

Ken, I have started to work on adapting Kurt's patch and should have a Merge Proposal soon. Just let me know if you disagree on the logic of the proposed patch.

Changed in duplicity:
status: New → In Progress
assignee: nobody → Louis Bouchard (louis-bouchard)
Revision history for this message
Louis Bouchard (louis) wrote :

@kurt:
More attention will be required for the lockfile housekeeping. A simple test fails because of a lingering lockfile. Here is an example :

ubuntu@PreciseS:~$ duplicity --exclude /backup --exclude /proc / file:///backup
Local and Remote metadata are synchronized, no sync needed.
Last full backup date: none
GnuPG passphrase:
Retype passphrase to confirm:
No signatures found, switching to full backup.
Error accessing possibly locked file /boot/System.map-3.5.0-23-generic
....
Error accessing possibly locked file /home/ubuntu/.viminfo
[Errno 13] Permission denied: '/backup/duplicity-full.20140116T140006Z.vol1.difftar.gpg'
ubuntu@PreciseS:~$ duplicity --exclude /backup --exclude /proc / file:///backup
Local and Remote metadata are synchronized, no sync needed.
Last full backup left a partial set, restarting.
Last full backup date: Thu Jan 16 15:00:06 2014
GnuPG passphrase:
Retype passphrase to confirm:
RESTART: The first volume failed to upload before termination.
         Restart is impossible...starting backup from beginning.
Another instance is already running with this archive directory

Revision history for this message
Louis Bouchard (louis) wrote :

I am affraid that the whole logic of the patch is incorrect. The reason why the test above fails is that, when the "starting backup from the beginning" occurs, duplicity does the following :

                log.Notice(_("RESTART: The first volume failed to upload before termination.\n"
                             " Restart is impossible...starting backup from beginning."))
                self.last_backup.delete()
                os.execve(sys.argv[0], sys.argv, os.environ)

So the os.execve restart a new duplicity instance while the lock is being present, hence the concurrency warning & exit. I think that close scrutiny is required in order to implement proper locking.

I'm looking into it, but I would appreciate Ken's vision of my analysis

Revision history for this message
Kenneth Loafman (kenneth-loafman) wrote :

If you remove the lock before restart you will introduce a small window
where another duplicity task could run, but that would be the only
downside. So, just remove the lock and allow duplicity to try again. It's
simple and if another task gets in, there should be no harm.

BTW, I think locking by default is the correct approach in all of this, at
the cache directory level. Some people run multiple duplicity instances in
parallel, but to different backup dirs. That's perfectly OK.

On Thu, Jan 16, 2014 at 8:32 AM, Louis Bouchard <
<email address hidden>> wrote:

> I am affraid that the whole logic of the patch is incorrect. The reason
> why the test above fails is that, when the "starting backup from the
> beginning" occurs, duplicity does the following :
>
> log.Notice(_("RESTART: The first volume failed to upload
> before termination.\n"
> " Restart is impossible...starting
> backup from beginning."))
> self.last_backup.delete()
> os.execve(sys.argv[0], sys.argv, os.environ)
>
> So the os.execve restart a new duplicity instance while the lock is
> being present, hence the concurrency warning & exit. I think that close
> scrutiny is required in order to implement proper locking.
>
> I'm looking into it, but I would appreciate Ken's vision of my analysis
>
> --
> You received this bug notification because you are subscribed to
> Duplicity.
> https://bugs.launchpad.net/bugs/1266763
>
> Title:
> Race condition between status and backup
>
> Status in Duplicity - Bandwidth Efficient Encrypted Backup:
> In Progress
>
> Bug description:
> When runnining "duply X status" while running "duply X backup" (sorry,
> I don't know which duplicity commands are created by duply) due to a
> race-condition the code of 'sync_archive' might happend to append
> newly created meta-data files to 'local_spurious' and subsequently
> delete them. This delete seems to have been the reason that triggered
> bug 1216921.
>
> The race condition is that the backup command constantly creates meta-
> data files while the status command queries the list of local and
> remote files independently over a larger time span. This means that a
> local file might already been remote but the status command did not
> see it a few seconds ago.
>
> To manage notifications about this bug go to:
> https://bugs.launchpad.net/duplicity/+bug/1266763/+subscriptions
>

Revision history for this message
Louis Bouchard (louis) wrote :

Ken, I will check where this can be an issue and will adapt.

I also added lock release at the bottom of the duplicity.py after the following :

 __name__ == "__main__":
    try:
        with_tempdir(main)

    # Don't move this lower. In order to get an exit
    # status out of the system, you have to call the
    # sys.exit() function. Python handles this by
    # raising the SystemExit exception. Cleanup code
    # goes here, if needed.
    except SystemExit, e:
        # No traceback, just get out

My take is that it needs to be done for every except section.

I'll do my best to propose a MP soon.

Thanks for your insight

Louis Bouchard (louis)
Changed in duplicity (Ubuntu):
status: New → In Progress
assignee: nobody → Louis Bouchard (louis-bouchard)
Chris J Arges (arges)
Changed in duplicity (Ubuntu Precise):
assignee: nobody → Louis Bouchard (louis-bouchard)
status: New → In Progress
Revision history for this message
Ubuntu Foundations Team Bug Bot (crichton) wrote :

The attachment "This patch adds a new command line parameter "--lockfile" that allows to define a lockfile which will prohibit parallel executions on the same host." seems to be a patch. If it isn't, please remove the "patch" flag from the attachment, remove the "patch" tag, and if you are a member of the ~ubuntu-reviewers, unsubscribe the team.

[This is an automated message performed by a Launchpad user owned by ~brian-murray, for any issues please contact him.]

tags: added: patch
Changed in duplicity:
assignee: Louis Bouchard (louis-bouchard) → nobody
importance: Undecided → Medium
milestone: none → 0.6.23
status: In Progress → Fix Committed
Louis Bouchard (louis)
Changed in duplicity (Ubuntu Quantal):
assignee: nobody → Louis Bouchard (louis-bouchard)
Changed in duplicity (Ubuntu Raring):
assignee: nobody → Louis Bouchard (louis-bouchard)
Changed in duplicity (Ubuntu Saucy):
assignee: nobody → Louis Bouchard (louis-bouchard)
Changed in duplicity (Ubuntu Quantal):
status: New → Triaged
Changed in duplicity (Ubuntu Raring):
status: New → Triaged
Changed in duplicity (Ubuntu Saucy):
status: New → Triaged
Changed in duplicity (Ubuntu Precise):
importance: Undecided → Medium
Changed in duplicity (Ubuntu Quantal):
importance: Undecided → Medium
Changed in duplicity (Ubuntu Raring):
importance: Undecided → Medium
Changed in duplicity (Ubuntu Saucy):
importance: Undecided → Medium
Changed in duplicity (Ubuntu Trusty):
importance: Undecided → Medium
Louis Bouchard (louis)
description: updated
Changed in duplicity (Ubuntu Quantal):
status: Triaged → In Progress
Changed in duplicity (Ubuntu Raring):
status: Triaged → In Progress
Changed in duplicity (Ubuntu Saucy):
status: Triaged → In Progress
Revision history for this message
Louis Bouchard (louis) wrote :

Unsubscribing SRU Team & sponsors for now : there is ongoing discussion with upstream on the validity of the --allow-concurrency option so please WAIT AND DO NOT MERGE for now

Revision history for this message
Louis Bouchard (louis) wrote :

The result of our email discussion is that the --allow-concurrency option needs to be removed but the locking mechanism stays. A note will be added to the error message to indicate the lockfile that needs to be removed to overturn the locking if it has been left behind from a failed session.

I switched back the upstream status to In Progress to highlight the fact that the solution that was merge is now incomplete. I'm starting to work on a new patsh that should come in shortly

Changed in duplicity:
status: Fix Committed → In Progress
Louis Bouchard (louis)
description: updated
Louis Bouchard (louis)
description: updated
Louis Bouchard (louis)
Changed in duplicity:
status: In Progress → Fix Committed
Changed in duplicity:
status: Fix Committed → Fix Released
Revision history for this message
Michael Terry (mterry) wrote :

So one problem I noticed in trusty was that this feature causes a problem with deja-dup. These will need a similar SRU for deja-dup, combining the changes to DuplicityInstance.vala in upstream commits 1527, 1528, and 1529.

Revision history for this message
Louis Bouchard (louis) wrote :

Bug created to SRU trusty's fix to Precise & Saucy

https://bugs.launchpad.net/ubuntu/+source/deja-dup/+bug/1281066

Revision history for this message
Louis Bouchard (louis) wrote :

To potential sponsor:

I am preparing a slight modification (add Breaks: statement). debdiff should be there shortly

Revision history for this message
Louis Bouchard (louis) wrote :

Same version of the patch with a "Breaks: "statement in the control file to signal dependancy with deja-dup

Revision history for this message
Michael Terry (mterry) wrote :

Trusty now has 0.6.23 with has the two relevant patches built in. And I've added the Breaks line.

Changed in duplicity (Ubuntu Trusty):
assignee: Louis Bouchard (louis-bouchard) → nobody
status: In Progress → Fix Released
Revision history for this message
Louis Bouchard (louis) wrote :

Equivalent debdiff with Breaks: statement added

Changed in duplicity (Ubuntu Quantal):
status: In Progress → Won't Fix
Changed in duplicity (Ubuntu Raring):
status: In Progress → Won't Fix
Changed in duplicity (Ubuntu Quantal):
assignee: Louis Bouchard (louis-bouchard) → nobody
Changed in duplicity (Ubuntu Raring):
assignee: Louis Bouchard (louis-bouchard) → nobody
Revision history for this message
Louis Bouchard (louis) wrote :

Equivalent debdiff with Breaks: statement added

Revision history for this message
Michael Terry (mterry) wrote :

Uploaded to precise. Louis, you probably should fill out the other bugs with their SRU headers (stuff like how to test the fix etc) in preparation for SRU verification.

Revision history for this message
Michael Terry (mterry) wrote :

And uploaded to saucy.

Revision history for this message
Chris Halse Rogers (raof) wrote : Please test proposed package

Hello Kurt, or anyone else affected,

Accepted duplicity into saucy-proposed. The package will build now and be available at http://launchpad.net/ubuntu/+source/duplicity/0.6.21-0ubuntu4.2 in a few hours, and then in the -proposed repository.

Please help us by testing this new package. See https://wiki.ubuntu.com/Testing/EnableProposed for documentation how to enable and use -proposed. Your feedback will aid us getting this update out to other Ubuntu users.

If this package fixes the bug for you, please add a comment to this bug, mentioning the version of the package you tested, and change the tag from verification-needed to verification-done. If it does not fix the bug for you, please add a comment stating that, and change the tag to verification-failed. In either case, details of your testing will help us make a better decision.

Further information regarding the verification process can be found at https://wiki.ubuntu.com/QATeam/PerformingSRUVerification . Thank you in advance!

Changed in duplicity (Ubuntu Saucy):
status: In Progress → Fix Committed
tags: added: verification-needed
Revision history for this message
Louis Bouchard (louis) wrote :

Any reason why the package for precise has not been uploaded ?

tags: added: verification-done
removed: verification-needed
tags: added: verification-done-saucy
removed: verification-done
description: updated
Revision history for this message
Brian Murray (brian-murray) wrote :

Hello Kurt, or anyone else affected,

Accepted duplicity into precise-proposed. The package will build now and be available at http://launchpad.net/ubuntu/+source/duplicity/0.6.18-0ubuntu3.5 in a few hours, and then in the -proposed repository.

Please help us by testing this new package. See https://wiki.ubuntu.com/Testing/EnableProposed for documentation how to enable and use -proposed. Your feedback will aid us getting this update out to other Ubuntu users.

If this package fixes the bug for you, please add a comment to this bug, mentioning the version of the package you tested, and change the tag from verification-needed to verification-done. If it does not fix the bug for you, please add a comment stating that, and change the tag to verification-failed. In either case, details of your testing will help us make a better decision.

Further information regarding the verification process can be found at https://wiki.ubuntu.com/QATeam/PerformingSRUVerification . Thank you in advance!

Changed in duplicity (Ubuntu Precise):
status: In Progress → Fix Committed
tags: added: verification-needed
Louis Bouchard (louis)
tags: added: verification-done
removed: verification-needed
Revision history for this message
Launchpad Janitor (janitor) wrote :

This bug was fixed in the package duplicity - 0.6.18-0ubuntu3.5

---------------
duplicity (0.6.18-0ubuntu3.5) precise; urgency=low

  * debian/patches/14-lp946988-fix-gnupassphrase-error.dpatch
    - Backport upstream modification applied in LP: #946988 that fix
      a failure to resume a backup with "bad passphrase" when the proper
      passphrase is being used.

  * debian/patches/13-lp1266763-add-concurrency-locking.dpatch
    - Implement locking mechanism to avoid concurrent execution under the same
      cache directory. This functionality adds a dependency to python-lockfile
      Fixes LP: #1266763

  * debian/patches/12-lp1266753-exception-if-no-s3.dpatch
    - Add exception handling in the case where no S3 connection is
      available instead of silently deleting the local cache.
      Fixes LP: #1266753
 -- Louis Bouchard <email address hidden> Thu, 23 Jan 2014 10:16:32 +0100

Changed in duplicity (Ubuntu Precise):
status: Fix Committed → Fix Released
Revision history for this message
Brian Murray (brian-murray) wrote : Update Released

The verification of the Stable Release Update for duplicity has completed successfully and the package has now been released to -updates. Subsequently, the Ubuntu Stable Release Updates Team is being unsubscribed and will not receive messages about this bug report. In the event that you encounter a regression using the package from -updates please report a new bug using ubuntu-bug and tag the bug report regression-update so we can easily find any regresssions.

Revision history for this message
Launchpad Janitor (janitor) wrote :

This bug was fixed in the package duplicity - 0.6.21-0ubuntu4.2

---------------
duplicity (0.6.21-0ubuntu4.2) saucy; urgency=low

  * debian/patches/05-lp1266763-add-concurrency-locking.dpatch
    - Implement locking mechanism to avoid concurrent execution under the same
      cache directory. This functionality adds a dependency to python-lockfile
      Fixes LP: #1266763

  * debian/patches/04-lp1266753-exception-if-no-s3.dpatch
    - Add exception handling in the case where no S3 connection is
      available instead of silently deleting the local cache.
      Fixes LP: #1266753
 -- Louis Bouchard <email address hidden> Fri, 24 Jan 2014 13:44:06 +0100

Changed in duplicity (Ubuntu Saucy):
status: Fix Committed → Fix Released
Louis Bouchard (louis)
Changed in duplicity (Ubuntu Precise):
assignee: Louis Bouchard (louis-bouchard) → nobody
Changed in duplicity (Ubuntu Saucy):
assignee: Louis Bouchard (louis-bouchard) → nobody
Changed in duplicity (Debian):
status: Unknown → New
Changed in duplicity (Debian):
status: New → Fix Released
To post a comment you must log in.
This report contains Public information  
Everyone can see this information.

Other bug subscribers

Remote bug watches

Bug watches keep track of this bug in other bug trackers.