rdiff-backup --list-increments does not distinguish between error conditions

Bug #128244 reported by Nick Moffitt
Affects: rdiff-backup (Ubuntu)
Status: Expired
Importance: Undecided
Assigned to: Unassigned

Bug Description

Binary package hint: rdiff-backup

When one runs "rdiff-backup --list-increments /path/to/backup/archive", it will often fail to run in situations where "rdiff-backup --list-increment-sizes" will succeed. An interrupted backup, currently-running backup, or other recoverable situation will cause -l to throw an exception and exit with a return code of 1 -- indistinguishable from an unrecoverable error.

In order to work around this and get a value for the date of the most recent useful backup for an archive (the current moment if a backup is still running, and the most recent incremental if the previous backup was interrupted), I had to write a Python program that re-implemented Main.checkdest_need_check() and regress.check_pids() with versions that merely returned status codes rather than aborting completely.

This sort of behavior needs to be part of rdiff-backup from the start, as busy backup locations often appear to be unworkably broken to the command-line tools even when everything is perfectly fine.

Revision history for this message
Andrew Ferguson (adferguson) wrote :

Hi Nick,

This is a good point about the two different behaviors and I agree that --list-increment-sizes is more sensible.

I think the correct fix is to simply remove the call to restore_check_backup_dir() in the ListIncrements(rp) function in Main.py. If you remove that line, does the bothersome behavior go away?

If that works for you (and I'm pretty sure it will), I will remove that line in CVS, which will become rdiff-backup 1.1.13 and subsequent releases.

Thanks for noticing this!

Andrew

Revision history for this message
Nick Moffitt (nick-moffitt) wrote : Re: [Bug 128244] Re: rdiff-backup --list-increments does not distinguish between error conditions

owsla:
> This is a good point about the two different behaviors and I agree
> that --list-increment-sizes is more sensible.
>
> I think the correct fix is to simply remove the call to
> restore_check_backup_dir() in the ListIncrements(rp) function in
> Main.py. If you remove that line, does the bothersome behavior go
> away?

I don't have handy the directories that I tested it in, but the
functions I listed as reimplementing were so that I could basically
write my own restore_check_backup_dir(). Doing as you describe does in
fact cause the system to print a list of increments in most cases.

The code called by that function eventually reaches check functions in
places like regress.py that themselves call sys.exit(1) or similar.
Ideally these should raise application-specific exceptions that the
high-level code can catch and use to inform the decision-making logic
about the safest course of action.
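
A minimal sketch of the exception-based approach described above. Every name here (DestinationCheckError, BackupInProgress, PreviousBackupFailed, check_destination, list_increments, and the `state` argument) is hypothetical and is not taken from the rdiff-backup source; the point is only to show a low-level check raising and a high-level caller deciding what to do.

```python
class DestinationCheckError(Exception):
    """Hypothetical base class for destination-check failures."""


class BackupInProgress(DestinationCheckError):
    """Another process appears to be writing to the destination."""


class PreviousBackupFailed(DestinationCheckError):
    """The last session did not finish cleanly."""


def check_destination(state):
    """Stand-in for a low-level check: raise instead of calling sys.exit(1).

    `state` is a made-up string used only to drive this example.
    """
    if state == "running":
        raise BackupInProgress("another instance seems to be running")
    if state == "interrupted":
        raise PreviousBackupFailed("previous backup seems to have failed")


def list_increments(state, increments):
    """High-level caller: choose a course of action instead of aborting."""
    try:
        check_destination(state)
    except BackupInProgress:
        # Listing is read-only, so report the increments with a status flag.
        return increments, "in-progress"
    except PreviousBackupFailed:
        return increments, "needs-check"
    return increments, "ok"
```

With this shape, a read-only operation can still return its listing along with a status, while a write operation could catch the same exceptions and refuse to proceed.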

I understand that this is a large codebase doing very critical
operations on sensitive data, and that such a large architectural change
would not come easily. At the very least I would suggest that these
functions such as regress.check_pids() and Main.checkdest_need_check()
exit with unique exit codes so that shell scripts and other forms of
automation can quickly determine what sort of problem has occurred.

If you like, I can show you the custom versions of these functions that
I wrote to determine what the last *successful* backup time was.

> If that works for you (and I'm pretty sure it will), I will remove
> that line in CVS, which will become rdiff-backup 1.1.13 and subsequent
> releases.

I will test this soon and get back to you.

Revision history for this message
Andrew Ferguson (adferguson) wrote :

Nick, thanks for testing this.

If I read your comment correctly, you're looking for a wider range of exit status codes, correct? As far as I can tell, rdiff-backup currently exits with either 0 for no error or 1 for error occurred. It certainly seems like getting some more information would be a good improvement.

Maybe something like this?
0 = no error occurred
1 = unrecoverable error
2 = another rdiff-backup instance running
3 = recoverable error (link down, interrupted, etc.)
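
As a sketch of how automation might consume the scheme proposed above: the values mirror the list in this comment, but the scheme is only a proposal in this thread, and the names ExitCode and is_retryable are hypothetical.

```python
import enum


class ExitCode(enum.IntEnum):
    """Exit codes as proposed in this comment (not a released interface)."""
    OK = 0
    UNRECOVERABLE = 1
    ANOTHER_INSTANCE = 2
    RECOVERABLE = 3


def is_retryable(returncode):
    """Return True if a caller could reasonably try again later."""
    try:
        code = ExitCode(returncode)
    except ValueError:
        # Unknown codes are treated as not retryable in this sketch.
        return False
    return code in (ExitCode.ANOTHER_INSTANCE, ExitCode.RECOVERABLE)
```

A wrapper script could then branch on is_retryable(returncode) rather than treating every non-zero status the same way.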

Revision history for this message
Andrew Ferguson (adferguson) wrote :

Nick,

Have you gotten a chance to test my fix from the first comment? I am currently blocking the 1.1.13 release until you confirm.

Thanks,
Andrew

Revision history for this message
Marco Rodrigues (gothicx) wrote :

This was fixed in version 1.1.14, currently on gutsy?

Changed in rdiff-backup:
status: New → Incomplete
Revision history for this message
wolfger (wolfger) wrote :

Marking fix released per above comment. Current version is 1.1.14-1.

Changed in rdiff-backup:
status: Incomplete → Fix Released
Revision history for this message
Brian Murray (brian-murray) wrote :

Nick - have you had an opportunity to test the proposed changes? They don't seem to have made it into the latest release as far as I can tell and it sounds like this is something you might want in Hardy.

Changed in rdiff-backup:
status: Fix Released → Incomplete
Revision history for this message
Connor Imes (ckimes) wrote :

Thank you for taking the time to report this bug and helping to make Ubuntu better. You reported this bug a while ago and there hasn't been any activity in it recently. We were wondering whether this is still an issue for you. Can you try with the latest Ubuntu release? Thanks in advance.

Revision history for this message
Nick Moffitt (nick-moffitt) wrote :

This currently still happens in hardy:

$ rdiff-backup --list-increments /tmp/ttt
Fatal Error: Previous backup to /tmp/ttt seems to have failed.
Rerun rdiff-backup with --check-destination-dir option to revert directory to state before unsuccessful session.
$ echo $?
1
$ rdiff-backup --list-increment-sizes /tmp/ttt
        Time Size Cumulative size
-----------------------------------------------------------------------------
[..list of increments and their sizes...]
$ echo $?
0

I never had the opportunity to test the change owsla mentioned in the first comment, but it should be trivial to test the behavior independently on a destination dir that has had an interrupted write to it.
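
Until distinct exit codes exist, one fragile workaround is to inspect the message text. The sketch below assumes the wording shown in the transcript above ("seems to have failed"), which may differ between versions, and it checks stdout and stderr together because the transcript does not show which stream carries the message. The names classify, list_increments_status, and FAILED_MARKER are hypothetical.

```python
import subprocess

# Substring copied from the transcript above; an assumption, not a stable API.
FAILED_MARKER = "seems to have failed"


def classify(returncode, output):
    """Map an exit status plus captured output to a coarse state."""
    if returncode == 0:
        return "ok"
    if FAILED_MARKER in output:
        return "previous-backup-failed"
    return "error"


def list_increments_status(path):
    """Run rdiff-backup --list-increments on path and classify the result."""
    proc = subprocess.run(
        ["rdiff-backup", "--list-increments", path],
        capture_output=True,  # requires Python 3.7 or later
        text=True,
    )
    return classify(proc.returncode, proc.stdout + proc.stderr)
```

Matching on message text breaks whenever the wording changes, which is the reason the thread asks for distinct exit codes instead.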

Revision history for this message
wolfger (wolfger) wrote :

Still a problem on Intrepid?

Changed in rdiff-backup:
status: Incomplete → In Progress
Revision history for this message
Otto Kekäläinen (otto) wrote :

Hello!

The new rdiff-backup 2.0 (written in Python 3) has recently been released. Please check out https://github.com/rdiff-backup/rdiff-backup and contribute to this open source project.

This bug is unlikely to apply to the latest rdiff-backup 2.0, so it will be closed unless you want to help out and test whether you can reproduce the bug with the latest rdiff-backup.

Changed in rdiff-backup (Ubuntu):
status: In Progress → Incomplete
Revision history for this message
Launchpad Janitor (janitor) wrote :

[Expired for rdiff-backup (Ubuntu) because there has been no activity for 60 days.]

Changed in rdiff-backup (Ubuntu):
status: Incomplete → Expired