container-sync should be more resilient to missing objects

Bug #1068423 reported by Faidon Liambotis
This bug affects 2 people
Affects                            Status         Importance   Assigned to   Milestone
OpenStack Object Storage (swift)   Fix Released   Wishlist     David Hadas

Bug Description

When attempting to sync a container to another cluster, I had the sync halt mid-flight and never recover because of an object that was listed in the container but was nowhere to be found on the object servers. Now, this was obviously a preexisting problem in the cluster, but it might just as well happen because a rebalance operation has rendered the file temporarily missing. Container-sync got a 404 from the object servers and the container remained half-synced. There was no way to recover and complete the sync other than DELETEing the object in question.

Additionally, this took some time to debug because the error message presented in such a case is:
            elif err.http_status == HTTP_NOT_FOUND:
                self.logger.info(_('Not found %(sync_from)r '
                    '=> %(sync_to)r'),
                    {'sync_from': '%s/%s' %
                        (quote(info['account']), quote(info['container'])),
                     'sync_to': sync_to})

Note how that message does not mention the object at all -- it just assumes that the problem is that the account or container does not exist, which was obviously not the case here.
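To illustrate the kind of message that would have made this easier to debug, here is a minimal sketch of a "Not found" line that names the object when one is known. The helper describe_not_found and its object_name parameter are hypothetical and not part of swift; the sketch also assumes Python 3's urllib.parse.quote rather than whichever quote the daemon imports.

```python
from urllib.parse import quote


def describe_not_found(info, sync_to, object_name=None):
    """Build a 'Not found' log line that names the object when one is known.

    Hypothetical helper for illustration only; it is not swift code.
    """
    source = '%s/%s' % (quote(info['account']), quote(info['container']))
    if object_name is not None:
        # Append the object so the log points at the actual missing item.
        source += '/' + quote(object_name)
    return 'Not found %r => %r' % (source, sync_to)
```

With the object in the message, a 404 on a single object would no longer read as if the whole account or container were missing.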

Chuck Thier (cthier)
Changed in swift:
importance: Undecided → Wishlist
Chuck Thier (cthier)
Changed in swift:
status: New → Triaged
Changed in swift:
assignee: nobody → David Hadas (david-hadas)
OpenStack Infra (hudson-openstack) wrote : Fix proposed to swift (master)

Fix proposed to branch: master
Review: https://review.openstack.org/23041

Changed in swift:
status: Triaged → In Progress
OpenStack Infra (hudson-openstack) wrote : Fix merged to swift (master)

Reviewed: https://review.openstack.org/23041
Committed: http://github.com/openstack/swift/commit/8b140033f01333fbd6d41e2946db949ab6f92599
Submitter: Jenkins
Branch: master

commit 8b140033f01333fbd6d41e2946db949ab6f92599
Author: David Hadas <email address hidden>
Date: Wed Feb 27 00:49:51 2013 +0200

    Improved container-sync resiliency

    container-sync now skips faulty objects in the first and second rounds.
    All replicas try in the second round.
    No server will give up until the faulty object succeeds

    Fixes: bug #1068423

    Change-Id: I0defc174b2ce3796a6acf410a2d2eae138e8193d
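
The skip-and-retry idea in the commit message can be sketched roughly as below. This is not the code from the commit: sync_pass, its parameters, and the rule of holding the saved sync point just before the first failed row are assumptions made for illustration, and the sketch ignores how work is divided between replicas and rounds.

```python
def sync_pass(rows, start_point, sync_row):
    """Try every row after start_point and return the sync point to save.

    rows: iterable of (row_id, name) pairs in increasing row_id order.
    sync_row: callable taking a name; returns True on success, False on failure.
    Hypothetical sketch, not the swift implementation.
    """
    first_failure = None
    last_seen = start_point
    for row_id, name in rows:
        if row_id <= start_point:
            continue  # already handled in an earlier pass
        last_seen = row_id
        # A failed row is skipped rather than aborting the whole pass.
        if not sync_row(name) and first_failure is None:
            first_failure = row_id
    if first_failure is not None:
        # Hold the sync point before the first failure so it is retried.
        return first_failure - 1
    return last_seen
```

Under these assumptions, a missing object no longer blocks the rows behind it, and the next pass starts again from the failed row until it eventually syncs.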

Changed in swift:
status: In Progress → Fix Committed
Thierry Carrez (ttx)
Changed in swift:
milestone: none → 1.8.0-rc1
status: Fix Committed → Fix Released
Changed in swift:
milestone: 1.8.0-rc1 → 1.8.0-rc2
OpenStack Infra (hudson-openstack) wrote : Fix proposed to swift (milestone-proposed)

Fix proposed to branch: milestone-proposed
Review: https://review.openstack.org/25492

OpenStack Infra (hudson-openstack) wrote : Fix merged to swift (milestone-proposed)

Reviewed: https://review.openstack.org/25492
Committed: http://github.com/openstack/swift/commit/045076a923bfd66e3e76b66339bc58b51e3af856
Submitter: Jenkins
Branch: milestone-proposed

commit 045076a923bfd66e3e76b66339bc58b51e3af856
Author: David Hadas <email address hidden>
Date: Wed Feb 27 00:49:51 2013 +0200

    Improved container-sync resiliency

    container-sync now skips faulty objects in the first and second rounds.
    All replicas try in the second round.
    No server will give up until the faulty object succeeds

    Fixes: bug #1068423

    Change-Id: I0defc174b2ce3796a6acf410a2d2eae138e8193d
    (cherry picked from commit 8b140033f01333fbd6d41e2946db949ab6f92599)

Thierry Carrez (ttx)
Changed in swift:
milestone: 1.8.0-rc2 → 1.8.0