object replicator update_deleted post ssync REPLICATE request considered harmful

Bug #1818709 reported by clayg
Affects: OpenStack Object Storage (swift)
Status: Fix Released
Importance: Undecided
Assigned to: Unassigned

Bug Description

To summarize ongoing work on rebalance improvements [1]: REPLICATE requests have a very coarse API. They are ostensibly designed to speed up replication, but they are known to cause a significant amount of IO contention and can be slow UNDER SOME CIRCUMSTANCES.

One such circumstance is when using write affinity with SSYNC.

Often the "handoff part" in a write affinity cluster will be very very sparse (a single object?) compared to the remote partition which may be very very dense (consider LOSF). Currently the replicator update_deleted and reconstructor revert methods both fire a REPLICATE request after syncing a partition to a remote primary to cause an immediate invalidation and recalculation on all synced suffixes. When using SSYNC this is not necessary because any updated suffixes are invalidated inline while syncing objects.

With a write affinity cluster using rsync replication the post REPLICATE request is unavoidable because rsync will ship new objects "underneath" the current suffix hashes.pkl - we must at a MINIMUM invalidate the suffixes that have been synced [2].
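
To illustrate the difference, here is a toy model of a cached suffix hash. It is only a sketch: ToyPartition and its methods are hypothetical names, not Swift's real on-disk code, and the "hash" is just a digest of the file names in a suffix.

```python
import hashlib


class ToyPartition:
    """Toy stand-in for one partition; NOT Swift's real hashes.pkl logic."""

    def __init__(self):
        self.files = {}   # suffix -> set of object file names
        self.cached = {}  # suffix -> cached hash, or None when invalidated

    def _hash_suffix(self, suffix):
        names = "".join(sorted(self.files.get(suffix, ())))
        return hashlib.sha256(names.encode()).hexdigest()

    def get_hashes(self):
        # Recalculate only suffixes without a valid cached hash.
        for suffix in self.files:
            if self.cached.get(suffix) is None:
                self.cached[suffix] = self._hash_suffix(suffix)
        return dict(self.cached)

    def invalidate(self, suffix):
        self.cached[suffix] = None

    def rsync_put(self, suffix, name):
        # rsync-style: the file lands on disk "underneath" the cache.
        self.files.setdefault(suffix, set()).add(name)

    def ssync_put(self, suffix, name):
        # ssync-style: the receiver invalidates the suffix inline.
        self.files.setdefault(suffix, set()).add(name)
        self.invalidate(suffix)


part = ToyPartition()
part.ssync_put("abc", "1.data")
before = part.get_hashes()["abc"]

part.rsync_put("abc", "2.data")
stale = part.get_hashes()["abc"] == before  # cache never saw the new file

part.invalidate("abc")  # the job the post REPLICATE request does
fresh = part.get_hashes()["abc"] != before
```

In this model the rsync-style write leaves the cached hash stale until something invalidates the suffix, while the ssync-style write never needs the extra step.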

However, when a write affinity cluster is using SSYNC, the post REPLICATE request is an unfavorable IO trade-off: it often takes more IO than the object transfer itself, with less granular concurrency control to shape the IO budget.

Since the reconstructor can never use rsync we should remove the post-revert-rehash-remote call.

When the replicator is using SSYNC we should skip the post-update-deleted-REPLICATE request.
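
The proposed rule can be sketched as a small decision function. This is a hedged illustration only: needs_post_sync_replicate and its string arguments are hypothetical and do not correspond to names in the Swift code base.

```python
def needs_post_sync_replicate(daemon, sync_method):
    """Return True if a follow-up REPLICATE request is still required.

    Hypothetical helper: `daemon` is "replicator" or "reconstructor",
    `sync_method` is "rsync" or "ssync".
    """
    if daemon == "reconstructor":
        # The reconstructor never uses rsync, so the receiver has
        # already invalidated the synced suffixes inline.
        return False
    if daemon == "replicator":
        # Only rsync ships files underneath the cached suffix hashes.
        return sync_method == "rsync"
    raise ValueError("unknown daemon: %r" % (daemon,))
```

Under this sketch only the rsync replicator keeps the post-sync REPLICATE; both SSYNC paths skip it.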

1. https://etherpad.openstack.org/p/swift-rebalance
2. It's not obvious at all that we SHOULD only do suffix invalidation when using rsync, as one of the valuable "side-effects" of suffix recalculation (despite being expensive) is reaping old data files that are "over-written" by newer tombstones.

OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/swift 2.27.0

This issue was fixed in the openstack/swift 2.27.0 release.

Tim Burke (1-tim-z) wrote :
Changed in swift:
status: In Progress → Fix Released