Deleted container in deleted account gets unreclaimable after rebalance
Affects | Status | Importance | Assigned to | Milestone
---|---|---|---|---
OpenStack Object Storage (swift) | Invalid | Undecided | Takashi Kajinami |
Bug Description
A container db file may be reclaimed one week (reclaim_age) after it is deleted, but rebalancing can sometimes leave a deleted container unreclaimable.
The following sequence ends up with an unreclaimable container.
1. A client deletes an account.
2. "delay_reaping" seconds after the deletion, the account reaper reaps the containers and objects under the deleted account. The containers and objects get a newer deleted_timestamp than the account.
3. Container-updater sends the container information to the account, and updates the reported_* fields in the container db.
4. Account-replicator reclaims the account db file one week after the account was deleted in step 1.
5. An operator adds devices to the swift cluster, and the container moves to one of the new devices. At this time, db_replicator uses complete_rsync mode, which resets the reported_* fields.
6. Container-updater tries to send the container information to the account again, because the reported_* fields were reset, but the report can never succeed because the account is already gone.
7. Container-replicator can not reclaim the container db, because its stats are never successfully reported.
In my environment, I set delay_reaping to 1 day and hit this situation.
What is worse, the unreclaimable container causes a never-ending update cycle, which keeps generating errors in account-server.log and the container-updater's log.
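To illustrate why the reset reported_* fields block reclaim, here is a minimal sketch of the gating logic (hypothetical helper names, modeled loosely on the replicator's report-up-to-date check, not Swift's actual code):

```python
# Minimal sketch of the reclaim gating described above (hypothetical
# names; not Swift's actual implementation).

RECLAIM_AGE = 7 * 24 * 3600  # one week, matching the default reclaim_age


def report_up_to_date(info):
    """The container counts as reported only if every reported_* column
    matches its live counterpart."""
    return all(
        info[key] == info['reported_' + key]
        for key in ('put_timestamp', 'delete_timestamp',
                    'object_count', 'bytes_used'))


def can_reclaim(info, now):
    """A deleted container db is reclaimable only once it is old enough
    AND its final stats were reported to the account."""
    deleted = (info['delete_timestamp'] > info['put_timestamp']
               and info['object_count'] == 0)
    old_enough = now - info['delete_timestamp'] > RECLAIM_AGE
    return deleted and old_enough and report_up_to_date(info)


# After complete_rsync resets the reported_* fields, the report can
# never be brought up to date again (the account is gone), so the db
# stays unreclaimable forever:
info = {
    'put_timestamp': 100.0, 'delete_timestamp': 200.0,
    'object_count': 0, 'bytes_used': 0,
    'reported_put_timestamp': 0.0, 'reported_delete_timestamp': 0.0,
    'reported_object_count': 0, 'reported_bytes_used': 0,
}
print(can_reclaim(info, now=200.0 + RECLAIM_AGE + 1))  # → False
```

The same db with matching reported_* fields would be reclaimable at that point, which is why resetting them is the crux of the bug.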
Changed in swift:
assignee: nobody → Takashi Kajinami (kajinamit)
status: New → In Progress
Is this a duplicate of lp bug #1300850?
So the workaround here is to set the account reclaim age a few days longer than your container reclaim age, and to give both of them plenty of time for reaping to finish and fully replicate.
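In config terms the workaround might look like this (illustrative values only; reclaim_age defaults to one week, 604800 seconds):

```ini
# container-server.conf
[container-replicator]
reclaim_age = 604800        # 7 days (default)

# account-server.conf — a few days longer than the container value,
# so the account db outlives every reaped container db
[account-replicator]
reclaim_age = 864000        # 10 days
```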
I think there *is* still a risk that an old container wakes up out of nowhere and can't seem to reclaim itself because the account is long gone. But it feels like that could happen in a variety of situations if a disk is isolated from the consistency engine for longer than a reclaim age. So it seems like the change might happen closer to report?
If we can detect the 404 in the container-updater and the last reported data is approaching reclaim age, maybe the best we can do is quarantine the thing? Unless all we're trying to report is 0 objects and 0 bytes, in which case we can probably mark it reported or go ahead and reclaim it.
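A sketch of what that updater-side check could look like (entirely hypothetical: these names and this structure are not Swift's, just the comment's idea spelled out):

```python
# Hypothetical handling of a 404 from the account server during a
# container-updater report cycle; the policy is the one proposed above.

def handle_account_404(broker_info, mark_reported, quarantine):
    """Decide what to do when the account returns 404 for a report."""
    nothing_to_report = (broker_info['object_count'] == 0
                         and broker_info['bytes_used'] == 0)
    if nothing_to_report:
        # 0 objects / 0 bytes: safe to mark reported, so the
        # replicator can go on to reclaim the db normally.
        mark_reported(broker_info)
        return 'reported'
    # Real data but no account to report it to: set the db aside
    # for an operator instead of retrying (and logging) forever.
    quarantine(broker_info)
    return 'quarantined'
```

Either branch breaks the never-ending update loop from the bug description; the quarantine branch just keeps the data around for a human to inspect.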
I like the idea of trying to make sure we're always doing the right thing with new-id handling, but those ids are used in the incoming_sync/outgoing_sync tables to keep track of transitive replication relationships [1], so we need to be very careful where we pull out a re-id. Except in the usync case, where we're shipping db rows one by one, I'm not sure we can guarantee the invariant?
[1] if container a has synced to container b, and container b has synced to container c, then container a has effectively synced to container c.
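To make that invariant concrete, here is a toy model of the sync bookkeeping (hypothetical and heavily simplified: sync points as a dict keyed by remote db id, not Swift's actual schema):

```python
# Toy model of incoming/outgoing sync tracking: each replica remembers,
# per remote db id, the highest row it has already merged from there.

class Replica:
    def __init__(self, db_id):
        self.db_id = db_id
        self.max_row = 0
        self.incoming_sync = {}  # remote db_id -> last merged row

    def push_to(self, other):
        """Ship our rows to `other` and advance its sync points."""
        other.max_row = max(other.max_row, self.max_row)
        other.incoming_sync[self.db_id] = self.max_row
        # Transitivity: `other` also inherits sync points for peers we
        # have merged, so a -> b and b -> c implies a -> c.
        for peer_id, row in self.incoming_sync.items():
            if other.incoming_sync.get(peer_id, 0) < row:
                other.incoming_sync[peer_id] = row


a, b, c = Replica('a'), Replica('b'), Replica('c')
a.max_row = 10
a.push_to(b)   # b now knows a up to row 10
b.push_to(c)   # c transitively knows a up to row 10
print(c.incoming_sync['a'])   # → 10
# If a were re-id'd to 'a2', c's sync point for 'a' would no longer
# apply, and 'a2' would have to re-ship everything from row 0.
```

This is why pulling out a new id is dangerous anywhere except where rows are being shipped individually anyway: the old sync points become orphaned and the transitive relationships are lost.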