commit 666bf06c26bc9e0d6256d054835386e50e67b8a2
Author: Samuel Merritt <email address hidden>
Date: Wed May 6 16:29:06 2015 -0700
EC: don't 503 on marginally-successful PUT
On EC PUT in an M+K scheme, we require M+1 fragment archives to
durably land on disk. If we get that, then we go ahead and ask the
object servers to "commit" the object by writing out .durable
files. We only require 2 of those.
When we got exactly M+1 fragment archives on disk, and then one
connection timed out while writing .durable files, we should still be
okay (provided M is at least 3). However, we'd take the M remaining
successful responses and pass them off to best_response() with a
quorum size of M+1, thus getting a 503 even though everything worked
well enough.
Now we pass 2 to best_response() to avoid that false negative.
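The counting above can be sketched as follows. This is an illustrative model of the scenario, not Swift's actual code; the variable names and the 4+2 scheme are assumptions chosen for the example.

```python
# Illustrative sketch (not Swift's real implementation) of the
# false-negative scenario, using an M+K = 4+2 EC policy.
M, K = 4, 2
durable_quorum = M + 1           # fragment archives that must land durably
commit_quorum = 2                # .durable files required by the commit phase

archives_on_disk = M + 1         # exactly the minimum landed durably
commit_successes = archives_on_disk - 1  # one connection timed out

# Old behavior: judged the M remaining successes against M+1 -> 503.
old_ok = commit_successes >= durable_quorum   # False: spurious failure
# New behavior: the commit phase only needs 2 .durable files.
new_ok = commit_successes >= commit_quorum    # True: PUT succeeds

print(old_ok, new_ok)  # False True
```

With M = 4, four successful commits out of five fail the old M+1 check but comfortably clear the commit-phase requirement of 2.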
There was also a spot where we were getting the quorum size wrong. If
we wrote out 3 fragment archives for a 2+1 policy, we were only
requiring 2 successful backend PUTs. That's wrong; the right number is
3, which is what the policy's .quorum() method says. There was a spot
where the right number wasn't getting plumbed through, but it is now.
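A minimal sketch of the second fix, assuming a simplified stand-in for the policy object (the class below is hypothetical; only the relationship quorum = M + 1 comes from the text above):

```python
# Hypothetical stand-in for an EC storage policy; Swift's real class
# differs, but the quorum arithmetic matches the commit message.
class ECPolicy:
    def __init__(self, ec_ndata, ec_nparity):
        self.ec_ndata = ec_ndata      # M: data fragments
        self.ec_nparity = ec_nparity  # K: parity fragments

    def quorum(self):
        # M fragments can reconstruct the object; one extra makes the
        # write durable, so the write quorum is M + 1.
        return self.ec_ndata + 1

policy = ECPolicy(ec_ndata=2, ec_nparity=1)
print(policy.quorum())  # 3, not the 2 the buggy path required
```

For a 2+1 policy this yields 3 successful backend PUTs, matching what the policy's .quorum() method says rather than the 2 the unplumbed code path accepted.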
Reviewed: https://review.openstack.org/180795
Committed: https://git.openstack.org/cgit/openstack/swift/commit/?id=666bf06c26bc9e0d6256d054835386e50e67b8a2
Submitter: Jenkins
Branch: master
Change-Id: Ic658a199e952558db329268f4d7b4009f47c6d03
Co-Authored-By: Clay Gerrard <email address hidden>
Closes-Bug: 1452468