libmemcached resets continuum with dead server

Bug #881983 reported by Matt Millar
This bug affects 3 people
Affects                Status        Importance  Assigned to  Milestone
libmemcached           Fix Released  Wishlist    Brian Aker
libmemcached (Ubuntu)  Fix Released  Undecided   Unassigned

Bug Description

Testing with pylibmc against libmemcached 0.53:

import pylibmc

hosts = ["IP_of_host_1_here", "IP_of_host_2_here", "IP_of_host_3_here"]
mc = pylibmc.Client(hosts, binary=False)
mc.behaviors['remove_failed'] = 3      # eject a server after 3 failures
mc.behaviors['hash'] = 'md5'
mc.behaviors['distribution'] = 'consistent ketama'

while True:
    try:
        mc.set("key", "value")
        print(mc.get("key"))
    except Exception as e:
        print(e)

With all 3 servers running, this works fine.
Take down the server handling the load, and libmemcached returns
error 47 from memcached_set: SERVER HAS FAILED AND IS DISABLED UNTIL TIMED RETRY
until the retry limit has been reached, at which point the server is removed from the pool and the continuum is recalculated.

A different server starts handling the key, until... the retry timeout expires again, at which point the continuum is recalculated with the dead server back in, and now all calls fail with

error 35 from memcached_set: SERVER IS MARKED DEAD

What should happen is that, after the timeouts, there is a single return of

error 35 from memcached_set: SERVER IS MARKED DEAD

After which the continuum is recalculated and values go to the new server.

The fix is to mark the server as dead and exclude dead servers whenever the continuum is recalculated (this only works for consistent distributions, but that's what I'm using).
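
For illustration only, here is a rough Python sketch of that idea, not libmemcached's actual C implementation: a ketama-style continuum is rebuilt while skipping any host flagged as dead, so its keys stay on the surviving servers. The host names and the 160 points per server are assumptions for the example.

import bisect
import hashlib
import struct


def build_continuum(servers, dead, points_per_server=160):
    """Map hash points onto servers, skipping hosts flagged as dead."""
    ring = []
    for host in servers:
        if host in dead:
            continue  # the core of the fix: a dead server never re-enters the ring
        for i in range(points_per_server):
            digest = hashlib.md5(("%s-%d" % (host, i)).encode("utf-8")).digest()
            point = struct.unpack(">I", digest[:4])[0]
            ring.append((point, host))
    ring.sort()
    return ring


def server_for_key(ring, key):
    """Walk clockwise to the first continuum point at or after the key's hash."""
    h = struct.unpack(">I", hashlib.md5(key.encode("utf-8")).digest()[:4])[0]
    idx = bisect.bisect_left(ring, (h, "")) % len(ring)
    return ring[idx][1]


hosts = ["host1:11211", "host2:11211", "host3:11211"]
ring = build_continuum(hosts, dead={"host2:11211"})
print(server_for_key(ring, "key"))  # never routes to the dead host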

Tags: patch
Revision history for this message
Matt Millar (millarm) wrote :
Revision history for this message
Matt Millar (millarm) wrote :

Note: you'll need to use pylibmc 1.2.1 to reproduce this.

Revision history for this message
Matt Millar (millarm) wrote :

I'm pretty happy with this code - we've run it in production with a large cluster (4,200 Apache processes, 18 separate cache servers) and tested server death. It works nicely.

Revision history for this message
Ubuntu Foundations Team Bug Bot (crichton) wrote :

The attachment "Patch to support server death on consistent distributions with 0.53" of this bug report has been identified as being a patch. The ubuntu-reviewers team has been subscribed to the bug report so that they can review the patch. In the event that this is in fact not a patch you can resolve this situation by removing the tag 'patch' from the bug report and editing the attachment so that it is not flagged as a patch. Additionally, if you are member of the ubuntu-sponsors please also unsubscribe the team from this bug report.

[This is an automated message performed by a Launchpad user owned by Brian Murray. Please contact him regarding any issues with the action taken in this bug report.]

tags: added: patch
Revision history for this message
Brian Aker (brianaker) wrote :

For this behavior you need to create a new behavior type, and add a test case.

If we were to apply this patch as is, we would break anyone who is expecting the current behavior (which comes into play when servers are replaced in place, or when the network service is flapping).

Changed in libmemcached:
importance: Undecided → Wishlist
Revision history for this message
Matt Millar (millarm) wrote :

OK - I agree that the best way to do this is to create a new behaviour type. In fact, thinking about this, I'd like to make the "retry after marked dead" time orthogonal to the "retry on timeout" time.

So the behaviour becomes (roughly sketched in code at the end of this comment):

1. Server is good: we keep retrying on the retry timeout until we hit the failure limit.
2. We mark the server as dead, and it stays dead until the "retry after marked dead" timeout expires.
3. We then retry the server connection; if it succeeds we add the server back to the pool (back to the state before step 1), and if it fails we reset the dead-retry timeout.

Now we need to do this carefully with consistent distribution, as at step 2 the keys have moved to a different server, and we don't want to reset those cache keys when we add the server back to the pool (but we'd like it to accept new keys).

I'm _not_ sure I agree that anyone would be happy with the current behaviour though, as what happens now is:

1. Timeout for num_retries * retry_timeout
2. Keys move to another server for 1 x retry_timeout
3. Timeout forever

So keys get moved for only 1x the retry_timeout, which is pretty strange behaviour.
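
To make the proposal above concrete, here is a minimal Python sketch of the server life cycle described in the numbered list; the class name, state names, and default timeouts are illustrative assumptions, not libmemcached's actual structures.

import time


class ServerLifecycle(object):
    """Illustrative states: GOOD -> TIMED_RETRY (per failure) -> DEAD -> GOOD."""
    GOOD, TIMED_RETRY, DEAD = "good", "timed_retry", "dead"

    def __init__(self, retry_timeout=2, failure_limit=3, dead_retry_timeout=30):
        self.state = self.GOOD
        self.failures = 0
        self.next_retry = 0.0
        self.retry_timeout = retry_timeout            # "retry on timeout" time
        self.failure_limit = failure_limit            # remove_failed equivalent
        self.dead_retry_timeout = dead_retry_timeout  # "retry after marked dead" time

    def record_failure(self, now=None):
        now = time.time() if now is None else now
        self.failures += 1
        if self.failures >= self.failure_limit:
            # Step 2: mark dead; the host leaves the continuum until the
            # separate dead-retry timeout expires.
            self.state = self.DEAD
            self.next_retry = now + self.dead_retry_timeout
        else:
            # Step 1: still in the pool, just backed off for one retry_timeout.
            self.state = self.TIMED_RETRY
            self.next_retry = now + self.retry_timeout

    def may_retry(self, now=None):
        now = time.time() if now is None else now
        return now >= self.next_retry

    def record_success(self):
        # Step 3: a successful reconnect puts the host back in the pool.
        self.state = self.GOOD
        self.failures = 0
        self.next_retry = 0.0

In this sketch a failed reconnect at step 3 simply calls record_failure() again, pushing the dead-retry window forward, which corresponds to "if it fails we reset the dead-retry timeout" above.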

Revision history for this message
Trevor North (trevor) wrote :

I have a branch which more or less achieves what is described above by way of a new dead server retry timeout behaviour.

As per the current behaviour with consistent distribution and auto ejection, keys on the dead server are moved after we hit the initial failure limit by taking that host out of the continuum. We then reset the continuum to force a retry every time we hit the dead server retry timeout, in the same manner as is done for standard connection retries. Each dead server retry will result in a miss if the host is not actually available, which isn't ideal, but I wanted to achieve this whilst maintaining compatibility with the current behaviour, so I have kept the changes to a bare minimum.

It's worth noting here that there are a couple of instances where an IO failure would incorrectly reset a server's state to new even if it was already in timeout. I've corrected this when setting the state, although I suspect the IO in question probably shouldn't be attempted in the first place in some cases.

I've made no attempt to leave keys on their newly allocated servers once the dead server is brought back to life, and I don't believe it would be sensible to do so. With multiple clients running, network flapping would result in effectively random distribution if we attempted to do this, negating the point of using consistent distribution.

Bar the correction to the server state reset on IO failure while in timeout, the changes introduced do not alter the behaviour currently seen if the new dead retry timeout is not used, so they should be completely backwards compatible.

The branch is available at https://code.launchpad.net/~trevor/libmemcached/dead-retry and I've attached a patch which will apply to 1.0.2.

Feedback would be welcome as ideally this isn't something I want to have to maintain separately.
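
For anyone wanting to try the branch from pylibmc, a usage sketch might look like the following. The 'dead_timeout' behaviour name is an assumption here, not something confirmed by the patch; whatever behaviour name the patch and a matching pylibmc release actually expose should be substituted, and other behaviour names can vary by pylibmc version.

import pylibmc

mc = pylibmc.Client(["host1", "host2", "host3"], binary=False)
mc.behaviors['hash'] = 'md5'
mc.behaviors['distribution'] = 'consistent ketama'
mc.behaviors['remove_failed'] = 3   # failures before a host is ejected from the continuum
mc.behaviors['retry_timeout'] = 2   # seconds between ordinary connection retries
mc.behaviors['dead_timeout'] = 30   # assumed name: seconds before a dead host is retried

Leaving 'remove_failed' and 'retry_timeout' with their existing meanings and only adding a dead-retry interval matches the goal of keeping the change backwards compatible.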

Revision history for this message
Launchpad Janitor (janitor) wrote :

Status changed to 'Confirmed' because the bug affects multiple users.

Changed in libmemcached (Ubuntu):
status: New → Confirmed
Revision history for this message
Brian Aker (brianaker) wrote : Re: [Bug 881983] libmemcached resets continuum with dead server

Thanks, I will take a look at it.

Making it a new behavior makes it much more likely that I can accept it.

Cheers,
 -Brian

Revision history for this message
Brian Aker (brianaker) wrote :

BTW, we are getting a couple of test failures after this is applied. I am going to look at it myself and see what the issue might be. We have this slated for 1.0.3.

Revision history for this message
Zhengxu (eyun221) wrote :

I just noticed that 1.0.3 has been released; is this bug fixed in 1.0.3?
The behaviour described above is what I need in my cache server environment.
Thanks very much.

Revision history for this message
Trevor North (trevor) wrote :

Great to see the inclusion in 1.0.3.

Having a quick skim of the code in trunk, the changes I made to prevent server state being reset to new if it was in timeout don't appear to have been applied. I've not had time to take the build for a spin yet; has that issue been dealt with further up the chain?

Cheers,
Trevor

Revision history for this message
Brian Aker (brianaker) wrote :

Some test cases wouldn't pass with the code handling the reset of state, so I didn't apply that portion of the patch (I still wanted to get in your other bits).

Want to open a different bug on that so that someone here can track it?

Thanks.

Changed in libmemcached:
status: New → Fix Released
assignee: nobody → Brian Aker (brianaker)
Matthias Klose (doko)
Changed in libmemcached (Ubuntu):
status: Confirmed → Fix Released