Memcache token backend eventually stops working

Bug #1012381 reported by Rafael Durán Castañeda
This bug affects 1 person
Affects                        Status        Importance  Assigned to             Milestone
OpenStack Identity (keystone)  Fix Released  High        Rafael Durán Castañeda
  Essex                        Fix Released  High        Alan Pevec
keystone (Ubuntu)              Fix Released  Undecided   Unassigned
  Precise                      Fix Released  Undecided   Unassigned

Bug Description

Hi,

At BVOX, my company, we've hit a weird issue while using the memcache token backend. Eventually, token validation stops working. After some debugging, I found that the failure always starts with a token validation request that returns information about the token presented in x-auth-token instead of the token being validated, e.g.:

GET /v2.0/tokens/123
x-auth-token: 456

Returns information about 456 instead of 123.
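
The mismatch can be checked mechanically. The sketch below assumes the v2.0 validation response carries the validated token under access.token.id; the helper name validated_token_matches and the sample bodies are made up for illustration and are not taken from the attached script.

```python
import json


def validated_token_matches(requested_id, response_body):
    """Return True if the response describes the token that was asked for."""
    payload = json.loads(response_body)
    # Assumed response layout: {"access": {"token": {"id": "..."}}}
    returned_id = payload.get("access", {}).get("token", {}).get("id")
    return returned_id == requested_id


if __name__ == "__main__":
    # Hypothetical bodies for GET /v2.0/tokens/123 with x-auth-token: 456
    good = json.dumps({"access": {"token": {"id": "123"}}})
    bad = json.dumps({"access": {"token": {"id": "456"}}})
    print(validated_token_matches("123", good))
    print(validated_token_matches("123", bad))
```

A load test could call this helper on every response and count the mismatches, which separates the wrong-token symptom from plain dropped requests.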

While working on the issue, under heavy load, we have also seen this error:

RuntimeError: Second simultaneous read on fileno XX detected.
Unless you really know what you're doing, make sure that only one greenthread can read any particular socket.
Consider using a pools.Pool.

This error can be solved by monkey patching the threading module (as Nova does, and I think Glance too).
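
A minimal sketch of that workaround, assuming the service runs on eventlet (the "greenthread" and pools.Pool wording in the error message points to that library). Passing thread=True to eventlet.monkey_patch swaps the thread primitives for green equivalents, so state that a library keeps per thread is no longer shared by every greenthread in the process. The helper name patch_for_greenthreads and the choice of patched modules are illustrative; this is not the code from the eventual fix.

```python
try:
    import eventlet
    import eventlet.patcher
except ImportError:  # eventlet is optional for this sketch
    eventlet = None


def patch_for_greenthreads():
    """Hypothetical helper: green-patch socket, time and thread.

    Returns True if the thread module ends up patched, False if
    eventlet is not installed.
    """
    if eventlet is None:
        return False
    # Naming modules explicitly patches only those modules.
    eventlet.monkey_patch(socket=True, time=True, thread=True)
    return eventlet.patcher.is_monkey_patched("thread")


if __name__ == "__main__":
    print("thread module patched:", patch_for_greenthreads())
```

The patch should run as early as possible at process start-up, before other modules import socket or threading, so that nothing holds a reference to the unpatched versions.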

I've attached a fairly simple Python script that uses multiprocessing and triggers the error quickly. It drops some requests, but token validation keeps working (3 of 4 processes dropped in my tests). That does not match what happens on a real deployment, where token validation never works again until Memcached is restarted (I think only because that resets the python-memcache connection).

Revision history for this message
Rafael Durán Castañeda (rafadurancastaneda) wrote :

I can send a patch, unless someone has a good reason for Keystone not to monkey patch the threading module.

Rafael Durán Castañeda (rafadurancastaneda) wrote :

P.S.: This bug can be triggered on both the stable/essex and master branches.

Joseph Heck (heckj)
Changed in keystone:
status: New → Triaged
importance: Undecided → High
Joseph Heck (heckj)
tags: added: essex
Changed in keystone:
assignee: nobody → Rafael Durán Castañeda (rafadurancastaneda)
OpenStack Infra (hudson-openstack) wrote : Fix proposed to keystone (master)

Fix proposed to branch: master
Review: https://review.openstack.org/8707

Changed in keystone:
status: Triaged → In Progress
OpenStack Infra (hudson-openstack) wrote : Fix merged to keystone (master)

Reviewed: https://review.openstack.org/8707
Committed: http://github.com/openstack/keystone/commit/3f9f77af19c748658629a460bc447fe7f2d0a410
Submitter: Jenkins
Branch: master

commit 3f9f77af19c748658629a460bc447fe7f2d0a410
Author: Rafael Durán Castañeda <email address hidden>
Date: Tue Jun 19 20:35:43 2012 +0200

    Monkey patching 'thread'.

    Fixes bug 1012381.

    Change-Id: Icb7b2372df96d647fc6dcd4c4ebe72c8aa607f9d

Changed in keystone:
status: In Progress → Fix Committed
OpenStack Infra (hudson-openstack) wrote : Fix proposed to keystone (stable/essex)

Fix proposed to branch: stable/essex
Review: https://review.openstack.org/9011

Alan Pevec (apevec)
tags: added: essex-backport
removed: essex
OpenStack Infra (hudson-openstack) wrote : Fix merged to keystone (stable/essex)

Reviewed: https://review.openstack.org/9011
Committed: http://github.com/openstack/keystone/commit/d8dbdbced061fa4a4e42ec33c4b7e7752b0ebc04
Submitter: Jenkins
Branch: stable/essex

commit d8dbdbced061fa4a4e42ec33c4b7e7752b0ebc04
Author: Rafael Durán Castañeda <email address hidden>
Date: Tue Jun 19 20:35:43 2012 +0200

    Monkey patching 'thread'.

    Fixes bug 1012381.

    Change-Id: Icb7b2372df96d647fc6dcd4c4ebe72c8aa607f9d

Thierry Carrez (ttx)
Changed in keystone:
milestone: none → folsom-2
status: Fix Committed → Fix Released
Dave Walker (davewalker)
Changed in keystone (Ubuntu):
status: New → Fix Released
Changed in keystone (Ubuntu Precise):
status: New → Confirmed
Adam Gandelman (gandelman-a) wrote : Verification report.

Please find the attached test log from the Ubuntu Server Team's CI infrastructure. As part of the verification process for this bug, Keystone has been deployed and configured across multiple nodes using precise-proposed as an installation source. After successful bring-up and configuration of the cluster, a number of exercises and smoke tests have been invoked to ensure the updated package did not introduce any regressions. A number of test iterations were carried out to catch any possible transient errors.

Please note the list of installed packages at the top and bottom of the report.

For records of upstream test coverage of this update, please see the Jenkins links in the comments of the relevant upstream code-review(s):

Trunk review: https://review.openstack.org/8707
Stable review: https://review.openstack.org/9011

As per the provisional Micro Release Exception granted to this package by the Technical Board, we hope this contributes toward verification of this update.

Adam Gandelman (gandelman-a) wrote :

Test coverage log.

tags: added: verification-done
Clint Byrum (clint-fewbar) wrote : Update Released

The verification of this Stable Release Update has completed successfully and the package has now been released to -updates. Subsequently, the Ubuntu Stable Release Updates Team is being unsubscribed and will not receive messages about this bug report. In the event that you encounter a regression using the package from -updates, please report a new bug using ubuntu-bug and tag the bug report regression-update so we can easily find any regressions.

Launchpad Janitor (janitor) wrote :

This bug was fixed in the package keystone - 2012.1+stable~20120824-a16a0ab9-0ubuntu2

---------------
keystone (2012.1+stable~20120824-a16a0ab9-0ubuntu2) precise-proposed; urgency=low

  * New upstream release (LP: #1041120):
    - debian/patches/0013-Flush-tenant-membership-deletion-before-user.patch:
      Dropped.
  * Resynchronize with stable/essex:
    - authenticate in ldap backend doesn't return a list of roles
      (LP: #1035428)
    - LDAP should not check username on "sn" field (LP: #997700)
    - Admin API doesn't valid token. (LP: #1006815, #1006822)
    - Memcache token backend eventually stops working. (LP: #1012381)
    - EC2 credentials not migrated from legacy (diablo) database. (LP: #1016056)
    - Deleting tenants or users does not cleanup metadata. (LP: #973243)
    - Deleting tenants does not cleanup its user associations. (LP: #974199)
    - TokenNotFound not raised in testsuite because of timezone issues. (LP: #983800)
    - Token authentication for a user in a disabled tenant does not raise
      Unauthorized error. (LP: #988920)
    - export_legacy_catalog doesn't convert url names correctly. (LP: #994936)
    - Following a password compromise and subsequent password change,
      tokens remain valid. (LP: #996595)
    - Tokens remain valid after a user account is disabled. (LP: #997194)
 -- Adam Gandelman <email address hidden> Fri, 24 Aug 2012 03:34:59 -0400

Changed in keystone (Ubuntu Precise):
status: Confirmed → Fix Released
Thierry Carrez (ttx)
Changed in keystone:
milestone: folsom-2 → 2012.2