KeyError 'options' while doing zero downtime upgrade from N to O

Bug #1687616 reported by Lujin Luo
This bug affects 1 person
Affects: OpenStack Identity (keystone)
Status: Invalid
Importance: Undecided
Assigned to: Unassigned
Milestone: none

Bug Description

I am trying to do a zero downtime upgrade from the N release to the O release following [1].

I have 3 controller nodes running behind HAProxy. Each time I upgraded one of the keystone nodes and brought it back into the cluster, I hit this error [2] when I tried to update an existing user, and the error persisted for about 5 minutes. Once all 3 upgraded keystone nodes were back in the cluster and 5 or more minutes had passed, the error disappeared and everything worked fine.

I am using the same conf file for both releases as shown in [3].

[1]. https://docs.openstack.org/keystone/latest/admin/identity-upgrading.html
[2]. http://paste.openstack.org/show/608557/
[3]. http://paste.openstack.org/show/608558/

Tags: upgrades
tags: added: upgrades
Lance Bragstad (lbragstad) wrote :

Are you able to pinpoint which keystone server is throwing the error? Is it coming from a keystone node running the Newton code or the Ocata code?

description: updated
Lance Bragstad (lbragstad) wrote :

After taking a closer look at the trace, it looks like the error is coming from a node that is running Ocata [0] since the code in the trace doesn't exist in Newton.

Can you confirm that `keystone-manage db_sync --expand` and `keystone-manage db_sync --migrate` were run before the Ocata code was started? It looks like the reference returned from the database doesn't have the `options` field, so I'm wondering whether the migrations required to run Ocata code were actually applied.

[0] https://github.com/openstack/keystone/blob/stable/ocata/keystone/auth/core.py#L377
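
For illustration, the failure mode described above can be reduced to a plain dictionary lookup. This is a minimal sketch and not keystone's actual code: the reference dicts and both helper functions are hypothetical. It only shows how a user reference without an `options` key raises a KeyError under strict access, while a `.get()` fallback does not.

```python
# Hypothetical user reference as an older schema or serializer might return it:
# there is no 'options' key at all.
legacy_user_ref = {"id": "abc123", "name": "demo", "enabled": True}

# Hypothetical reference in the shape that newer code expects.
current_user_ref = {"id": "abc123", "name": "demo", "enabled": True, "options": {}}


def read_options_strict(user_ref):
    """Mimic code that assumes 'options' is always present."""
    return user_ref["options"]


def read_options_tolerant(user_ref):
    """Fall back to an empty dict when 'options' is missing."""
    return user_ref.get("options", {})


def raises_key_error(user_ref):
    """Return True if strict access fails with KeyError for this reference."""
    try:
        read_options_strict(user_ref)
    except KeyError:
        return True
    return False
```

Calling `raises_key_error(legacy_user_ref)` returns True, which matches the symptom in the title. The tolerant variant is shown only for contrast, not as a suggested fix.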

Lance Bragstad (lbragstad) wrote :

Targeting this to Pike until I can recreate.

Changed in keystone:
milestone: none → pike-rc1
Morgan Fainberg (mdrnstm) wrote :

Marked as incomplete; we need more information on what the deployer is doing, whether migrations were run, etc.

Changed in keystone:
status: New → Incomplete
Changed in keystone:
milestone: pike-rc1 → none
Launchpad Janitor (janitor) wrote :

[Expired for OpenStack Identity (keystone) because there has been no activity for 60 days.]

Changed in keystone:
status: Incomplete → Expired
Sam Morrison (sorrison) wrote :

I have just done the N -> O upgrade and have seen this error.

We have done the expand and migrate db syncs.

We have 3 Newton keystone nodes, and when I added an Ocata one I saw this issue on the Ocata node.

It's happening on a POST to /v3/auth/tokens and is affecting about 3% of requests (we have around 10 requests per second on our keystone).

Happy to provide more information.

For now I have rolled back, but I suspect this is only an issue during the transition, so I could bite the bullet and do the upgrade quickly.

Changed in keystone:
status: Expired → New
Sam Morrison (sorrison) wrote :

OK, I figured out the issue: it was due to cached tokens. The cache needs to be invalidated.
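
To make the cached-token explanation concrete, here is a minimal sketch under stated assumptions. `DictCache`, `load_token_from_backend`, and `get_token` are hypothetical stand-ins, not keystone or memcached APIs. The sketch shows a shared cache serving an entry written in an older shape (no `options` on the user) until the cache is invalidated, after which the entry is reloaded in the newer shape.

```python
class DictCache:
    """Tiny in-memory stand-in for a shared cache such as memcached."""

    def __init__(self):
        self._store = {}

    def get(self, key):
        return self._store.get(key)

    def set(self, key, value):
        self._store[key] = value

    def invalidate(self):
        """Drop every entry, analogous to flushing the cache."""
        self._store.clear()


def load_token_from_backend(token_id):
    # Hypothetical backend lookup that returns the newer data shape.
    return {"token_id": token_id, "user": {"id": "abc123", "options": {}}}


def get_token(cache, token_id):
    """Return the cached token if present, otherwise load and cache it."""
    token = cache.get(token_id)
    if token is None:
        token = load_token_from_backend(token_id)
        cache.set(token_id, token)
    return token


cache = DictCache()
# An entry written by a node on the older release: the user has no 'options'.
cache.set("t1", {"token_id": "t1", "user": {"id": "abc123"}})

stale = get_token(cache, "t1")
stale_has_options = "options" in stale["user"]  # False: stale entry is served

cache.invalidate()
fresh = get_token(cache, "t1")
fresh_has_options = "options" in fresh["user"]  # True: reloaded after the flush
```

In a mixed cluster the older nodes could keep writing old-shape entries, which would be consistent with the error lasting until every node was upgraded and the remaining entries had expired.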

Lujin Luo (luo-lujin) wrote :

Hi Sam, soon after I reported this bug, my colleague found the root cause in our environment. We stopped using the memcached server and this somehow fixed our problem. I have no idea why this fixed it, but you may want to give it a try first. Hope this helps.

Lujin Luo (luo-lujin)
Changed in keystone:
status: New → Invalid
Lance Bragstad (lbragstad) wrote :

luo-lujin, depending on your configuration, memcache might be caching tokens, meaning you'd be hitting the same issue Sam hit in comment #7.
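
As a rough way to check whether a given configuration caches tokens, the following sketch parses a keystone.conf-style excerpt. Everything here is an assumption for illustration: the sample values are made up, the option names (`[cache] enabled`, `[cache] backend`, `[cache] memcache_servers`, `[token] caching`) and the assumed defaults should be verified against the documentation for the release in use, and Python's `configparser` only approximates how the service itself reads its configuration.

```python
import configparser

# Hypothetical keystone.conf excerpt; the values are illustrative only.
SAMPLE_CONF = """
[cache]
enabled = true
backend = dogpile.cache.memcached
memcache_servers = 192.0.2.10:11211

[token]
caching = true
"""


def token_caching_enabled(conf_text):
    """Return True if both global caching and token caching appear enabled."""
    parser = configparser.ConfigParser()
    parser.read_string(conf_text)
    # Assumed defaults: global caching off, per-subsystem token caching on.
    global_on = parser.getboolean("cache", "enabled", fallback=False)
    token_on = parser.getboolean("token", "caching", fallback=True)
    return global_on and token_on
```

With the sample above, `token_caching_enabled(SAMPLE_CONF)` returns True, which is the situation in which stale cached tokens could be served across releases.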
