Comment 9 for bug 1749425

James Page (james-page) wrote :

Some sequencing:

unbind from agent message inbound:

2018-02-13 14:12:57.931 2065380 DEBUG neutron.agent.l3.agent [req-13456a6c-5583-4147-9b47-94026ea7f3b4 b327544aba2a482b9f12f1e6e615c394 9a4311b33381401fbc835c739981ce03 - - -] Got router removed from agent :{u'router_id': u'213b6544-ab4b-4e46-a5c6-5d8d587a0c6d'} router_removed_from_agent /usr/lib/python2.7/dist-packages/neutron/agent/l3/agent.py:419

some sort of update message:

2018-02-13 14:13:00.667 2065380 DEBUG neutron.agent.l3.agent [req-13456a6c-5583-4147-9b47-94026ea7f3b4 b327544aba2a482b9f12f1e6e615c394 9a4311b33381401fbc835c739981ce03 - - -] Got routers updated notification :[u'213b6544-ab4b-4e46-a5c6-5d8d587a0c6d'] routers_updated /usr/lib/python2.7/dist-packages/neutron/agent/l3/agent.py:409

and then we see a state change on the router (it goes to master), implying that no teardown has occurred since the first message above:

2018-02-13 14:14:33.113 2065380 DEBUG neutron.agent.l3.ha [-] Handling notification for router 213b6544-ab4b-4e46-a5c6-5d8d587a0c6d, state master enqueue /usr/lib/python2.7/dist-packages/neutron/agent/l3/ha.py:50
2018-02-13 14:14:33.113 2065380 INFO neutron.agent.l3.ha [-] Router 213b6544-ab4b-4e46-a5c6-5d8d587a0c6d transitioned to master

and then:

2018-02-13 14:14:40.380 2065380 DEBUG neutron.agent.l3.ha [-] Spawning metadata proxy for router 213b6544-ab4b-4e46-a5c6-5d8d587a0c6d _update_metadata_proxy /usr/lib/python2.7/dist-packages/neutron/agent/l3/ha.py:156

(I think the agent still thinks this is an HA router...)

and then:

2018-02-13 14:14:59.839 2065380 DEBUG neutron.agent.l3.ha [-] Updating server with HA routers states {'28015629-217b-4eec-b557-6f93a2bb0230': 'active', '237d2839-5687-4104-9270-ff974de59800': 'active', '266167d5-39e5-4d97-a3a8-45d4ddad9407': 'active', '213b6544-ab4b-4e46-a5c6-5d8d587a0c6d': 'active', '2367f6bd-02c1-4ef1-a642-fe54d916fe2e': 'active'} notify_server /usr/lib/python2.7/dist-packages/neutron/agent/l3/ha.py:177

and slightly after:

2018-02-13 14:18:08.275 2065380 ERROR neutron.agent.l3.router_info [-] 'NoneType' object has no attribute 'remove_vip_by_ip_address'

This seems to support the theory that the HA router never actually gets torn down before the new non-HA router is scheduled to the same network node, resulting in the agent not really knowing whether the router is Arthur or Martha.
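
For anyone wanting to pull the same sequencing out of their own l3-agent log, here is a minimal sketch. It assumes the line prefix looks like the excerpts above (date, time, pid, level, logger name); the names router_timeline and format_timeline are made up for illustration and are not part of neutron.

```python
import re
from datetime import datetime

# Assumed line layout, taken from the excerpts in this comment:
# "<date> <time>.<msec> <pid> <LEVEL> <logger> <rest of message>"
LINE_RE = re.compile(
    r"^(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d{3}) "
    r"(?P<pid>\d+) (?P<level>[A-Z]+) (?P<logger>\S+) (?P<rest>.*)$"
)


def router_timeline(lines, router_id):
    """Return (timestamp, level, logger, message) tuples that mention
    router_id, sorted by timestamp. Lines that do not match the assumed
    prefix are skipped."""
    events = []
    for line in lines:
        match = LINE_RE.match(line.strip())
        if match is None or router_id not in match.group("rest"):
            continue
        ts = datetime.strptime(match.group("ts"), "%Y-%m-%d %H:%M:%S.%f")
        events.append(
            (ts, match.group("level"), match.group("logger"), match.group("rest"))
        )
    events.sort(key=lambda event: event[0])
    return events


def format_timeline(events, width=100):
    """Render events with the elapsed time since the first one, so gaps
    between the unbind message and later state changes stand out."""
    if not events:
        return []
    start = events[0][0]
    return [
        "+%9.3fs %-5s %s %s"
        % ((ts - start).total_seconds(), level, logger, rest[:width])
        for ts, level, logger, rest in events
    ]
```

Feed it the lines of the agent log and the router UUID, then print each string returned by format_timeline. Note that a filter on the router ID will not catch lines such as the 'NoneType' error above, which does not include the UUID, so those still need to be correlated by timestamp.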