[RFE] Make neutron agents less chatty on AMQP

Bug #1438159 reported by Attila Fazekas
This bug affects 2 people
Affects   Status    Importance   Assigned to   Milestone
neutron   Triaged   Wishlist     Unassigned

Bug Description

Problem: Neutron agents run many periodic tasks, each of which triggers an RPC call plus a database transaction. Most of these calls return no new information, because nothing has changed.
At scale this behaviour amounts to a self-inflicted `DDoS attack`. An architecture built on periodic polling scales poorly and can be considered an anti-pattern.

Instead of polling periodically, we can leverage the binding capabilities of the AMQP broker.
Neutron has many situations, such as security group rule changes or DVR-related changes, that need to be communicated to multiple agents, but usually not to all agents.

At startup the agent needs to synchronise as usual, but during the sync it can subscribe to the events it is interested in, which removes the need for the periodic tasks. (Note: after the first subscribe loop, a second one is needed so that changes made during the subscribe process are not missed.)
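
One possible reading of the note above is "sync, subscribe, then sync and subscribe once more". The sketch below follows that reading only; `fetch_state`, `bind` and `needed_keys` are hypothetical callables supplied by the caller, not Neutron APIs.

```python
def startup_sync(fetch_state, bind, needed_keys):
    """Sync twice so that changes made while binding are not missed.

    fetch_state()      -> current state as seen by the server (hypothetical)
    bind(key)          -> subscribe the agent queue to one routing key (hypothetical)
    needed_keys(state) -> set of routing keys the agent depends on (hypothetical)
    """
    bound = set()
    state = None
    # The second pass picks up resources that appeared or changed
    # while the first pass was still creating bindings.
    for _ in range(2):
        state = fetch_state()
        # sorted() only makes the bind order deterministic.
        for key in sorted(needed_keys(state) - bound):
            bind(key)
            bound.add(key)
    return state, bound
```

After the second pass every later change is expected to arrive as a message on the agent queue, so no periodic poll is needed in this model.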

An AMQP queue declared with 'auto-delete' can be treated as a reliable source of information that does not miss any event notification while the queue exists.
On connection loss, or if the whole broker cluster dies, the agent needs to resync everything guarded in this way.
In these cases the queue disappears, so the situation is easy to detect.

1. Create a direct exchange for each resource type that needs to be synchronised in this way, for example 'neutron.securitygroups'. The exchange declaration needs to happen at q-svc start-up time, or after the whole broker cluster has died (a not-found exception will signal this). The exchange SHOULD NOT be redeclared or verified at every message publish.

2. Every agent creates a dedicated per-agent queue with the auto-delete flag. If the agent already maintains a queue with this property, it MAY reuse that one. Agents SHOULD avoid creating multiple queues per resource type. The messages MUST contain type information.
3. Each agent creates a binding between its queue and the resource type exchange when it realises that it depends on a resource, for example when it maintains at least one port with the given security group. (Agents need to remove the binding when they stop using the resource.)
4. The q-svc publishes just a single message when a resource-related change happens. The routing key is the UUID of the resource.
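
The routing behaviour in steps 1 to 4 can be illustrated with a toy in-memory model. This is not a broker client and not Neutron code; the class name, queue names and message layout are assumptions made up for the illustration.

```python
from collections import defaultdict


class DirectExchange:
    """Toy in-memory stand-in for an AMQP direct exchange (illustration only)."""

    def __init__(self, name):
        self.name = name
        # routing key (resource UUID) -> names of the queues bound to it
        self._bindings = defaultdict(set)
        # queue name -> messages delivered to that queue
        self.queues = defaultdict(list)

    def bind(self, queue, routing_key):
        # Step 3: an agent binds its queue when it starts depending on a resource.
        self._bindings[routing_key].add(queue)

    def unbind(self, queue, routing_key):
        # Step 3: the binding is removed when the agent stops using the resource.
        self._bindings[routing_key].discard(queue)

    def publish(self, routing_key, message):
        # Step 4: one publish; only queues bound to exactly this key receive it.
        targets = self._bindings.get(routing_key, ())
        for queue in targets:
            self.queues[queue].append(message)
        return len(targets)


# Hypothetical agents and security group identifiers.
exchange = DirectExchange("neutron.securitygroups")
exchange.bind("agent-1", "sg-a")
exchange.bind("agent-2", "sg-a")
exchange.bind("agent-2", "sg-b")
exchange.publish("sg-a", {"type": "securitygroup", "id": "sg-a"})
```

In this model the server publishes once per change, and an agent that has no port in the affected security group never sees the message.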

Alternatively, a single topic exchange can be used for all resource types.
In this case the routing keys MUST contain the resource type, like: neutron.<resource_type>.<uuid>.
This type of exchange is generally more expensive than a direct exchange (pattern matching), and is only useful if some agents need to listen to ALL events related to a type while others are interested in just a few of them.
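
To show what the pattern matching of a topic exchange involves, here is a minimal sketch of AMQP 0-9-1 topic matching, where '*' matches exactly one word and '#' matches zero or more words. The function name and the sample keys are assumptions for the illustration; a real broker performs this matching itself.

```python
def topic_matches(pattern, routing_key):
    """Return True if an AMQP-style topic binding pattern matches a routing key."""
    p = pattern.split(".")
    k = routing_key.split(".")

    def match(i, j):
        if i == len(p):
            # Pattern exhausted: match only if the key is exhausted too.
            return j == len(k)
        if p[i] == "#":
            # '#' may absorb zero or more words of the routing key.
            return any(match(i + 1, n) for n in range(j, len(k) + 1))
        if j == len(k):
            return False
        if p[i] == "*" or p[i] == k[j]:
            # '*' matches exactly one word; otherwise words must be equal.
            return match(i + 1, j + 1)
        return False

    return match(0, 0)


# An agent interested in ALL security group events versus a single one.
all_events = topic_matches("neutron.securitygroup.*", "neutron.securitygroup.1234")
one_event = topic_matches("neutron.securitygroup.1234", "neutron.securitygroup.5678")
```

With a direct exchange the broker only needs an exact key lookup, which is the cost difference mentioned above.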

Edit:
Bindings MAY be added by the sender as well.

Tags: loadimpact rfe
Eugene Nikanorov (enikanorov) wrote :

This looks like a deep rework of the current messaging strategy and should probably be worked on within the scope of a blueprint rather than a bug.

So I would suggest filing a bp for this and providing a spec explaining these ideas, so that spec review can be a better place to discuss them.

Changed in neutron:
status: New → Opinion
Kevin Benton (kevinbenton) wrote :

Definitely needs a blueprint. Keeping agent consistency has come up in many discussions before and I think we are going to try to refactor this in L.

description: updated
tags: added: loadimpact
Changed in neutron:
status: Opinion → Confirmed
Ihar Hrachyshka (ihar-hrachyshka) wrote :

This strategy seems to have been implemented in the context of the QoS work in Liberty. We should leverage the mechanism for resources other than QoS policies. The mechanism is described in: http://docs.openstack.org/developer/neutron/devref/rpc_callbacks.html

tags: added: rfe
Changed in neutron:
importance: Undecided → Wishlist
Armando Migliaccio (armando-migliaccio) wrote :

To be discussed at the drivers meeting.

Changed in neutron:
status: Confirmed → Triaged
Henry Gessau (gessau)
summary: - Made neutron agents silent by using AMQP
+ [RFE] Make neutron agents less chatty by using AMQP
summary: - [RFE] Make neutron agents less chatty by using AMQP
+ [RFE] Make neutron agents less chatty on AMQP
Armando Migliaccio (armando-migliaccio) wrote :

Based on discussion [1], both this bug and [2] aim at tackling the same pain point. The proposed approaches may be different though.

[1] http://eavesdrop.openstack.org/meetings/neutron_drivers/2015/neutron_drivers.2015-12-01-15.00.log.html
[2] https://bugs.launchpad.net/neutron/+bug/1516195
