neutron agents are too aggressive under server load
Affects | Status | Importance | Assigned to | Milestone
---|---|---|---|---
neutron | Fix Released | High | Kevin Benton |
Bug Description
If a server operation takes long enough to trigger a timeout on an agent call to the server, the agent will just give up and issue a new call immediately. This pattern is pervasive throughout the agents and it leads to two issues:
First, if the server is busy and requests take longer than the timeout window to fulfill, the agent will continually hammer the server with calls that are bound to fail until the server load drops enough to fulfill the query. If the load is itself a result of calls from agents, this leads to a stampeding effect where the server is unable to fulfill requests until an operator intervenes.
Second, the server will build a backlog of call requests, and as the backlog grows, the window of time left to process each message shrinks. With enough clients making calls, the timeout threshold can be crossed before a call even starts to process. For example, assuming the server handles calls one at a time, if it takes the server 6 seconds to process a given call and the clients are configured with a 60 second timeout, 30 agents making the call simultaneously will result in a situation where 20 of the agents never get a response. The first 10 will get their calls filled (10 x 6 seconds = 60 seconds) and the last 20 will end up in a loop where the server just spends its time replying to calls that have already expired by the time it processes them.
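The arithmetic in the example above can be checked with a small simulation. This is only a sketch under simplifying assumptions that are not stated in the bug report: a single server working through requests strictly in the order they were sent, a fixed processing time, and agents that retry the instant their call times out. The function name and parameters are made up for illustration.

```python
import heapq


def simulate_backlog(num_agents=30, service_time=6, timeout=60, horizon=600):
    """Toy model: one serial server, agents retry immediately on timeout.

    Returns (set of agents that got a reply in time, count of replies
    the server produced for calls that had already expired).
    """
    # Every agent sends its first call at t=0. Entries are (sent_at, agent).
    queue = [(0, agent) for agent in range(num_agents)]
    heapq.heapify(queue)
    clock = 0
    served = set()
    wasted = 0
    while queue and clock < horizon:
        sent_at, agent = heapq.heappop(queue)
        # The server cannot tell the call has expired, so it does the work anyway.
        clock = max(clock, sent_at) + service_time
        if clock <= sent_at + timeout:
            served.add(agent)
        else:
            wasted += 1
            # The agent gave up at sent_at + timeout and reissued the call right away.
            heapq.heappush(queue, (sent_at + timeout, agent))
    return served, wasted


served, wasted = simulate_backlog()
print(len(served), "agents served;", wasted, "expired calls processed")
```

With the figures from the description, only the first 10 agents are ever served in this model; every later reply is for a call that has already timed out, no matter how long the simulation runs.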
See the push notification spec for a proposal to eliminate heavy agent calls: https:/
However, even with that spec, we need more intelligent handling of the cases where calls are required (e.g. initial sync) or where push notifications are too invasive to change from a call.
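One possible form of "more intelligent handling" is for the agent to wait progressively longer between retries instead of reissuing the call immediately. The sketch below is illustrative only and is not taken from the neutron code or from the linked review: `CallTimeout` and `call_with_backoff` are hypothetical names, and the delay values are arbitrary assumptions.

```python
import random
import time


class CallTimeout(Exception):
    """Hypothetical stand-in for an RPC timeout; not a neutron class."""


def call_with_backoff(call, base_delay=1.0, max_delay=60.0, max_attempts=5,
                      sleep=time.sleep, rng=random.random):
    """Invoke call(); on timeout, wait with exponential backoff and jitter."""
    delay = base_delay
    for attempt in range(1, max_attempts + 1):
        try:
            return call()
        except CallTimeout:
            if attempt == max_attempts:
                raise  # give up and let the caller decide what to do
            # Random jitter keeps many agents from retrying in lockstep.
            sleep(rng() * delay)
            # Double the ceiling for the next wait, up to max_delay.
            delay = min(delay * 2, max_delay)
```

The `sleep` and `rng` parameters exist only so the behaviour can be exercised without real waiting. The point of the design is that a busy server sees fewer retries the longer it stays busy, rather than a constant stream of calls that are bound to fail.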
Changed in neutron: | |
assignee: | nobody → Kevin Benton (kevinbenton) |
milestone: | none → mitaka-rc1 |
Changed in neutron: | |
status: | New → In Progress |
Changed in neutron: | |
importance: | Undecided → High |
tags: | added: loadimpact |
Changed in neutron: | |
milestone: | mitaka-rc1 → newton-1 |
tags: | added: scale |
Related: https://review.openstack.org/#/c/280595/