Novaclient retries after 504 timeout

Bug #1039696 reported by Mark Gius
This bug affects 1 person
Affects: python-novaclient
Status: Invalid
Importance: High
Assigned to: Unassigned
Milestone: (none)

Bug Description

This is strictly the fault of the underlying httplib2 library, but it causes bizarre behavior in python-novaclient, such as spawning 2X as many instances as were originally requested. It should be tracked upstream and fixed in python-novaclient once an appropriate upstream version exists.

We have nova-api configured behind HAProxy as a load balancer. HAProxy is currently configured to time out requests after 50 seconds. Due to existing issues with Nova being really slow on database writes, if I request a large number of instances through the client (say, 80) then HAProxy will time out the connection. However, the default behavior of httplib2 is to _IGNORE_ this timeout and the Connection: close header it provides, and retry the request. When it does this, it re-establishes the connection from scratch, so HAProxy sends the request to a new nova-api endpoint, which then also times out. Both requests to spawn the instances eventually succeed, and all 160 instances (2X the original request) come up fine.

It gets better. The value that controls the retry is hardcoded to 2 and not configurable in any released version of httplib2, although there appears to be a configurable fix in the trunk:

http://code.google.com/p/httplib2/issues/detail?id=124

See line 1227 or so.

http://code.google.com/p/httplib2/source/diff?spec=svna0d2b4b2f1357659bbdf9d257a053ca82ac49272&r=a0d2b4b2f1357659bbdf9d257a053ca82ac49272&format=side&path=/python2/httplib2/__init__.py
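
To make the doubling concrete, here is a toy model of the failure mode described above. It is only a sketch and not httplib2's actual code: FakeBackend and post_through_proxy are hypothetical names, and the only details taken from this report are the retry count of 2 and the request for 80 instances.

```python
RETRIES = 2  # mirrors the hardcoded value described in this report


class FakeBackend:
    """Hypothetical stand-in for one nova-api node behind the proxy."""

    def __init__(self):
        self.instances = 0

    def create_servers(self, count):
        # The backend finishes the work even though the proxy has
        # already returned a 504 to the client.
        self.instances += count


def post_through_proxy(backends, count, retries=RETRIES):
    """Send one 'create servers' request in a world where every attempt
    times out at the proxy. Returns the number of attempts made."""
    attempts = 0
    for attempt in range(retries):
        # Each attempt opens a new connection, so the proxy is free to
        # pick a different backend.
        backend = backends[attempt % len(backends)]
        backend.create_servers(count)
        attempts += 1
        # The proxy answers 504 here; the client ignores it and loops.
    return attempts


backends = [FakeBackend(), FakeBackend()]
post_through_proxy(backends, 80)
total = sum(b.instances for b in backends)
print(total)  # 160: twice the 80 instances that were requested
```

With retries=1 the same model creates only the 80 instances that were asked for, which is why a configurable retry count matters for a non-idempotent POST.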

Below is the HTTP POST request from python-novaclient (through the Horizon dashboard), followed by the HAProxy-generated timeout response with Connection: close (which is summarily ignored by httplib2).

POST /v2/931f782cf7b0461397001327790b8614/servers HTTP/1.1
Host: proxy:8774
Content-Length: 179
x-auth-project-id: 931f782cf7b0461397001327790b8614
accept-encoding: gzip, deflate
accept: application/json
x-auth-token: aa3bbff234ab4ab8a4b8ef7059cd600c
user-agent: python-novaclient
content-type: application/json

{"server": {"name": "test_fail", "imageRef": "9d352713-d90f-45b1-840f-98ec8ebef1e0", "flavorRef": "1", "max_count": 80, "min_count": 80, "security_groups": [{"name": "default"}]}}

HTTP/1.0 504 Gateway Time-out
Cache-Control: no-cache
Connection: close
Content-Type: text/html

<html><body><h1>504 Gateway Time-out</h1>
The server didn't respond in time.
</body></html>

Mark Gius (markgius) wrote :

For anyone using HAProxy as we are, the HAProxy option "forceclose" can be used to rudely close the socket on httplib2, which avoids this bad behavior.

description: updated
Andrew Laski (alaski)
Changed in python-novaclient:
importance: Undecided → High
Andrew Laski (alaski) wrote :

httplib2 now exposes httplib2.RETRIES to configure the number of retries. That should be set to 1 by novaclient.
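
A minimal sketch of what that could look like, assuming the installed httplib2 exposes the module-level RETRIES attribute mentioned above. limit_retries is a hypothetical helper, not part of novaclient or httplib2, and it assumes RETRIES counts total attempts, so 1 means "try once, never retry".

```python
def limit_retries(http_module, attempts=1):
    """Set RETRIES on an httplib2-like module if it has that attribute.

    Returns True if the value was changed, False if the module is an
    older release without a configurable RETRIES.
    """
    if hasattr(http_module, "RETRIES"):
        http_module.RETRIES = attempts
        return True
    return False


try:
    import httplib2  # third-party; only patched when it is installed
except ImportError:
    httplib2 = None

if httplib2 is not None:
    limit_retries(httplib2)
```

Because the setting is module-wide, it affects every httplib2 user in the same process, not just novaclient.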

Changed in python-novaclient:
status: New → Triaged
Changed in python-novaclient:
assignee: nobody → Abhishek Lahiri (aviostack)
assignee: Abhishek Lahiri (aviostack) → nobody
Abhishek Lahiri (aviostack) wrote :

Looks like novaclient's client.py now uses the requests module instead of the httplib2 library.

https://review.openstack.org/#/c/18257/

Is this bug still relevant? Can you confirm it?

Davanum Srinivas (DIMS) (dims-v) wrote :

Marking as invalid since we don't use httplib2 anymore.

Changed in python-novaclient:
status: Triaged → Invalid