All libvirt queries currently block the eventlet compute workers
Affects | Status | Importance | Assigned to | Milestone
---|---|---|---|---
OpenStack Compute (nova) | Invalid | Undecided | Unassigned |
Bug Description
The following "known bug" has been identified for the Essex release:
All database queries currently block the eventlet workers.
But all compute nodes using the libvirt library (e.g. KVM, LXC, ...) have the same kind of issue.
Since the libvirt-python library is not green-safe (it is C code), every request to libvirt blocks all of the compute node's greenthreads.
This is a major issue because a libvirt call (start, reboot, suspend, resume, ...) can take a long time to complete (many seconds); fortunately, subsequent requests from the API node are queued in RabbitMQ in the meantime.
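The pathology can be reproduced with nothing but the standard library. The sketch below uses asyncio as an analogy for eventlet (both are cooperative schedulers): a call that blocks at the C level, simulated here with time.sleep(), stalls every other task on the loop, exactly as a libvirt-python call stalls all of a compute node's greenthreads.

```python
import asyncio
import time

async def blocking_task():
    # Simulates a long libvirt call: time.sleep() never yields control
    # back to the event loop, so nothing else can run meanwhile.
    time.sleep(0.5)

async def cooperative_task(results):
    # Records when the scheduler finally let this task run.
    results.append(time.monotonic())

async def main():
    results = []
    start = time.monotonic()
    # Both tasks are scheduled together, but the blocking one runs first
    # and monopolizes the loop for its full duration.
    await asyncio.gather(blocking_task(), cooperative_task(results))
    return results[0] - start

delay = asyncio.run(main())
# The cooperative task was delayed by roughly the length of the blocking call.
print(f"other task delayed by {delay:.2f}s")
```

The same half-second stall would hit every in-flight request on the compute worker, which is why multi-second libvirt operations are so damaging.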
The VMware driver does not have this issue because it uses an HTTP API (the "suds" library).
And XenAPI avoids it by using a tpool (although, in my opinion, mixing greenthreads and native threads is not best practice in Python: all Python threads have equal weight, so the tpool's new threads (10 by default) have the same chance of being scheduled as the main thread, the one running the greenthreads).
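The tpool idea can be sketched with the standard library's ThreadPoolExecutor standing in for eventlet.tpool: the blocking call is handed to a real OS thread so the caller's (green)thread only has to wait on the result. The pool size of 10 mirrors the default mentioned above, and `slow_libvirt_call` is a hypothetical stand-in for a blocking libvirt-python operation.

```python
import time
from concurrent.futures import ThreadPoolExecutor

# 10 worker threads, matching eventlet.tpool's default pool size.
_pool = ThreadPoolExecutor(max_workers=10)

def slow_libvirt_call(domain):
    # Stand-in for a blocking C call into libvirt (start, reboot, ...).
    time.sleep(0.1)
    return f"{domain}: done"

def nonblocking_call(func, *args):
    # Run the blocking call on a pool thread and wait for its result.
    # Under eventlet, tpool.execute() makes this wait green-safe, i.e.
    # other greenthreads keep running while the OS thread blocks.
    return _pool.submit(func, *args).result()

result = nonblocking_call(slow_libvirt_call, "instance-0001")
print(result)
```

This is the pattern the XenAPI driver uses; the trade-off the author objects to is that those 10 OS threads compete with the main greenthread for the interpreter on equal terms.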
Maybe this bug should be assigned to the Folsom release, and identified as a "known issue" for the Essex release.
regards,
PS: this could be corrected by using a "green" API call (an HTTP request, for example) between the compute node and a libvirt "manager", or by using a tpool as XenAPI does :( (the worst solution)
There is already an option to use a thread pool for libvirt. The config option is:
libvirt_nonblocking=true
There was one outstanding issue (bug 962840), which has now been fixed and proposed into stable/essex. The option was not mentioned earlier because of that outstanding issue and the relative lack of testing in production, so I would consider it experimental for now.
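For anyone wanting to experiment, the flag goes in nova.conf; a minimal sketch (the surrounding file layout varies by deployment and release):

```ini
# nova.conf -- offload libvirt calls to a thread pool (experimental in Essex)
libvirt_nonblocking=true
```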