All libvirt queries currently block the eventlet compute workers

Bug #974369 reported by Bertrand Lallau
This bug affects 1 person
Affects: OpenStack Compute (nova)
Status: Invalid
Importance: Undecided
Assigned to: Unassigned

Bug Description

The following "known bug" has been identified for the Essex release:
All database queries currently block the eventlet workers.

However, every compute node that uses the libvirt library (e.g. KVM, LXC, ...) has the same kind of issue.
Because the libvirt-python library is not green-safe (it is C code), every request to libvirt blocks all greenthreads on the compute node.
This is a major issue because a libvirt call (start, reboot, suspend, resume, ...) can take many seconds to complete, and in the meantime subsequent requests from the API node have to wait in the RabbitMQ queue, hopefully.
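
The blocking behaviour described above can be illustrated with a small sketch. It does not use eventlet or libvirt: as an assumption for the sake of a self-contained example, Python's standard `asyncio` loop stands in for the eventlet hub, and `time.sleep` stands in for a libvirt call that never yields. All function names here are hypothetical.

```python
import asyncio
import time


def blocking_call(seconds):
    # Stand-in for a libvirt call implemented in C: it does not yield
    # to the cooperative scheduler while it runs.
    time.sleep(seconds)


async def heartbeat(ticks, interval, stamps):
    # A cooperative task that is supposed to run every `interval` seconds.
    for _ in range(ticks):
        stamps.append(time.monotonic())
        await asyncio.sleep(interval)


async def slow_request(seconds):
    await asyncio.sleep(0)   # yield once so the heartbeat can start
    blocking_call(seconds)   # from here on the whole loop is stalled


async def run_demo(block_for=0.3, interval=0.05):
    stamps = []
    await asyncio.gather(heartbeat(3, interval, stamps),
                         slow_request(block_for))
    # Largest gap between two consecutive heartbeat ticks.
    return max(b - a for a, b in zip(stamps, stamps[1:]))


def max_gap_when_blocking():
    return asyncio.run(run_demo())
```

With these values the heartbeat asks to run every 0.05 s, yet the largest gap between ticks is at least the 0.3 s spent inside the blocking call, which is the same starvation the compute node's greenthreads would see.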

The VMware API does not have the same issue because it uses an HTTP API (the "suds" library).
XenAPI avoids the issue by using a tpool, although in my opinion mixing greenthreads and threads is not a best practice because of how Python threads work: all Python threads have the same weight, so the new threads from the tpool (10 by default) have the same chance of being scheduled as the main thread (the one running the greenthreads).

Maybe this bug should be targeted at the Folsom release and listed as a "known issue" for the Essex release.

regards,

PS: this could be fixed by using a "green" API call (an HTTP request, for example) between the compute node and a libvirt "manager", or by using a tpool like XenAPI :( (the worst solution)
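
The thread pool option mentioned above can be sketched in the same spirit. This is only an analogy under stated assumptions: `asyncio` with `loop.run_in_executor` is swapped in for eventlet, where the corresponding call would be `eventlet.tpool.execute`, and `time.sleep` again stands in for the libvirt call. The function names are hypothetical and this is not nova's actual implementation.

```python
import asyncio
import time
from concurrent.futures import ThreadPoolExecutor


def blocking_call(seconds):
    # Stand-in for a blocking libvirt call.
    time.sleep(seconds)
    return "done"


async def heartbeat(ticks, interval, stamps):
    # A cooperative task that is supposed to run every `interval` seconds.
    for _ in range(ticks):
        stamps.append(time.monotonic())
        await asyncio.sleep(interval)


async def offloaded_request(pool, seconds):
    # Run the blocking call in a worker thread; the cooperative loop
    # keeps scheduling other tasks while waiting for the result.
    loop = asyncio.get_running_loop()
    return await loop.run_in_executor(pool, blocking_call, seconds)


async def run_demo(block_for=0.3, interval=0.05):
    stamps = []
    with ThreadPoolExecutor(max_workers=1) as pool:
        _, result = await asyncio.gather(
            heartbeat(3, interval, stamps),
            offloaded_request(pool, block_for),
        )
    # Largest gap between two consecutive heartbeat ticks.
    gap = max(b - a for a, b in zip(stamps, stamps[1:]))
    return result, gap
```

Here the heartbeat keeps ticking at roughly its requested interval while the worker thread is busy. The trade-off raised in the report still applies: the worker threads compete with the main thread for the interpreter, so this hides the blocking rather than removing it.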

Vish Ishaya (vishvananda) wrote :

There is already an option to use a threadpool for libvirt. The config option is:

libvirt_nonblocking=true

There was one outstanding issue (bug 962840), which has now been fixed and proposed into stable/essex. It was not mentioned because of the outstanding issue and the relative lack of testing in production, so I would consider it experimental for now.

Changed in nova:
status: New → Invalid
yong sheng gong (gongysh) wrote :

Hi Vish,
I cannot see any difference between running with and without the patch_tpool_proxy change mentioned in https://review.openstack.org/6309 when I run your test case:
./run_tests.sh -N nova.tests.test_libvirt.LibvirtNonblockingTestCase

Could you tell me how to check the difference with your test case?

My libvirtd version is 0.9.6.

Vish Ishaya (vishvananda) wrote : Re: [Bug 974369] Re: All libvirt queries currently block the eventlet compute workers

The code is not mine. I just proposed a backport, from trunk to stable/essex,
of the fix for the problem with eventlet and old-style classes.
