Tests hang frequently in XenAPIVMTestCase.test_parallel_builds

Bug #831599 reported by Soren Hansen
This bug affects 4 people
Affects | Status | Importance | Assigned to | Milestone
OpenStack Compute (nova) | Fix Released | Medium | Unassigned |

Bug Description

I can kill it, start over, and then it usually works.

It seems to be busy-waiting on something. strace shows a bunch of threads waiting on futexes and a single one calling epoll_wait a *lot*. I'm attaching the last 1000 lines of run_tests_log.

Soren Hansen (soren) wrote :

Alex Meade (alex-meade) wrote :

Seems to me that this is when the test should be failing; however, it doesn't assert anything and just hangs instead. Perhaps just add a timeout where it's calling .wait() and fail if it doesn't return in time? Seems risky to me, and that also means we need to make it work.

I say nuke the test!

Thierry Carrez (ttx)
Changed in nova:
importance: Undecided → Medium
status: New → Confirmed
Brian Lamar (blamar) wrote :

Adding eventlet.monkey_patch() to the top of the test file seemed to squash this for me, but I'm hesitant to say definitively because it's not consistently reproducible.

Brian Lamar (blamar) wrote :

Never mind, I was just getting lucky.

Ewan Mellor (ewanmellor) wrote :

I bet it's this (a similar backtrace is in Soren's log):

2011-08-28 11:01:52,770 DEBUG nova.virt.xenapi.fake [-] Calling VM.start <bound method FakeSessionForVMTests.VM_start of <nova.tests.xenapi.stubs.FakeSessionForVMTests object at 0x1077cb4c>> from (pid=9342) callit /opt/jenkins/workspace/unit-nova/upstream/nova/virt/xenapi/fake.py:431
2011-08-28 11:01:52,772 DEBUG nova.virt.xenapi.vmops [-] Starting instance 2 from (pid=9342) _start /opt/jenkins/workspace/unit-nova/upstream/nova/virt/xenapi/vmops.py:134
Traceback (most recent call last):
  File "/opt/jenkins/workspace/unit-nova/upstream/.nova-venv/lib/python2.6/site-packages/eventlet/hubs/hub.py", line 336, in fire_timers
    timer()
2011-08-28 11:01:52,773 DEBUG nova.virt.xenapi.fake [-] Calling VM.start <bound method FakeSessionForVMTests.VM_start of <nova.tests.xenapi.stubs.FakeSessionForVMTests object at 0x1077cb4c>> from (pid=9342) callit /opt/jenkins/workspace/unit-nova/upstream/nova/virt/xenapi/fake.py:431
  File "/opt/jenkins/workspace/unit-nova/upstream/.nova-venv/lib/python2.6/site-packages/eventlet/hubs/timer.py", line 56, in __call__
    cb(*args, **kw)
  File "/opt/jenkins/workspace/unit-nova/upstream/.nova-venv/lib/python2.6/site-packages/eventlet/semaphore.py", line 95, in _do_acquire
    waiter.switch()
error: cannot switch to a different thread

Ewan Mellor (ewanmellor) wrote :

The traceback from Soren's log:

Traceback (most recent call last):
  File "/usr/lib/python2.7/dist-packages/eventlet/hubs/hub.py", line 336, in fire_timers
2011-08-22 21:58:29,483 DEBUG nova.virt.xenapi.fake [-] Calling host.call_plugin <bound method FakeSessionForVMTests.host_call_plugin of <nova.tests.xenapi.stubs.FakeSessionForVMTests object at 0x1365f490>> from (pid=4594) callit /home/soren/src/openstack/nova/virt-layer-cleanup2/nova/virt/xenapi/fake.py:428
    timer()
  File "/usr/lib/python2.7/dist-packages/eventlet/hubs/timer.py", line 56, in __call__
    cb(*args, **kw)
  File "/usr/lib/python2.7/dist-packages/eventlet/semaphore.py", line 95, in _do_acquire
    waiter.switch()
error: cannot switch to a different thread
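
For context, greenlet raises this error when code running in one OS thread tries to switch to a greenlet that belongs to another thread. The sketch below reproduces that condition in isolation, assuming the greenlet package is installed. The function name `switch_from_other_thread` is hypothetical, and nothing here is taken from nova or from the failing test; it only illustrates what the last line of the traceback means.

```python
import threading


def switch_from_other_thread():
    """Return the exception greenlet raises on a cross-thread switch.

    Returns None when the greenlet package is not installed.
    """
    try:
        from greenlet import greenlet, getcurrent, error as GreenletError
    except ImportError:
        return None

    def body():
        # Suspend immediately and hand control back to the creating thread.
        getcurrent().parent.switch()

    g = greenlet(body)
    g.switch()  # start g in this thread; returns once body() switches back

    caught = []

    def other_thread():
        try:
            g.switch()  # g is bound to the thread that started it
        except GreenletError as exc:
            caught.append(exc)

    t = threading.Thread(target=other_thread)
    t.start()
    t.join()
    return caught[0] if caught else None
```

Calling `switch_from_other_thread()` returns the caught `greenlet.error` instance; the exact message wording may differ between greenlet versions.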

Changed in nova:
status: Confirmed → Fix Committed
Thierry Carrez (ttx)
Changed in nova:
milestone: none → diablo-rbp
Thierry Carrez (ttx)
Changed in nova:
milestone: diablo-rbp → 2011.3
status: Fix Committed → Fix Released