Description
===========
nova-live-migration fails due to the following error on the src during a migration test:
https://zuul.opendev.org/t/openstack/build/76c37d93ba274cd6b90fbe0659255959/log/compute1/logs/screen-n-cpu.txt
Jan 18 12:03:25.145915 ubuntu-focal-limestone-regionone-0022589731 nova-compute[52916]: ERROR nova.virt.libvirt.driver [-] [instance: 9b9d2344-6c8e-4dba-a8f9-60f86e52c854] Migration operation has aborted
Jan 18 12:03:25.194466 ubuntu-focal-limestone-regionone-0022589731 nova-compute[52916]: INFO nova.compute.manager [-] [instance: 9b9d2344-6c8e-4dba-a8f9-60f86e52c854] Swapping old allocation on dict_keys(['022fd821-f027-4019-ac4b-cb1e1d2d5423']) held by migration 7b332f03-b32e-4cc8-8e2d-23ed7bee03bd for instance
Jan 18 12:03:25.239903 ubuntu-focal-limestone-regionone-0022589731 nova-compute[52916]: ERROR nova.virt.libvirt.driver [-] [instance: 9b9d2344-6c8e-4dba-a8f9-60f86e52c854] Live Migration failure: unable to connect to server at 'ubuntu-focal-limestone-regionone-0022589715:49152': Connection refused: libvirt.libvirtError: unable to connect to server at 'ubuntu-focal-limestone-regionone-0022589715:49152': Connection refused
Jan 18 12:03:25.240119 ubuntu-focal-limestone-regionone-0022589731 nova-compute[52916]: DEBUG nova.virt.libvirt.driver [-] [instance: 9b9d2344-6c8e-4dba-a8f9-60f86e52c854] Migration operation thread notification {{(pid=52916) thread_finished /opt/stack/nova/nova/virt/libvirt/driver.py:9448}}
Jan 18 12:03:25.251131 ubuntu-focal-limestone-regionone-0022589731 nova-compute[52916]: Traceback (most recent call last):
Jan 18 12:03:25.251131 ubuntu-focal-limestone-regionone-0022589731 nova-compute[52916]: File "/usr/local/lib/python3.8/dist-packages/eventlet/hubs/hub.py", line 476, in fire_timers
Jan 18 12:03:25.251131 ubuntu-focal-limestone-regionone-0022589731 nova-compute[52916]: timer()
Jan 18 12:03:25.251131 ubuntu-focal-limestone-regionone-0022589731 nova-compute[52916]: File "/usr/local/lib/python3.8/dist-packages/eventlet/hubs/timer.py", line 59, in __call__
Jan 18 12:03:25.251131 ubuntu-focal-limestone-regionone-0022589731 nova-compute[52916]: cb(*args, **kw)
Jan 18 12:03:25.251131 ubuntu-focal-limestone-regionone-0022589731 nova-compute[52916]: File "/usr/local/lib/python3.8/dist-packages/eventlet/event.py", line 175, in _do_send
Jan 18 12:03:25.251131 ubuntu-focal-limestone-regionone-0022589731 nova-compute[52916]: waiter.switch(result)
Jan 18 12:03:25.251131 ubuntu-focal-limestone-regionone-0022589731 nova-compute[52916]: File "/usr/local/lib/python3.8/dist-packages/eventlet/greenthread.py", line 221, in main
Jan 18 12:03:25.251131 ubuntu-focal-limestone-regionone-0022589731 nova-compute[52916]: result = function(*args, **kwargs)
Jan 18 12:03:25.251131 ubuntu-focal-limestone-regionone-0022589731 nova-compute[52916]: File "/opt/stack/nova/nova/utils.py", line 660, in context_wrapper
Jan 18 12:03:25.251131 ubuntu-focal-limestone-regionone-0022589731 nova-compute[52916]: return func(*args, **kwargs)
Jan 18 12:03:25.251131 ubuntu-focal-limestone-regionone-0022589731 nova-compute[52916]: File "/opt/stack/nova/nova/virt/libvirt/driver.py", line 9103, in _live_migration_operation
Jan 18 12:03:25.251131 ubuntu-focal-limestone-regionone-0022589731 nova-compute[52916]: LOG.error("Live Migration failure: %s", e, instance=instance)
Jan 18 12:03:25.251131 ubuntu-focal-limestone-regionone-0022589731 nova-compute[52916]: File "/usr/local/lib/python3.8/dist-packages/oslo_utils/excutils.py", line 220, in __exit__
Jan 18 12:03:25.251131 ubuntu-focal-limestone-regionone-0022589731 nova-compute[52916]: self.force_reraise()
Jan 18 12:03:25.251131 ubuntu-focal-limestone-regionone-0022589731 nova-compute[52916]: File "/usr/local/lib/python3.8/dist-packages/oslo_utils/excutils.py", line 196, in force_reraise
Jan 18 12:03:25.251131 ubuntu-focal-limestone-regionone-0022589731 nova-compute[52916]: six.reraise(self.type_, self.value, self.tb)
Jan 18 12:03:25.251131 ubuntu-focal-limestone-regionone-0022589731 nova-compute[52916]: File "/usr/local/lib/python3.8/dist-packages/six.py", line 703, in reraise
Jan 18 12:03:25.251131 ubuntu-focal-limestone-regionone-0022589731 nova-compute[52916]: raise value
Jan 18 12:03:25.251131 ubuntu-focal-limestone-regionone-0022589731 nova-compute[52916]: File "/opt/stack/nova/nova/virt/libvirt/driver.py", line 9091, in _live_migration_operation
Jan 18 12:03:25.251131 ubuntu-focal-limestone-regionone-0022589731 nova-compute[52916]: guest.migrate(self._live_migration_uri(dest),
Jan 18 12:03:25.251131 ubuntu-focal-limestone-regionone-0022589731 nova-compute[52916]: File "/opt/stack/nova/nova/virt/libvirt/guest.py", line 649, in migrate
Jan 18 12:03:25.251131 ubuntu-focal-limestone-regionone-0022589731 nova-compute[52916]: self._domain.migrateToURI3(
Jan 18 12:03:25.251131 ubuntu-focal-limestone-regionone-0022589731 nova-compute[52916]: File "/usr/local/lib/python3.8/dist-packages/eventlet/tpool.py", line 190, in doit
Jan 18 12:03:25.251131 ubuntu-focal-limestone-regionone-0022589731 nova-compute[52916]: result = proxy_call(self._autowrap, f, *args, **kwargs)
Jan 18 12:03:25.251131 ubuntu-focal-limestone-regionone-0022589731 nova-compute[52916]: File "/usr/local/lib/python3.8/dist-packages/eventlet/tpool.py", line 148, in proxy_call
Jan 18 12:03:25.251131 ubuntu-focal-limestone-regionone-0022589731 nova-compute[52916]: rv = execute(f, *args, **kwargs)
Jan 18 12:03:25.251131 ubuntu-focal-limestone-regionone-0022589731 nova-compute[52916]: File "/usr/local/lib/python3.8/dist-packages/eventlet/tpool.py", line 129, in execute
Jan 18 12:03:25.251131 ubuntu-focal-limestone-regionone-0022589731 nova-compute[52916]: six.reraise(c, e, tb)
Jan 18 12:03:25.251131 ubuntu-focal-limestone-regionone-0022589731 nova-compute[52916]: File "/usr/local/lib/python3.8/dist-packages/six.py", line 703, in reraise
Jan 18 12:03:25.251131 ubuntu-focal-limestone-regionone-0022589731 nova-compute[52916]: raise value
Jan 18 12:03:25.251131 ubuntu-focal-limestone-regionone-0022589731 nova-compute[52916]: File "/usr/local/lib/python3.8/dist-packages/eventlet/tpool.py", line 83, in tworker
Jan 18 12:03:25.251131 ubuntu-focal-limestone-regionone-0022589731 nova-compute[52916]: rv = meth(*args, **kwargs)
Jan 18 12:03:25.251131 ubuntu-focal-limestone-regionone-0022589731 nova-compute[52916]: File "/usr/local/lib/python3.8/dist-packages/libvirt.py", line 2099, in migrateToURI3
Jan 18 12:03:25.251131 ubuntu-focal-limestone-regionone-0022589731 nova-compute[52916]: raise libvirtError('virDomainMigrateToURI3() failed')
Jan 18 12:03:25.251131 ubuntu-focal-limestone-regionone-0022589731 nova-compute[52916]: libvirt.libvirtError: unable to connect to server at 'ubuntu-focal-limestone-regionone-0022589715:49152': Connection refused
Steps to reproduce
==================
$ cat queries/libvirt.yaml
query: >-
message:"libvirt.libvirtError: unable to connect to server at" AND
tags:screen-n-cpu.txt
$ elastic-recheck-query queries/libvirt.yaml
total hits: 12
build_branch
83% master
16% stable/ussuri
build_change
16% 711625
16% 743231
16% 754702
16% 764401
16% 770547
build_hostids
16% 13cc9efd4550bb9c731b89fae9c27cc77058360313dea9eebe140cac 1ccf2ea621af4ba5d658324f8716441c624749f7847700ac6aa69490
16% 322bfced1bb62c11480e9c8cf97e066ab7cb1067b23b38e8004c6f69 23320365a917d59498b30bb8308684708801795984a6ca42b1cd06b7
16% 422fc847b99ec49a4a48de150fd4c453e34d4b85847603a3435e1511 e1ef7b86aaeb7f61d567417ced5d641b20cff65c4c36e91a6173705f
16% 77681beb0ddf7a44597e1e8fa6f648861b5edf5445510df6083cdb72 c4902db85fd9c45a1c8b61c8e612d117a8a8f2393128fd1cae285326
16% 82ed50455b38e6ed7725b8020a977c59f2aaa9db51f018ed12e648b2 3a893be3caac821988438433df3cb21c389ffed93878670819a924f4 24505d497fe40296a37e97a5a1aa1caf33a2c685b883fb60eabf95d5
build_name
16% devstack-plugin-ceph-multinode-tempest-py3
16% neutron-ovn-tempest-full-multinode-ovs-master
16% neutron-tempest-dvr-ha-multinode-full
16% nova-live-migration
16% nova-next
build_node
83% ubuntu-focal
16% ubuntu-bionic
build_queue
100% check
build_status
100% FAILURE
build_zuul_url
100% N/A
filename
100% compute1/logs/screen-n-cpu.txt
log_url
16% https://7b4153b100b6f570d4fc-1ca4271300c9fb8c2811a73921fc79fe.ssl.cf2.rackcdn.com/764401/14/check/neutron-tempest-dvr-ha-multinode-full/d253afd/compute1/logs/screen-n-cpu.txt
16% https://b0f3338313a78e19953a-f09976335bb1c2233a92ccf49f573637.ssl.cf5.rackcdn.com/754702/5/check/nova-live-migration/76c37d9/compute1/logs/screen-n-cpu.txt
16% https://cfb2c6e30c49a7b551c1-e4b92f5e1fb788c5da82bf1447ffa549.ssl.cf2.rackcdn.com/711625/27/check/devstack-plugin-ceph-multinode-tempest-py3/677c92c/compute1/logs/screen-n-cpu.txt
16% https://e7591f9adbfbe7dc24c4-cd07d3426af01024dbeaac2f8639d2e0.ssl.cf2.rackcdn.com/743231/9/check/nova-next/cbe27d5/compute1/logs/screen-n-cpu.txt
16% https://storage.gra.cloud.ovh.net/v1/AUTH_dcaab5e32b234d56b626f72581e3644c/zuul_opendev_logs_09a/770547/1/check/tempest-multinode-full-py3/09a5f18/compute1/logs/screen-n-cpu.txt
loglevel
100% ERROR
module
100% nova.virt.libvirt.driver
node_provider
33% ovh-bhs1
16% airship-kna1
16% limestone-regionone
16% rax-ord
16% vexxhost-ca-ymq-1
port
16% 39628
16% 41358
16% 46144
16% 52362
16% 58212
project
50% openstack/neutron
33% openstack/nova
16% openstack/devstack-plugin-ceph
syslog_pid
16% 19491
16% 46189
16% 46724
16% 52916
16% 53510
syslog_program
16% ubuntu-bionic-ovh-bhs1-0022571568 nova-compute
16% ubuntu-focal-airship-kna1-0022565736 nova-compute
16% ubuntu-focal-limestone-regionone-0022589731 nova-compute
16% ubuntu-focal-ovh-bhs1-0022599932 nova-compute
16% ubuntu-focal-rax-ord-0022553677 nova-compute
tags
50% screen-n-cpu.txt screen oslofmt
50% screen-n-cpu.txt screen oslofmt _grokparsefailure
voting
50% 0
50% 1
zuul_attempts
83% 1
16% 2
zuul_executor
33% ze06.openstack.org
33% ze11.openstack.org
16% ze04.openstack.org
16% ze09.openstack.org
Expected result
===============
No connection issues with libvirtd.
Actual result
=============
Connection issues with libvirtd cause CI runs to fail.
Environment
===========
1. Exact version of OpenStack you are running. See the following
list for all releases: http://docs.openstack.org/releases/
As above, master and stable/ussuri appear impacted thus far.
2. Which hypervisor did you use?
(For example: Libvirt + KVM, Libvirt + XEN, Hyper-V, PowerKVM, ...)
What's the version of that?
libvirt
2. Which storage type did you use?
(For example: Ceph, LVM, GPFS, ...)
What's the version of that?
N/A
3. Which networking type did you use?
(For example: nova-network, Neutron with OpenVSwitch, ...)
N/A
Logs & Configs
==============
Lee on IRC mentioned he hit something similar today again; copy/pasting his comment from this review[1]:
Weird one, n-cpu (well the libvirt module) on the source (compute1) reports that libvirtd on the dest (controller) has refused the connection but the migration had already failed on the dest.
This smells like a python-libvirt bug tbh, surely the migrateToURI3 call on the source should be able to detect the failure on the dest and return a useful error to the caller?
------- ------- ------- ------- ------- ------- ------- ------- ------- ------- - /zuul.opendev. org/t/openstack /build/ f3b829801901417 c9310ad5cc5a0e8 86/log/ compute1/ logs/screen- n-cpu.txt# 11964
https:/
Jun 10 06:41:59.477301 ubuntu- focal-rax- dfw-0025050045 nova-compute[ 67549]: DEBUG nova.virt. libvirt. driver [-] [instance: 0a929010- 7361-4372- a785-f9eb052aca 46] Migration operation thread notification {{(pid=67549) thread_finished /opt/stack/ nova/nova/ virt/libvirt/ driver. py:9921} } focal-rax- dfw-0025050045 nova-compute[ 67549]: Traceback (most recent call last): focal-rax- dfw-0025050045 nova-compute[ 67549]: File "/usr/local/ lib/python3. 8/dist- packages/ eventlet/ hubs/hub. py", line 476, in fire_timers focal-rax- dfw-0025050045 nova-compute[ 67549]: timer() focal-rax- dfw-0025050045 nova-compute[ 67549]: File "/usr/local/ lib/python3. 8/dist- packages/ eventlet/ hubs/timer. py", line 59, in __call__ focal-rax- dfw-0025050045 nova-compute[ 67549]: cb(*args, **kw) focal-rax- dfw-0025050045 nova-compute[ 67549]: File "/usr/local/ lib/python3. 8/dist- packages/ eventlet/ event.py" , line 175, in _do_send focal-rax- dfw-0025050045 nova-compute[ 67549]: waiter. switch( result) focal-rax- dfw-0025050045 nova-compute[ 67549]: File "/usr/local/ lib/python3. 8/dist- packages/ eventlet/ greenthread. py", line 221, in main focal-rax- dfw-0025050045 nova-compute[ 67549]: result = function(*args, **kwargs) focal-rax- dfw-0025050045 nova-compute[ 67549]: File "/opt/stack/ nova/nova/ utils.py" , line 654, in context_wrapper focal-rax- dfw-0025050045 nova-compute[ 67549]: return func(*args, **kwargs) focal-rax- dfw-0025050045 nova-compute[ 67549]: File "/opt/stack/ nova/nova/ virt/libvirt/ driver. py", line 9576, in _live_migration _operation focal-rax- dfw-0025050045 nova-compute[ 67549]: LOG.error("Live Migration failure: %s", e, instance=instance) focal-rax- dfw-0025050045 nova-compute[ 67549]: File "/usr/local/ lib/python3. 8/dist- packages/ oslo_utils/ excutils. py", line 227, in __exit__ focal-rax- dfw-0025050045 nova-compute[ 67549]: self.force_ reraise( ) focal-rax- dfw-0025050045 nova-compute[ 67549]: File "/usr/local/ lib/python3. 8/dist- packages/ oslo_utils/ excutils. py", line 200, in force_reraise focal-rax- dfw-0025050045 ...
Jun 10 06:41:59.486684 ubuntu-
Jun 10 06:41:59.486684 ubuntu-
Jun 10 06:41:59.486684 ubuntu-
Jun 10 06:41:59.486684 ubuntu-
Jun 10 06:41:59.486684 ubuntu-
Jun 10 06:41:59.486684 ubuntu-
Jun 10 06:41:59.486684 ubuntu-
Jun 10 06:41:59.486684 ubuntu-
Jun 10 06:41:59.486684 ubuntu-
Jun 10 06:41:59.486684 ubuntu-
Jun 10 06:41:59.486684 ubuntu-
Jun 10 06:41:59.486684 ubuntu-
Jun 10 06:41:59.486684 ubuntu-
Jun 10 06:41:59.486684 ubuntu-
Jun 10 06:41:59.486684 ubuntu-
Jun 10 06:41:59.486684 ubuntu-
Jun 10 06:41:59.486684 ubuntu-