Comment 38 for bug 1771662

Frode Nordahl (fnordahl) wrote : Re: libvirtError: Node device not found: no node device with matching name

1) The 'No compute node record for host phanpy: ComputeHostNotFound_Remote: Compute host phanpy could not be found.' message is benign in itself; it appears on first start of the `nova-compute` service. It keeps appearing in the log here because the service fails to register its available resources. See 3)

2) Technically, the compute hosts are partially registered with `nova`:
$ nova service-list
+----+----------------+---------------------+----------+---------+-------+----------------------------+-----------------+
| Id | Binary         | Host                | Zone     | Status  | State | Updated_at                 | Disabled Reason |
+----+----------------+---------------------+----------+---------+-------+----------------------------+-----------------+
| 1  | nova-conductor | juju-302a0a-2-lxd-2 | internal | enabled | up    | 2018-06-29T10:28:00.000000 | -               |
| 14 | nova-scheduler | juju-302a0a-2-lxd-2 | internal | enabled | up    | 2018-06-29T10:28:01.000000 | -               |
| 15 | nova-compute   | phanpy              | nova     | enabled | up    | 2018-06-29T10:28:01.000000 | -               |
| 16 | nova-compute   | aurorus             | nova     | enabled | up    | 2018-06-29T10:28:05.000000 | -               |
| 26 | nova-compute   | zygarde             | nova     | enabled | up    | 2018-06-29T10:28:05.000000 | -               |
+----+----------------+---------------------+----------+---------+-------+----------------------------+-----------------+

3) However, the compute hosts do not have any resources. No resources appear in `nova` because the `nova-compute` service hits a traceback during initial host registration:

2018-06-29 06:25:57.161 35528 ERROR nova.compute.manager Traceback (most recent call last):
2018-06-29 06:25:57.161 35528 ERROR nova.compute.manager File "/usr/lib/python2.7/dist-packages/nova/compute/manager.py", line 7277, in update_available_resource_for_node
2018-06-29 06:25:57.161 35528 ERROR nova.compute.manager rt.update_available_resource(context, nodename)
2018-06-29 06:25:57.161 35528 ERROR nova.compute.manager File "/usr/lib/python2.7/dist-packages/nova/compute/resource_tracker.py", line 664, in update_available_resource
2018-06-29 06:25:57.161 35528 ERROR nova.compute.manager resources = self.driver.get_available_resource(nodename)
2018-06-29 06:25:57.161 35528 ERROR nova.compute.manager File "/usr/lib/python2.7/dist-packages/nova/virt/libvirt/driver.py", line 6438, in get_available_resource
2018-06-29 06:25:57.161 35528 ERROR nova.compute.manager self._get_pci_passthrough_devices()
2018-06-29 06:25:57.161 35528 ERROR nova.compute.manager File "/usr/lib/python2.7/dist-packages/nova/virt/libvirt/driver.py", line 5945, in _get_pci_passthrough_devices
2018-06-29 06:25:57.161 35528 ERROR nova.compute.manager pci_info.append(self._get_pcidev_info(name))
2018-06-29 06:25:57.161 35528 ERROR nova.compute.manager File "/usr/lib/python2.7/dist-packages/nova/virt/libvirt/driver.py", line 5906, in _get_pcidev_info
2018-06-29 06:25:57.161 35528 ERROR nova.compute.manager device.update(_get_device_capabilities(device, address))
2018-06-29 06:25:57.161 35528 ERROR nova.compute.manager File "/usr/lib/python2.7/dist-packages/nova/virt/libvirt/driver.py", line 5877, in _get_device_capabilities
2018-06-29 06:25:57.161 35528 ERROR nova.compute.manager pcinet_info = self._get_pcinet_info(address)
2018-06-29 06:25:57.161 35528 ERROR nova.compute.manager File "/usr/lib/python2.7/dist-packages/nova/virt/libvirt/driver.py", line 5820, in _get_pcinet_info
2018-06-29 06:25:57.161 35528 ERROR nova.compute.manager virtdev = self._host.device_lookup_by_name(devname)
2018-06-29 06:25:57.161 35528 ERROR nova.compute.manager File "/usr/lib/python2.7/dist-packages/nova/virt/libvirt/host.py", line 838, in device_lookup_by_name
2018-06-29 06:25:57.161 35528 ERROR nova.compute.manager return self.get_connection().nodeDeviceLookupByName(name)
2018-06-29 06:25:57.161 35528 ERROR nova.compute.manager File "/usr/lib/python2.7/dist-packages/eventlet/tpool.py", line 186, in doit
2018-06-29 06:25:57.161 35528 ERROR nova.compute.manager result = proxy_call(self._autowrap, f, *args, **kwargs)
2018-06-29 06:25:57.161 35528 ERROR nova.compute.manager File "/usr/lib/python2.7/dist-packages/eventlet/tpool.py", line 144, in proxy_call
2018-06-29 06:25:57.161 35528 ERROR nova.compute.manager rv = execute(f, *args, **kwargs)
2018-06-29 06:25:57.161 35528 ERROR nova.compute.manager File "/usr/lib/python2.7/dist-packages/eventlet/tpool.py", line 125, in execute
2018-06-29 06:25:57.161 35528 ERROR nova.compute.manager six.reraise(c, e, tb)
2018-06-29 06:25:57.161 35528 ERROR nova.compute.manager File "/usr/lib/python2.7/dist-packages/eventlet/tpool.py", line 83, in tworker
2018-06-29 06:25:57.161 35528 ERROR nova.compute.manager rv = meth(*args, **kwargs)
2018-06-29 06:25:57.161 35528 ERROR nova.compute.manager File "/usr/lib/python2.7/dist-packages/libvirt.py", line 4177, in nodeDeviceLookupByName
2018-06-29 06:25:57.161 35528 ERROR nova.compute.manager if ret is None:raise libvirtError('virNodeDeviceLookupByName() failed', conn=self)
2018-06-29 06:25:57.161 35528 ERROR nova.compute.manager libvirtError: Node device not found: no node device with matching name 'net_enP2p1s0f2_02_0f_b7_00_00_01'

As both the above analysis and the referenced mailing list thread suggest, the hardware/driver in question has a peculiar operating mode in that it is not possible to query which physical function (PF) a virtual function (VF) belongs to:

$ sudo virsh nodedev-list|grep net
<empty>

$ grep libvirtd /var/log/syslog|head -20
Jun 29 08:07:00 phanpy libvirtd[2487]: 2018-06-29 08:07:00.616+0000: 2487: error : virNetSocketReadWire:1811 : End of file while reading data: Input/output error
Jun 29 08:18:37 phanpy libvirtd[110330]: 2018-06-29 08:18:37.946+0000: 110362: info : libvirt version: 4.0.0, package: 1ubuntu8.2 (Marc Deslauriers <email address hidden> Wed, 23 May 2018 13:23:01 -0400)
Jun 29 08:18:37 phanpy libvirtd[110330]: 2018-06-29 08:18:37.946+0000: 110362: info : hostname: phanpy
Jun 29 08:18:37 phanpy libvirtd[110330]: 2018-06-29 08:18:37.946+0000: 110362: error : virCPUGetHost:457 : this function is not supported by the connection driver: cannot detect host CPU model for aarch64 architecture
Jun 29 08:18:37 phanpy libvirtd[110330]: 2018-06-29 08:18:37.946+0000: 110362: error : virCPUGetHost:457 : this function is not supported by the connection driver: cannot detect host CPU model for aarch64 architecture
Jun 29 08:18:37 phanpy libvirtd[110330]: 2018-06-29 08:18:37.958+0000: 110362: error : virCPUGetHost:457 : this function is not supported by the connection driver: cannot detect host CPU model for aarch64 architecture
Jun 29 08:18:37 phanpy libvirtd[110330]: 2018-06-29 08:18:37.958+0000: 110362: error : virCPUGetHost:457 : this function is not supported by the connection driver: cannot detect host CPU model for aarch64 architecture
Jun 29 08:18:37 phanpy libvirtd[110330]: 2018-06-29 08:18:37.973+0000: 110362: error : virCPUGetHost:457 : this function is not supported by the connection driver: cannot detect host CPU model for aarch64 architecture
Jun 29 08:18:37 phanpy libvirtd[110330]: 2018-06-29 08:18:37.974+0000: 110362: error : virCPUGetHost:457 : this function is not supported by the connection driver: cannot detect host CPU model for aarch64 architecture
Jun 29 08:18:38 phanpy libvirtd[110330]: 2018-06-29 08:18:38.189+0000: 110368: error : virNetDevGetPhysicalFunction:1434 : internal error: The PF device for VF enP2p1s0f1 has no network device name
Jun 29 08:18:38 phanpy libvirtd[110330]: 2018-06-29 08:18:38.193+0000: 110368: error : virNetDevGetPhysicalFunction:1434 : internal error: The PF device for VF enP2p1s0f2 has no network device name
Jun 29 08:18:38 phanpy libvirtd[110330]: 2018-06-29 08:18:38.197+0000: 110368: error : virNetDevGetPhysicalFunction:1434 : internal error: The PF device for VF enP2p1s0f3 has no network device name
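To see at a glance which VFs libvirt could not map to a PF, the `virNetDevGetPhysicalFunction` errors above can be pulled out of the log with a few lines of Python. This is only a rough sketch: the function name `vfs_without_pf_name` is made up for illustration, and the pattern assumes the message wording shown in the excerpt above.

```python
import re

# Matches the libvirtd error shown in the syslog excerpt and captures the
# VF interface name. The wording is taken from that excerpt; other libvirt
# versions may phrase the message differently.
VF_ERROR = re.compile(
    r"virNetDevGetPhysicalFunction:\d+ : internal error: "
    r"The PF device for VF (\S+) has no network device name"
)


def vfs_without_pf_name(lines):
    """Return the VF names libvirtd complained about, in first-seen order."""
    seen = []
    for line in lines:
        match = VF_ERROR.search(line)
        if match and match.group(1) not in seen:
            seen.append(match.group(1))
    return seen


if __name__ == "__main__":
    # Path as used in the grep command above.
    with open("/var/log/syslog") as syslog:
        for vf in vfs_without_pf_name(syslog):
            print(vf)
```

The device name in the Nova traceback ('net_enP2p1s0f2_02_0f_b7_00_00_01') contains one of the VF names reported here, which ties the two logs together.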

I have collected a run of libvirtd with debugging enabled and will attach a log from that.

The root of the issue is that the hardware/driver behaves differently from what both libvirt and Nova expect: it is not possible to query which PF a VF belongs to. What is still unclear to me is why this would behave differently on Xenial compared to Bionic.

In conclusion, I would suggest that the `libvirt` driver in `nova` itself could handle this case more gracefully by extending the exception check in `_get_pci_passthrough_devices()` in `nova/virt/libvirt/driver.py`.
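To illustrate the idea, here is a minimal, self-contained sketch of the pattern. It is not the actual Nova code and not a proposed patch: `collect_pci_info`, `NodeDeviceError` and `NO_NODE_DEVICE` are stand-ins invented so the example runs without libvirt installed. In the real driver the exception would be `libvirt.libvirtError` and the code to compare against would be `libvirt.VIR_ERR_NO_NODE_DEVICE`, as returned by `get_error_code()`. Whether skipping the device is the right behaviour is an assumption that would need discussion.

```python
# Stand-ins so the sketch runs without libvirt. In real code these would be
# libvirt.libvirtError and libvirt.VIR_ERR_NO_NODE_DEVICE.
NO_NODE_DEVICE = "no-node-device"  # placeholder, not the real numeric code


class NodeDeviceError(Exception):
    """Hypothetical stand-in mimicking libvirtError.get_error_code()."""

    def __init__(self, message, code):
        super().__init__(message)
        self._code = code

    def get_error_code(self):
        return self._code


def collect_pci_info(dev_names, get_pcidev_info, log=print):
    """Gather PCI device info, skipping devices libvirt cannot look up.

    Only the "node device not found" case is tolerated; any other error
    is re-raised so real failures are not hidden.
    """
    pci_info = []
    for name in dev_names:
        try:
            pci_info.append(get_pcidev_info(name))
        except NodeDeviceError as ex:
            if ex.get_error_code() != NO_NODE_DEVICE:
                raise
            log("Skipping PCI device %s: %s" % (name, ex))
    return pci_info
```

With a check like this, a single VF whose net device is unknown to libvirt would be logged and skipped instead of aborting `update_available_resource` for the whole host. An alternative would be to catch the error closer to the failing lookup, so the PCI device is still reported but without its network capabilities.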