Failure to schedule if flavor contains non-CPU flag traits

Bug #1843836 reported by Stephen Finucane
This bug affects 1 person
Affects: OpenStack Compute (nova)
Status: Fix Released
Importance: High
Assigned to: Stephen Finucane

Bug Description

I'm seeing the following error locally:

Sep 12 18:52:25 compute-small nova-conductor[28968]: ERROR nova.scheduler.utils [None req-b86b25c8-c89e-4449-bec3-c94948402e02 demo admin] [instance: a4056430-ed06-4cea-91b9-e15fd4b1979f] Error from last host: compute-small (node compute-small): [u'Traceback (most recent call last):\n', u' File "/opt/stack/nova/nova/compute/manager.py", line 2038, in _do_build_and_run_instance\n filter_properties, request_spec)\n', u' File "/opt/stack/nova/nova/compute/manager.py", line 2408, in _build_and_run_instance\n instance_uuid=instance.uuid, reason=six.text_type(e))\n', u"RescheduledException: Build of instance a4056430-ed06-4cea-91b9-e15fd4b1979f was re-scheduled: No CPU model match traits, models: ['IvyBridge-IBRS'], required flags: set([None])\n"]

This affects me when testing the 'PCPU' feature, because we're rewriting 'hw:cpu_thread_policy' to add a 'HW_CPU_HYPERTHREADING' trait. However, it can happen with any non-CPU flag trait (e.g. COMPUTE_SUPPORTS_MULTIATTACH) because of the following code:

https://github.com/openstack/nova/blob/7a18209a8/nova/virt/libvirt/utils.py#L600

That means we can return a set containing 'None', which causes this later check to fail:

https://github.com/openstack/nova/blob/7a18209a81539217a95ab7daad6bc67002768950/nova/virt/libvirt/driver.py#L4083

The check fails because no CPU model will ever report a 'None' feature flag.
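A minimal sketch of the failure mode, for illustration only. The mapping and the names TRAIT_TO_CPU_FLAG, flags_from_traits_buggy and model_flags are hypothetical stand-ins, not nova's actual code or data:

```python
# Hypothetical trait-to-flag mapping; the real table in nova is larger.
TRAIT_TO_CPU_FLAG = {
    "HW_CPU_X86_AVX": "avx",
    "HW_CPU_X86_SSE42": "sse4.2",
}


def flags_from_traits_buggy(required_traits):
    # dict.get returns None for any trait that is not a CPU flag trait,
    # and that None ends up in the returned set.
    return {TRAIT_TO_CPU_FLAG.get(trait) for trait in required_traits}


required = flags_from_traits_buggy(["HW_CPU_HYPERTHREADING"])
model_flags = {"avx", "sse4.2", "aes"}  # made-up flags for one CPU model

print(required)                        # {None}
print(bool(required))                  # True: the set is not empty
print(required.issubset(model_flags))  # False: no model has a None flag
```

Because the set is non-empty, the model-matching step runs, and the subset test can never succeed.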

description: updated
Changed in nova:
assignee: nobody → Stephen Finucane (stephenfinucane)
status: New → In Progress
Matt Riedemann (mriedem)
tags: added: train-rc-potential
Matt Riedemann (mriedem) wrote :
Changed in nova:
importance: Undecided → High
Matt Riedemann (mriedem) wrote :

And this was the change that introduced the regression: https://review.opendev.org/#/c/670298/

OpenStack Infra (hudson-openstack) wrote : Fix merged to nova (master)

Reviewed: https://review.opendev.org/681932
Committed: https://git.openstack.org/cgit/openstack/nova/commit/?id=6ec09de2435fd849ba8123d587cce34e1d8b5ec7
Submitter: Zuul
Branch: master

commit 6ec09de2435fd849ba8123d587cce34e1d8b5ec7
Author: Stephen Finucane <email address hidden>
Date: Thu Sep 12 20:57:15 2019 +0100

    libvirt: Correctly handle non-CPU flag traits

    The 'get_flags_by_flavor_specs' function is intended to return a list of
    CPU flags extracted from flavor extra spec traits, the idea being that
    you can request a specific CPU flag using traits. However, this looks
    through every trait in the image and uses 'dict.get' to try figure out
    if the trait is a CPU flag trait. 'dict.get' returns None if no match is
    found, so we can end up returning 'set([None])'. This isn't false'y,
    which means we end up calling '_match_cpu_model_by_flags' later on and
    *that* fails because 'set([None])' won't be a subset of any CPU model's
    set of flags (no CPU has a 'None' or null flag).

    The solution is easy - don't add the None values.

    Change-Id: I1468ad4b724b8d0e3a855c329bd8c8af513d986c
    Signed-off-by: Stephen Finucane <email address hidden>
    Closes-Bug: #1843836
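
The approach described in the commit message (skip traits that have no CPU flag) could look roughly like the sketch below. This is not the merged patch; TRAIT_TO_CPU_FLAG and flags_from_traits are hypothetical names with a made-up mapping:

```python
# Hypothetical trait-to-flag mapping, used only for this illustration.
TRAIT_TO_CPU_FLAG = {
    "HW_CPU_X86_AVX": "avx",
    "HW_CPU_X86_SSE42": "sse4.2",
}


def flags_from_traits(required_traits):
    # Keep only traits that actually map to a CPU flag, so that None
    # never ends up in the result.
    return {
        TRAIT_TO_CPU_FLAG[trait]
        for trait in required_traits
        if trait in TRAIT_TO_CPU_FLAG
    }


# Non-CPU traits alone yield an empty (falsy) set, so no model matching
# would be attempted.
only_other = flags_from_traits(
    ["HW_CPU_HYPERTHREADING", "COMPUTE_SUPPORTS_MULTIATTACH"])

# Mixed input keeps just the real CPU flags.
mixed = flags_from_traits(["HW_CPU_X86_AVX", "HW_CPU_HYPERTHREADING"])
```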

Changed in nova:
status: In Progress → Fix Released
OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/nova 20.0.0.0rc1

This issue was fixed in the openstack/nova 20.0.0.0rc1 release candidate.
