Comment 28 for bug 1597596

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix merged to nova (master)

Reviewed: https://review.openstack.org/520248
Committed: https://git.openstack.org/cgit/openstack/nova/commit/?id=3a503a8f2b934f19049531c5c92130ca7cdd6a7f
Submitter: Zuul
Branch: master

commit 3a503a8f2b934f19049531c5c92130ca7cdd6a7f
Author: Matt Riedemann <email address hidden>
Date: Wed Nov 15 19:15:44 2017 -0500

    Always deallocate networking before reschedule if using Neutron

    When a server build fails on a selected compute host, the compute
    service will cast to conductor which calls the scheduler to select
    another host to attempt the build if retries are not exhausted.

    With commit 08d24b733ee9f4da44bfbb8d6d3914924a41ccdc, if retries
    are exhausted or the scheduler raises NoValidHost, conductor will
    deallocate networking for the instance. In the case of neutron, this
    means unbinding any ports that the user provided with the server
    create request and deleting any ports that nova-compute created during
    the allocate_for_instance() operation during server build.

    When an instance is deleted, it's networking is deallocated in the same
    way - unbind pre-existing ports, delete ports that nova created.

    The problem is when rescheduling from a failed host, if we successfully
    reschedule and build on a secondary host, any ports created from the
    original host are not cleaned up until the instance is deleted. For
    Ironic or SR-IOV ports, those are always deallocated.

    The ComputeDriver.deallocate_networks_on_reschedule() method defaults
    to False just so that the Ironic driver could override it, but really
    we should always cleanup neutron ports before rescheduling.

    Looking over bug report history, there are some mentions of different
    networking backends handling reschedules with multiple ports differently,
    in that sometimes it works and sometimes it fails. Regardless of the
    networking backend, however, we are at worst taking up port quota for
    the tenant for ports that will not be bound to whatever host the instance
    ends up on.

    There could also be legacy reasons for this behavior with nova-network,
    so that is side-stepped here by just restricting this check to whether
    or not neutron is being used. When we eventually remove nova-network we
    can then also remove the deallocate_networks_on_reschedule() method and
    SR-IOV check.

    Change-Id: Ib2abf73166598ff14fce4e935efe15eeea0d4f7d
    Closes-Bug: #1597596