Comment 4 for bug 1921150

OpenStack Infra (hudson-openstack) wrote : Fix merged to neutron (master)

Reviewed: https://review.opendev.org/c/openstack/neutron/+/782553
Committed: https://opendev.org/openstack/neutron/commit/7f35e4e857f7c6e83c635125ce9b42df6e10a510
Submitter: "Zuul (22348)"
Branch: master

commit 7f35e4e857f7c6e83c635125ce9b42df6e10a510
Author: Bence Romsics <email address hidden>
Date: Tue Mar 23 14:07:36 2021 +0100

    Physical NIC RP should be child of agent RP

    In the fix for #1853840 I made a mistake: since then we have created
    the physical NIC resource providers as children of the hypervisor
    resource provider instead of the agent resource provider. See here:

    https://review.opendev.org/c/openstack/neutron/+/696600/3/neutron/agent/common/placement_report.py#159

    This *did not* break the minimum bandwidth aware scheduling.
    But still there are multiple problems:

    1) If you created your physical NIC RPs before the fix for #1853840
       but upgraded to after the fix for #1853840, then resource syncs
       will throw an error in neutron-server at each physical NIC RP
       update. That pollutes the logs and wastes some resources since
       the prohibited update will be forever retried.

    2) If you created your physical NIC RPs after the fix for #1853840
       then your physical NIC RPs have the wrong parent. Again, this
       does not break minimum bandwidth aware scheduling, but it may pose
       problems for later features wanting to build on the originally
       planned RP tree structure.

    3) Cleanup of decommissioned RPs is a bit different than expected.
       This cleanup was always left to the admin, so it only affects a
       manual process.

    The proper RP structure was and should be the following:

    The hypervisor RP(s) must be the root(s).
    As a child of each hypervisor RP, there should be an agent RP.
    The physical NIC RPs should be the children of the agent RPs.

    Unfortunately at the moment the Placement API generically prohibits
    update of the parent resource provider id in a PUT request:

    https://docs.openstack.org/api-ref/placement/?expanded=update-resource-provider-detail#update-resource-provider

    Therefore, without a later Placement change, we cannot fix the RPs
    already created with the wrong parent. However, we can fix the RPs
    that will be created from now on. We do that here. We also fix a bug
    in the unit tests that allowed the wrong parent to pass unnoticed.
    In addition, we add an extra log message to direct users who see the
    pollution in the logs to the proper bug report.

    There may be a follow-up patch later. Not all RP re-parenting
    operations are problematic, so we are thinking of relaxing this
    blanket prohibition in Placement. When Placement allows updates
    to the parent id, we can also fix the RPs already created with the
    wrong parent.

    Change-Id: I7caa8827d22103600ca685a58294640fc831dbd9
    Closes-Bug: #1921150
    Co-Authored-By: "Balazs Gibizer" <email address hidden>
    Related-Bug: #1853840
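
The intended RP tree described in the commit message (hypervisor RP as root, agent RP as its child, physical NIC RPs as children of the agent RP) can be illustrated with a small, self-contained sketch. This is not Neutron or Placement code: the `ResourceProvider` class, the `kind` labels, the example names, and the `has_expected_parent` helper are all hypothetical and exist only to show the difference between a correctly parented NIC RP and one attached directly to the hypervisor RP.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ResourceProvider:
    """Toy stand-in for a Placement resource provider (hypothetical)."""
    name: str
    kind: str  # "hypervisor", "agent" or "nic"; labels invented for this sketch
    parent: Optional["ResourceProvider"] = None


# Intended structure: hypervisor (root) -> agent -> physical NIC.
EXPECTED_PARENT_KIND = {
    "hypervisor": None,
    "agent": "hypervisor",
    "nic": "agent",
}


def has_expected_parent(rp: ResourceProvider) -> bool:
    """Return True if rp's parent is of the kind the intended tree requires."""
    expected = EXPECTED_PARENT_KIND[rp.kind]
    actual = rp.parent.kind if rp.parent is not None else None
    return actual == expected


hypervisor = ResourceProvider("compute0", "hypervisor")
agent = ResourceProvider("compute0:agent", "agent", parent=hypervisor)

# Correct: the NIC RP is a child of the agent RP.
good_nic = ResourceProvider("compute0:agent:eth0", "nic", parent=agent)

# The mistake described above: the NIC RP is a child of the hypervisor RP.
bad_nic = ResourceProvider("compute0:eth1", "nic", parent=hypervisor)
```

A check such as `has_expected_parent(bad_nic)` returns `False` here, which is the kind of assertion the corrected unit tests would need in order to catch a wrong parent.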