After change Host's Availability Zone, the deployed VMs Availability Zone data(in novaapi.request_specs table) can not be updated

Bug #1907775 reported by wanghaojue
This bug affects 1 person
Affects: OpenStack Compute (nova)
Status: Fix Released
Importance: Medium
Assigned to: Stephen Finucane

Bug Description

Description
===========

After a host's Availability Zone is changed, the Availability Zone data of the VMs already deployed on it (in the request_specs table) is not updated.

Steps to reproduce
==================
1. Create a host aggregate named block_az, add two hosts to it, and set its availability_zone to block_az.
2. Deploy a VM named rhel82-official-block-az.
3. Remove the 2 hosts from the block_az host aggregate; they are now in the "Default_Group" AZ.
4. Run openstack server show rhel82-official-block-az; the reported availability_zone is correct:
+-------------------------------------+--------------------------------------------------------------------------------------------------------------------------------------+
| Field | Value |
+-------------------------------------+--------------------------------------------------------------------------------------------------------------------------------------+
| OS-DCF:diskConfig | MANUAL |
| OS-EXT-AZ:availability_zone | Default_Group |
| OS-EXT-SRV-ATTR:host | kvmperf2.icic.boe |
| OS-EXT-SRV-ATTR:hostname | rhel82-official-block-az |
| OS-EXT-SRV-ATTR:hypervisor_hostname | kvmperf2.icic.boe

5. In the novaapi.request_specs table, the AZ is still the original "block_az", which is incorrect.

MariaDB [novaapi]> select * from request_specs where instance_uuid='5f4c292b-30b4-4bce-b204-5304306248cb';

 "availability_zone": "block_az", "flavor": .....
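
The mismatch in step 5 can also be checked programmatically. The sketch below assumes the spec column of request_specs holds a JSON-serialized object whose fields sit under a "nova_object.data" key; the helper names and the sample row are hypothetical and only illustrate comparing the stored AZ with the one reported by the API.

```python
import json


def spec_availability_zone(spec_json):
    """Return the AZ recorded in a serialized request spec, or None."""
    spec = json.loads(spec_json)
    # Assumption: fields live under "nova_object.data"; fall back to the
    # top level if the row is a plain dict.
    data = spec.get("nova_object.data", spec)
    return data.get("availability_zone")


def az_is_stale(spec_json, current_az):
    """True if the spec pins an AZ that differs from the host's current AZ."""
    requested = spec_availability_zone(spec_json)
    # Assumption: a spec without an AZ means none was requested.
    return requested is not None and requested != current_az


# Hypothetical row mirroring the values in this report.
sample_spec = json.dumps({
    "nova_object.name": "RequestSpec",
    "nova_object.data": {"availability_zone": "block_az"},
})
print(az_is_stale(sample_spec, "Default_Group"))
```

With the values from this report (spec AZ block_az, API AZ Default_Group) the check flags the instance as stale.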

Expected result
===============
 "availability_zone" in the novaapi.request_specs table is updated when the AZ of the VM's host is changed.

Actual result
=============
 "availability_zone" in the novaapi.request_specs table is not updated when the AZ of the VM's host is changed.

Environment
===========
Ussuri release

wanghaojue (wghaojue)
summary: After change Host's Availability Zone, the deployed VMs Availability
- Zone data(in request_specs table) can not be updated
+ Zone data(in novaapi.request_specs table) can not be updated
Stephen Finucane (stephenfinucane) wrote :

This isn't something you should do, but we should prevent you doing so in the API. In general, making changes to the aggregate a host belongs to should not be done while there are instances on the host. It's for a similar reason that we prevent you making changes to the AZ information of an aggregate when there are instances in any of the aggregate's hosts.

Changed in nova:
status: New → Confirmed
importance: Undecided → Medium
tags: added: availability-zones scheduler
tags: removed: scheduler
Changed in nova:
assignee: nobody → Stephen Finucane (stephenfinucane)
Stephen Finucane (stephenfinucane) wrote :
wanghaojue (wghaojue) wrote :

Stephen,

Thank you for the prompt reply and solution! I noticed that two test cases were added. Does that mean the fix will be available in the next several days? Thanks!

Stephen Finucane (stephenfinucane) wrote :

There's no ETA on a fix, I'm afraid. The test case I've proposed proves the issue but does not attempt a fix yet. If you have time and ideas on how to fix this, I'd be happy to review them. If not, hopefully I'll get to it eventually.

wanghaojue (wghaojue) wrote :

Stephen,

Thanks for the clarification! I didn't get around to it over the last several days, but I tried it today and already have a fix. I'll submit it for your review. Thanks!

wanghaojue (wghaojue) wrote :

Stephen,
Please help review it, thanks!
https://review.opendev.org/c/openstack/nova/+/768141

Brin Zhang (zhangbailin) wrote :

I think we should update the existing server's availability zone when its host is removed from the host aggregate, because this is a common use case.
Users may need to move a host aggregate's hosts to another host aggregate, which puts them in a new AZ. IMO, if we only restrict the removal of hosts with VMs, this will not solve the problem fundamentally. From another perspective, it will also cause great inconvenience to users.

OpenStack Infra (hudson-openstack) wrote : Fix proposed to nova (master)

Fix proposed to branch: master
Review: https://review.opendev.org/c/openstack/nova/+/821423

Changed in nova:
status: Confirmed → In Progress
wanghaojue (wghaojue) wrote :

Hello Brin,

I have not checked this issue for a long time... https://review.opendev.org/c/openstack/nova/+/821423 doesn't change the VM's AZ to the new one, right? It still keeps Stephen's proposed behavior of restricting the removal of hosts with VMs, right?

OpenStack Infra (hudson-openstack) wrote : Related fix merged to nova (master)

Reviewed: https://review.opendev.org/c/openstack/nova/+/766771
Committed: https://opendev.org/openstack/nova/commit/5bb6d4c18803fc2e70ce3e45459dd0f900e863cd
Submitter: "Zuul (22348)"
Branch: master

commit 5bb6d4c18803fc2e70ce3e45459dd0f900e863cd
Author: Stephen Finucane <email address hidden>
Date: Fri Dec 11 17:53:22 2020 +0000

    functional: Add reproducer for #1907775

    You can currently remove a host that has instances scheduled to it from
    an aggregate. If the aggregate is configured as part of an availability
    zone (AZ), this would in turn remove the host from the AZ, leaving
    instances originally scheduled to that AZ stranded on a host that is no
    longer a member of the AZ. This is clearly undesirable and should be
    blocked at the API level.

    You can also add a host to an aggregate where it wasn't in one before.
    Because nova provides a default AZ for hosts that don't belong to an
    aggregate, adding a host to an aggregate doesn't just assign it to an
    AZ, it removes it from the default 'nova' one (or whatever you've
    configured via '[DEFAULT] default_availability_zone'). As noted in the
    docs [1], people should not rely on scheduling to the default AZ, but if
    they had, we'd end up in the same situation as above.

    Add tests for both, with a fix coming after.

    [1] https://docs.openstack.org/nova/latest/admin/availability-zones.html

    Change-Id: I21f7f93ee0ec0cd3a290afba59342b31d074cf2f
    Signed-off-by: Stephen Finucane <email address hidden>
    Related-Bug: #1907775
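
The two scenarios in the commit message above can be illustrated with a simplified model of how a host's AZ follows from aggregate membership. This is a sketch, not nova's code: the function name and the dict layout are hypothetical, and it assumes a host reports the AZ of the first aggregate that carries availability_zone metadata, falling back to the configured default otherwise.

```python
DEFAULT_AZ = "nova"  # assumed value of [DEFAULT] default_availability_zone


def host_availability_zone(host, aggregates, default_az=DEFAULT_AZ):
    """Return the AZ a host would report under this simplified model."""
    for agg in aggregates:
        az = agg.get("metadata", {}).get("availability_zone")
        if host in agg.get("hosts", []) and az:
            return az
    # No aggregate with AZ metadata contains the host: use the default AZ.
    return default_az


# Hypothetical aggregate mirroring the one in this report.
aggregates = [{
    "name": "block_az",
    "hosts": ["kvmperf2.icic.boe"],
    "metadata": {"availability_zone": "block_az"},
}]

before = host_availability_zone("kvmperf2.icic.boe", aggregates, "Default_Group")
aggregates[0]["hosts"].remove("kvmperf2.icic.boe")
after = host_availability_zone("kvmperf2.icic.boe", aggregates, "Default_Group")
print(before, "->", after)
```

Removing the host from the aggregate silently moves it to the default AZ, while any instance scheduled with the old AZ keeps that AZ in its request spec.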

wangwei (ww2000e1) wrote :

What is the situation now? I see that the two solutions are not merged?

OpenStack Infra (hudson-openstack) wrote : Change abandoned on nova (master)

Change abandoned by "sean mooney <email address hidden>" on branch: master
Review: https://review.opendev.org/c/openstack/nova/+/768141
Reason: closing this in favor of https://review.opendev.org/c/openstack/nova/+/821423

OpenStack Infra (hudson-openstack) wrote :

Change abandoned by "sean mooney <email address hidden>" on branch: master
Review: https://review.opendev.org/c/openstack/nova/+/768802
Reason: closing in favor of https://review.opendev.org/c/openstack/nova/+/821423

OpenStack Infra (hudson-openstack) wrote : Fix merged to nova (master)

Reviewed: https://review.opendev.org/c/openstack/nova/+/821423
Committed: https://opendev.org/openstack/nova/commit/3c0eadae0b9ec48586087ea6c0c4e9176f0aa3bc
Submitter: "Zuul (22348)"
Branch: master

commit 3c0eadae0b9ec48586087ea6c0c4e9176f0aa3bc
Author: Balazs Gibizer <email address hidden>
Date: Fri Dec 10 17:53:30 2021 +0100

    Reject AZ changes during aggregate add / remove host

    After this patch nova rejects the add host to aggregate API action
    if the host has instances and the new aggregate for the host would
    mean that these instances need to move from one AZ (even from the
    default one) to another. Such AZ change is not implemented in nova
    and currently leads to stuck instances.

    Similarly nova will reject remove host from aggregate API action if the
    host has instances and the aggregate removal would mean that the
    instances need to change AZ.

    Depends-On: https://review.opendev.org/c/openstack/tempest/+/821732

    Change-Id: I19c4c6d34aa2cc1f32d81e8c1a52762fa3a18580
    Closes-Bug: #1907775
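
The rule described in this commit message can be sketched as a small guard. The code below is illustrative only and is not taken from the patch: the exception class, the function name, and its parameters are hypothetical, and the caller is assumed to have already computed the host's AZ before and after the aggregate change.

```python
class AZChangeRejected(Exception):
    """Hypothetical stand-in for the error the API would return."""


def validate_aggregate_host_change(host, instance_count, az_before, az_after):
    """Raise if the aggregate change would move instances to another AZ."""
    if instance_count > 0 and az_before != az_after:
        raise AZChangeRejected(
            f"host {host} has {instance_count} instance(s); "
            f"AZ would change from {az_before} to {az_after}"
        )


# Empty host: changing its AZ is allowed.
validate_aggregate_host_change("kvmperf2.icic.boe", 0, "block_az", "nova")

# Host with instances: the same change is rejected.
try:
    validate_aggregate_host_change("kvmperf2.icic.boe", 1, "block_az", "nova")
except AZChangeRejected as exc:
    print(f"rejected: {exc}")
```

The same guard covers both add and remove operations, since each reduces to comparing the host's AZ before and after the change.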

Changed in nova:
status: In Progress → Fix Released
OpenStack Infra (hudson-openstack) wrote : Fix proposed to nova (stable/2024.1)

Fix proposed to branch: stable/2024.1
Review: https://review.opendev.org/c/openstack/nova/+/918650

OpenStack Infra (hudson-openstack) wrote : Fix merged to nova (stable/2024.1)

Reviewed: https://review.opendev.org/c/openstack/nova/+/918650
Committed: https://opendev.org/openstack/nova/commit/994358d582e8e8d98a3e8e91ceb77ae354c1e275
Submitter: "Zuul (22348)"
Branch: stable/2024.1

commit 994358d582e8e8d98a3e8e91ceb77ae354c1e275
Author: Balazs Gibizer <email address hidden>
Date: Fri Dec 10 17:53:30 2021 +0100

    Reject AZ changes during aggregate add / remove host

    After this patch nova rejects the add host to aggregate API action
    if the host has instances and the new aggregate for the host would
    mean that these instances need to move from one AZ (even from the
    default one) to another. Such AZ change is not implemented in nova
    and currently leads to stuck instances.

    Similarly nova will reject remove host from aggregate API action if the
    host has instances and the aggregate removal would mean that the
    instances need to change AZ.

    Depends-On: https://review.opendev.org/c/openstack/tempest/+/821732

    Change-Id: I19c4c6d34aa2cc1f32d81e8c1a52762fa3a18580
    Closes-Bug: #1907775
    (cherry picked from commit 3c0eadae0b9ec48586087ea6c0c4e9176f0aa3bc)
