Failed to recover stopped instance
Affects | Status | Importance | Assigned to | Milestone
---|---|---|---|---
Ubuntu Cloud Archive | Fix Released | Undecided | Unassigned |
Stein | Fix Committed | High | Unassigned |
Train | Fix Released | High | Unassigned |
Ussuri | Fix Released | High | Unassigned |
Victoria | Fix Released | High | Unassigned |
Wallaby | Fix Released | Undecided | Unassigned |
masakari | Fix Released | Undecided | takahara.kengo |
Train | Fix Released | Undecided | Unassigned |
Ussuri | Fix Released | Undecided | Unassigned |
Victoria | Fix Released | Undecided | Unassigned |
Wallaby | Fix Released | Undecided | takahara.kengo |
masakari (Ubuntu) | Fix Released | Undecided | Unassigned |
Focal | Fix Released | High | Unassigned |
Groovy | Fix Released | High | Unassigned |
Hirsute | Fix Released | Undecided | Unassigned |
Bug Description
[Error]
Host-failure recovery failed when the failed host had an instance in the stopped state.
As a result, the notification status became "failed".
(The instance's vm_state after evacuation was "stopped".)
I used the latest version of masakari.
[Cause of error]
Masakari tries to call the stop API after evacuating.
However, the evacuate API already stops the instance at the end if its original vm_state is stopped.
So a 409 error occurred when masakari called the stop API after evacuating.
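
The sequence above can be reproduced with a minimal sketch. Everything in it is a hypothetical stand-in: `FakeCompute`, `Conflict`, `recover_naive`, and `recover_guarded` are illustration names, not masakari or nova code, and the guard shown is only one plausible way to avoid the conflict, not necessarily the change that was merged.

```python
class Conflict(Exception):
    """Stand-in for an HTTP 409 response from the compute API."""


class FakeCompute:
    """Hypothetical in-memory compute API mimicking the behaviour described above."""

    def __init__(self, vm_state):
        self.vm_state = vm_state

    def evacuate(self):
        # Assumption taken from the report: evacuation restores the original
        # state, so a stopped instance is already stopped when it returns.
        pass

    def stop(self):
        if self.vm_state == "stopped":
            raise Conflict("409: instance is already in vm_state stopped")
        self.vm_state = "stopped"


def recover_naive(compute):
    """Always call stop after evacuating an originally stopped instance."""
    original = compute.vm_state
    compute.evacuate()
    if original == "stopped":
        compute.stop()  # raises Conflict, since evacuate already stopped it


def recover_guarded(compute):
    """Call stop only if the instance is not yet back in the stopped state."""
    original = compute.vm_state
    compute.evacuate()
    if original == "stopped" and compute.vm_state != "stopped":
        compute.stop()
```

With a stopped instance, `recover_naive` raises `Conflict`, which corresponds to the failed notification, while `recover_guarded` returns normally and leaves the instance stopped.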
== Ubuntu SRU Details below ==
[Impact]
See above
[Test Case]
For focal:
Test with an actual Juju-deployed Masakari OpenStack deployment and confirm that the reported bug is fixed on host failure.
For all other releases the fix can be verified with an LXD container for the corresponding release:
$ sudo apt install python3-masakari
$ cd /usr/lib/
$ python3 -m unittest masakari.
The unit test will be successful on a patched deployment and will fail with a mismatch error in test_host_
[Where problems could occur]
Any regressions in this fix will likely result in failures similar to the one reported in this bug, that is, a failure to recover an instance on host failure. The patch is a small, targeted change with a good unit test, and the code is unchanged across the backports, which helps mitigate regression potential.
description: | updated |
Changed in masakari: | |
assignee: | nobody → takahara.kengo (takahara.kengo) |
Changed in masakari: | |
status: | In Progress → Fix Committed |
Changed in masakari (Ubuntu Hirsute): | |
status: | New → Fix Released |
description: | updated |
Changed in masakari (Ubuntu Groovy): | |
importance: | Undecided → High |
status: | New → Triaged |
Changed in masakari (Ubuntu Focal): | |
importance: | Undecided → High |
status: | New → Triaged |
description: | updated |
Posted a patch.
https://review.openstack.org/585625