Inconsistent state when connection to conductor is lost during live migration

Bug #1536589 reported by Radomir Dopieralski
This bug affects 1 person
Affects: OpenStack Compute (nova)
Status: Confirmed
Importance: Medium
Assigned to: Unassigned
Milestone: (none listed)

Bug Description

If the connection to the nova conductor service is lost during a live migration (for instance, because the rabbitmq server becomes unavailable), the migration status of the instance never gets updated. The instance ends up in the "migrating" state forever: the actual guest is already running on the new host, but the data in the nova database still points at the old host.

This happens in all versions at least up to Mitaka.

How to reproduce:
1. Create a simple setup with two hosts.
2. Create an instance and start a live migration.
3. Kill the rabbitmq server.
4. Wait for the migration to finish.
5. Bring the rabbitmq server back up.
6. Observe the instance stuck in "migrating" state, with everything migrated to the new host, but Nova thinking it's still on the old host.
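
The inconsistent state from step 6 can be described as a simple check. The following is only a minimal sketch under stated assumptions, not Nova code: `InstanceRecord` and `find_stuck_migrations` are hypothetical names, the record is a simplified stand-in for what the nova database stores, and the mapping of where each guest is really running is assumed to have been gathered separately (for example, by asking each hypervisor directly).

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class InstanceRecord:
    """Hypothetical, simplified view of one instance row in the nova database."""
    uuid: str
    host: str                  # host Nova believes the instance is on
    task_state: Optional[str]  # e.g. "migrating", or None when idle


def find_stuck_migrations(
    records: List[InstanceRecord],
    actual_hosts: Dict[str, str],
) -> List[str]:
    """Return UUIDs of instances whose record still says "migrating" on one
    host while the guest is actually running on a different host.

    actual_hosts maps instance UUID -> host where the guest was really found.
    """
    stuck = []
    for rec in records:
        actual = actual_hosts.get(rec.uuid)
        if actual is None:
            continue  # guest not found anywhere; a different problem
        if rec.task_state == "migrating" and actual != rec.host:
            stuck.append(rec.uuid)
    return stuck


if __name__ == "__main__":
    records = [
        InstanceRecord("inst-1", host="host-a", task_state="migrating"),
        InstanceRecord("inst-2", host="host-a", task_state=None),
        InstanceRecord("inst-3", host="host-b", task_state="migrating"),
    ]
    actual_hosts = {"inst-1": "host-b", "inst-2": "host-a", "inst-3": "host-b"}
    print(find_stuck_migrations(records, actual_hosts))  # ['inst-1']
```

In this sketch only `inst-1` is reported: its record says it is migrating on `host-a`, but the guest is already on `host-b`. An instance that is still marked "migrating" and still found on its recorded host is treated as a migration in progress, not as stuck.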

description: updated
Changed in nova:
assignee: nobody → Radomir Dopieralski (thesheep)
summary: - Inconsistent state if connection to conductor is lost during live
+ Inconsistent state when connection to conductor is lost during live
migration
jichenjc (jichenjc)
tags: added: live-migration
Pawel Koniszewski (pawel-koniszewski) wrote :

This looks very similar to https://bugs.launchpad.net/nova/+bug/1437154 and actually leads to the same issues. It doesn't matter whether you kill the live migration monitor or rabbitmq: you will end up with an instance running on the destination host without networking configured correctly, and with a mess on the source host.

I believe that https://review.openstack.org/#/c/225910/ should at least partially solve this problem. There is also a proposal to make compute stateful, which should solve the issue completely.

Sean Dague (sdague)
Changed in nova:
status: New → Confirmed
importance: Undecided → Medium
Sarafraj Singh (sarafraj-singh) wrote :

Radomir,
Are you working on the fix? If you are, please change the status to In Progress; otherwise, please set the assignee to nobody.

Takashi Natsume (natsume-takashi) wrote :

A bug report with an assignee should have 'In Progress' status,
so I am setting it to 'In Progress'.

Changed in nova:
status: Confirmed → In Progress
Sean Dague (sdague) wrote :

There are no currently open reviews on this bug, changing
the status back to the previous state and unassigning. If
there are active reviews related to this bug, please include
links in comments.

Changed in nova:
status: In Progress → Confirmed
assignee: Radomir Dopieralski (deshipu) → nobody
Sean Dague (sdague) wrote :

Automatically discovered version mitaka in description. If this is incorrect, please update the description to include 'nova version: ...'

tags: added: openstack-version.mitaka