Cannot delete instance in ERROR status

Bug #1012551 reported by progresstudy
This bug affects 9 people
Affects                   Status        Importance  Assigned to  Milestone
OpenStack Compute (nova)  Fix Released  Medium      Unassigned
nova (Ubuntu)             Fix Released  Medium      Unassigned

Bug Description

Hi

I think there is a bug in nova.
The situation is:

The first time I tried to boot an instance, it reported a spawning error. The reason was that my compute node doesn't support KVM virtualization, so I switched to qemu instead, and of course it worked well after the change. But when I try to delete the instance that is in the error state, it cannot be deleted; it stays in "deleting".

I found the reason: libvirt cannot find the errored instance that nova created earlier. libvirt raised an error when it tried to create the instance, but the instance info was still stored in the db. So when we want to delete the errored instance, libvirt cannot find it, and so of course it cannot be destroyed.
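
The failure mode described above can be illustrated with a small sketch. It is not nova's code: `DomainNotFound`, `destroy_domain` and `delete_instance` are hypothetical names, the hypervisor is modeled as a plain set of domain names, and the database as a dict. It only shows the idea that a delete could treat "the hypervisor never had this domain" as nothing left to destroy, rather than as a fatal error.

```python
class DomainNotFound(Exception):
    """Hypothetical error: the hypervisor has no domain with this name."""


def destroy_domain(name, hypervisor_domains):
    """Stand-in for the hypervisor call; raises if the domain is unknown."""
    if name not in hypervisor_domains:
        raise DomainNotFound(name)
    hypervisor_domains.remove(name)


def delete_instance(name, hypervisor_domains, db_records):
    """Mark an instance deleted even if the hypervisor never defined it.

    hypervisor_domains: set of domain names (assumed stand-in for libvirt).
    db_records: dict of instance name -> state (assumed stand-in for the db).
    """
    try:
        destroy_domain(name, hypervisor_domains)
    except DomainNotFound:
        # The spawn failed before the domain existed, so there is
        # nothing to destroy; continue with the database cleanup.
        pass
    db_records[name] = "deleted"
    return db_records[name]
```

With this shape, an instance whose spawn failed ends up "deleted" in the record store instead of staying in "deleting" forever.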

I also found that the domain XML lives in two places.
One is /var/lib/nova/instances/instance-xxxxx/libvirt.xml, which I think is created by nova.
The other is /etc/libvirt/qemu/instance-xxxxx.xml, which I think is created by libvirt.

The errored instance's domain config XML exists only in /var/lib/nova/instances/.
An active instance's XML exists in both /var/lib/nova/instances/ and /etc/libvirt/qemu/.
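
A minimal sketch of how that observation could be checked: list the instance directories that contain a libvirt.xml written by nova but have no matching XML file on the libvirt side. The function name `find_undefined_instances` is made up for illustration, and the two directories are passed in as arguments because the exact paths may differ between installations.

```python
import os


def find_undefined_instances(nova_instances_dir, libvirt_qemu_dir):
    """Return instance names that nova wrote XML for but libvirt never defined.

    nova_instances_dir: e.g. /var/lib/nova/instances (assumed layout:
        one subdirectory per instance, each holding libvirt.xml).
    libvirt_qemu_dir: e.g. /etc/libvirt/qemu (assumed layout:
        one <instance-name>.xml file per defined domain).
    """
    undefined = []
    for entry in sorted(os.listdir(nova_instances_dir)):
        nova_xml = os.path.join(nova_instances_dir, entry, "libvirt.xml")
        libvirt_xml = os.path.join(libvirt_qemu_dir, entry + ".xml")
        # Present on the nova side, missing on the libvirt side.
        if os.path.isfile(nova_xml) and not os.path.isfile(libvirt_xml):
            undefined.append(entry)
    return undefined
```

Calling it as `find_undefined_instances("/var/lib/nova/instances", "/etc/libvirt/qemu")` would list candidates for the stuck state described here, assuming the process has read access to both directories.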

Finally, I had to change the db manually so that the errored instance's record looks terminated, and that worked.

My OpenStack version:
#nova-manage version
2012.1 (2012.1-LOCALBRANCH:LOCALREVISION)

Thank you

Tags: canonistack
Thierry Carrez (ttx)
summary: - cannot delete error instance
+ Cannot delete instance in ERROR status
Changed in nova:
importance: Undecided → Medium
status: New → Confirmed
John Tran (jtran) wrote :

I'm unable to reproduce the problem. I'm using Ubuntu 12.04 Precise amd64 with qemu and kvm installed and devstack on trunk. It supports both kvm and qemu, so I ran 'apt-get purge kvm' to remove kvm support. I then verified that nova.conf had 'libvirt_type=kvm' set and restarted nova-compute. A nova boot attempt left the instance in 'ERROR' in nova list, which I assume is the desired result. I then set 'libvirt_type=qemu' in nova.conf and restarted nova-compute. After that I was able to nova delete the VM as well as nova boot another VM successfully. So I was unable to reproduce the problem.

winter (mcweels) wrote :

I can confirm this error. It has nothing to do with KVM vs. qemu. It's as the OP suggested.

Mukul Jain (mukul-j) wrote :

I am encountering the same error: I cannot delete a VM stuck in the error state. I suspect it has something to do with the KVM/qemu switch. I am running on a VM and was previously configured to use KVM. I tried to create instances and they failed with an error message. I have now changed to qemu (since I am running on a VM), and I am unable to delete some of the old instances, which I believe were created while I was configured to use KVM.

Another issue I am facing on this system is that as soon as I issue the boot command, it causes something to crash. Later the IP address of my eth0 interface changed from an IPv4 to an IPv6 address, and my terminal connections were terminated due to the network connectivity issue.

Doug Goldstein (cardoe) wrote :

You can reproduce this problem very quickly on a freshly installed Ubuntu 12.04 system with Folsom by following the user manual line for line, because of bug #1064749.

Doug Goldstein (cardoe) wrote :

Here's the debug info from Folsom.

$ nova --debug delete 3935bacf-63c5-4b19-9375-f067d75d56e6

REQ: curl -i http://192.168.200.135:35357/v2.0/tokens -X POST -H "Content-Type: application/json" -H "Accept: application/json" -H "User-Agent: python-novaclient" -d '{"auth": {"tenantName": "dev", "passwordCredentials": {"username": "admin", "password": "dev"}}}'

connect: (192.168.200.135, 35357)
send: 'POST /v2.0/tokens HTTP/1.1\r\nHost: 192.168.200.135:35357\r\nContent-Length: 102\r\ncontent-type: application/json\r\naccept-encoding: gzip, deflate\r\naccept: application/json\r\nuser-agent: python-novaclient\r\n\r\n{"auth": {"tenantName": "dev", "passwordCredentials": {"username": "admin", "password": "dev"}}}'
reply: 'HTTP/1.1 200 OK\r\n'
header: Vary: X-Auth-Token
header: Content-Type: application/json
header: Date: Tue, 09 Oct 2012 23:34:02 GMT
header: Transfer-Encoding: chunked
RESP:{'date': 'Tue, 09 Oct 2012 23:34:02 GMT', 'transfer-encoding': 'chunked', 'status': '200', 'content-type': 'application/json', 'vary': 'X-Auth-Token'} {"access": {"token": {"expires": "2012-10-10T23:34:01Z", "id": "9545eac2bd254984805be5c6f65a943a", "tenant": {"enabled": true, "description": "EA development", "name": "dev", "id": "5aa21f92483b441f89b844421b639f9c"}}, "serviceCatalog": [{"endpoints": [{"adminURL": "http://192.168.200.135:8774/v2/5aa21f92483b441f89b844421b639f9c", "region": "Developers", "internalURL": "http://192.168.200.135:8774/v2/5aa21f92483b441f89b844421b639f9c", "id": "da16351759e0471c928e313ecf1541cd", "publicURL": "http://192.168.200.135:8774/v2/5aa21f92483b441f89b844421b639f9c"}], "endpoints_links": [], "type": "compute", "name": "nova"}, {"endpoints": [{"adminURL": "http://192.168.200.135:9292/v1", "region": "Developers", "internalURL": "http://192.168.200.135:9292/v1", "id": "618b8e6304c1460fa0c71db565975420", "publicURL": "http://192.168.200.135:9292/v1"}], "endpoints_links": [], "type": "image", "name": "glance"}, {"endpoints": [{"adminURL": "http://192.168.200.135:8776/v1/5aa21f92483b441f89b844421b639f9c", "region": "Developers", "internalURL": "http://192.168.200.135:8776/v1/5aa21f92483b441f89b844421b639f9c", "id": "1889f0b8fefa4049852427e379b2e189", "publicURL": "http://192.168.200.135:8776/v1/5aa21f92483b441f89b844421b639f9c"}], "endpoints_links": [], "type": "volume", "name": "volume"}, {"endpoints": [{"adminURL": "http://192.168.200.135:8773/services/Admin", "region": "Developers", "internalURL": "http://192.168.200.135:8773/services/Cloud", "id": "171b68c50bbf4077869aa36a1daeda6b", "publicURL": "http://192.168.200.135:8773/services/Cloud"}], "endpoints_links": [], "type": "ec2", "name": "ec2"}, {"endpoints": [{"adminURL": "http://192.168.200.135:8888/v1", "region": "Developers", "internalURL": "http://192.168.200.135:8888/v1/AUTH_5aa21f92483b441f89b844421b639f9c", "id": "4e74115c6d004c9e8e70c0624a1b498a", "publicURL": 
"http://192.168.200.135:8888/v1/AUTH_5aa21f92483b441f89b844421b639f9c"}], "endpoints_links": [], "type": "object-store", "name": "swift"}, {"endpoints": [{"adminURL": "http://192.168.200.135:35357/v2.0", "region": "Developers", "internalURL": "http://192.168.200.135:5000/v2.0", "id": "f26bb12c650747efb3e3838b6da0bb...


Doug Goldstein (cardoe) wrote :

As pointed out to me by ev0ldave in #openstack the way to fix this is to figure out exactly what the instance ID is.

$ mysql -u root -p
mysql> USE nova;
mysql> SELECT id, host, hostname FROM instances;
...
mysql> UPDATE instances SET deleted=1, deleted_at="2012-10-10 14:56:30", vm_state="deleted" WHERE id = 240;
...
mysql> quit;

$ sudo rm -rf /var/lib/nova/instances/instance-00000240

Solves the issue assuming your bad instance is ID 240. Rinse and repeat for each instance.
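
The manual UPDATE above could be wrapped in a small helper. The sketch below is an illustration under assumptions, not a supported nova tool: it uses Python's built-in sqlite3 module as a stand-in for the MySQL database, it assumes only the `instances` columns that appear in the statement above (`id`, `deleted`, `deleted_at`, `vm_state`), and the `AND deleted = 0` guard is an addition so that running it twice changes nothing. Against a real MySQL server the driver and the placeholder style would differ (MySQL drivers typically use `%s` rather than `?`).

```python
import sqlite3
from datetime import datetime, timezone


def mark_instance_deleted(conn, instance_id, now=None):
    """Soft-delete one row of the instances table; return rows changed.

    conn: an open DB-API connection (sqlite3 here, as a stand-in).
    instance_id: integer primary key of the stuck instance.
    now: optional datetime used for deleted_at; defaults to current UTC.
    """
    now = now or datetime.now(timezone.utc)
    deleted_at = now.strftime("%Y-%m-%d %H:%M:%S")
    cur = conn.execute(
        "UPDATE instances "
        "SET deleted = 1, deleted_at = ?, vm_state = 'deleted' "
        "WHERE id = ? AND deleted = 0",  # guard: skip rows already deleted
        (deleted_at, instance_id),
    )
    conn.commit()
    return cur.rowcount
```

A return value of 0 means no row matched, either because the ID is wrong or because the row was already marked deleted. The instance directory under /var/lib/nova/instances still has to be removed separately, as in the `rm -rf` step above.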

It'd be nice to fix this programmatically.

James Page (james-page)
Changed in nova (Ubuntu):
status: New → Triaged
importance: Undecided → Medium
James Troup (elmo)
tags: added: canonistack
Chuck Short (zulcss) wrote :

I am not able to reproduce this at all on Grizzly.

Changed in nova:
status: Confirmed → Fix Committed
Changed in nova (Ubuntu):
status: Triaged → Fix Released
Thierry Carrez (ttx)
Changed in nova:
status: Fix Committed → Fix Released
Abdi Ibrahim (abdi-w) wrote :

It's also happening in the Grizzly release (nova):
root@control01:/home/user1# nova delete vm1-acc68897-1155-45c8-b8e8-37e27a9899bd
HTTPConnectionPool(host='192.168.220.40', port=8774): Max retries exceeded with url: /v2/e54c7f63e9284138ad207559ca4a40e4/servers/acc68897-1155-45c8-b8e8-37e27a9899bd

and the VM is stuck in the error state or just stays in the "Deleting" state.

Where is this fixed? In the nova-client code or in Horizon?

Thank you,
Abdi
