resume_state_on_host_boot fails on instances in error state
Affects | Status | Importance | Assigned to | Milestone
---|---|---|---|---
nova (Ubuntu) | Expired | High | Unassigned |
Bug Description
After an unexpected host reboot, all the guests went away. I added
'--start_
up nova-compute. It started some instances but then died on:
2012-12-19 11:11:47 CRITICAL nova [-] Domain not found: no domain with matching name 'instance-000000bb'
2012-12-19 11:11:47 TRACE nova Traceback (most recent call last):
2012-12-19 11:11:47 TRACE nova File "/usr/bin/
2012-12-19 11:11:47 TRACE nova service.wait()
2012-12-19 11:11:47 TRACE nova File "/usr/lib/
2012-12-19 11:11:47 TRACE nova _launcher.wait()
2012-12-19 11:11:47 TRACE nova File "/usr/lib/
2012-12-19 11:11:47 TRACE nova service.wait()
2012-12-19 11:11:47 TRACE nova File "/usr/lib/
2012-12-19 11:11:47 TRACE nova return self._exit_
2012-12-19 11:11:47 TRACE nova File "/usr/lib/
2012-12-19 11:11:47 TRACE nova return hubs.get_
2012-12-19 11:11:47 TRACE nova File "/usr/lib/
2012-12-19 11:11:47 TRACE nova return self.greenlet.
2012-12-19 11:11:47 TRACE nova File "/usr/lib/
2012-12-19 11:11:47 TRACE nova result = function(*args, **kwargs)
2012-12-19 11:11:47 TRACE nova File "/usr/lib/
2012-12-19 11:11:47 TRACE nova server.start()
2012-12-19 11:11:47 TRACE nova File "/usr/lib/
2012-12-19 11:11:47 TRACE nova self.manager.
2012-12-19 11:11:47 TRACE nova File "/usr/lib/
2012-12-19 11:11:47 TRACE nova block_device_info)
2012-12-19 11:11:47 TRACE nova File "/usr/lib/
2012-12-19 11:11:47 TRACE nova return f(*args, **kw)
2012-12-19 11:11:47 TRACE nova File "/usr/lib/
2012-12-19 11:11:47 TRACE nova block_device_
2012-12-19 11:11:47 TRACE nova File "/usr/lib/
2012-12-19 11:11:47 TRACE nova virt_dom = self._conn.
2012-12-19 11:11:47 TRACE nova File "/usr/lib/
2012-12-19 11:11:47 TRACE nova if ret is None:raise libvirtError(
2012-12-19 11:11:47 TRACE nova libvirtError: Domain not found: no domain with matching name 'instance-000000bb'
2012-12-19 11:11:47 TRACE nova
This instance is in an error state:
RESERVATION r-n1d0t747 c519923c921a404
INSTANCE i-000000bb ami-000000bf server-187 server-187 error None (c519923c921a40
And no longer exists on alce. I couldn't find any reasonable way to
kill the instance entirely (ec2-terminate-
had no effect) or trivially remove it from the database. I ended up
modifying the nova libvirt driver to skip instances it can't find with
the attached patch.
(For the avoidance of doubt, I'm attaching the patch mostly to illustrate the problem and
our workaround, not necessarily for use as is in the packages or
upstream.)
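The attached patch itself is not reproduced in this report. As a rough illustration of the workaround described above (skip any instance the hypervisor no longer knows about instead of letting the exception kill nova-compute), here is a minimal, self-contained sketch. Everything in it is hypothetical: `DomainNotFound`, `FakeConnection` and `resume_instances` are stand-ins invented for the example and are not nova or libvirt names. With the real libvirt Python bindings, the analogous check would presumably catch `libvirt.libvirtError` and compare `get_error_code()` against `libvirt.VIR_ERR_NO_DOMAIN`.

```python
class DomainNotFound(Exception):
    """Hypothetical stand-in for libvirt's 'Domain not found' error."""


class FakeConnection:
    """Hypothetical stand-in for a hypervisor connection."""

    def __init__(self, domains):
        self._domains = set(domains)

    def lookup(self, name):
        # Mirrors the failure in the traceback: looking up a domain
        # that no longer exists raises instead of returning None.
        if name not in self._domains:
            raise DomainNotFound(
                "no domain with matching name %r" % name)
        return name


def resume_instances(conn, instance_names):
    """Resume what can be found; record and skip what cannot."""
    resumed, skipped = [], []
    for name in instance_names:
        try:
            conn.lookup(name)
        except DomainNotFound:
            # Skip the missing instance rather than aborting startup.
            skipped.append(name)
            continue
        resumed.append(name)
    return resumed, skipped


# Two domains survive on the host; instance-000000bb is gone.
conn = FakeConnection(["instance-000000ba", "instance-000000bc"])
resumed, skipped = resume_instances(
    conn,
    ["instance-000000ba", "instance-000000bb", "instance-000000bc"],
)
print("resumed:", resumed)
print("skipped:", skipped)
```

The point of the sketch is only that one missing domain should not prevent the remaining instances from being resumed; whether skipped instances should also be logged or cleaned up in the database is a separate question.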
This is all with current Ubuntu 12.04 packages (including
precise-proposed).
Changed in nova (Ubuntu):
status: New → Confirmed
importance: Undecided → High
Changed in nova (Ubuntu):
status: Confirmed → Incomplete
The attachment "Skip instances which can't be found in hard_reboot" of this bug report has been identified as being a patch. The ubuntu-reviewers team has been subscribed to the bug report so that they can review the patch. In the event that this is in fact not a patch, you can resolve this situation by removing the tag 'patch' from the bug report and editing the attachment so that it is not flagged as a patch. Additionally, if you are a member of the ubuntu-reviewers team, please also unsubscribe the team from this bug report.
[This is an automated message performed by a Launchpad user owned by Brian Murray. Please contact him regarding any issues with the action taken in this bug report.]