Comment 7 for bug 1897424

Lee Yarwood (lyarwood) wrote:

Right, apologies, I forgot that reserve_block_device_name also takes an instance.uuid lock on the compute [1]. So we can end up in a situation where an initial request to attach a volume holds the lock for the duration of attach_volume [2] while a second request waits on the same instance.uuid lock in reserve_block_device_name, allowing the RPC call to time out back on n-api.
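
To make the contention concrete, here's a minimal sketch of the pattern; this is not the actual manager code, I'm calling oslo_concurrency's lockutils directly and using a sleep to stand in for the slow attach:

    import threading
    import time

    from oslo_concurrency import lockutils

    def attach_volume(instance_uuid):
        # Request 1: holds the per-instance lock for the entire attach,
        # which can easily take longer than a minute against a slow backend.
        @lockutils.synchronized(instance_uuid)
        def do_attach():
            time.sleep(120)  # stand-in for the Cinder + hypervisor work
        do_attach()

    def reserve_block_device_name(instance_uuid):
        # Request 2: the RPC *call* from n-api queues on the same lock.
        @lockutils.synchronized(instance_uuid)
        def do_reserve():
            return '/dev/vdb'
        return do_reserve()

    uuid = 'fake-instance-uuid'
    threading.Thread(target=attach_volume, args=(uuid,)).start()
    time.sleep(1)
    # This blocks for ~2 minutes, but oslo.messaging only waits
    # rpc_response_timeout (60s by default) for the reply, so n-api sees a
    # MessagingTimeout instead.
    print(reserve_block_device_name(uuid))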

An instance task state does sound like a good idea to avoid this, as we can't move the instance.uuid lock over to n-api to cover both the reserve and the attach: that would delay the actual API response for far too long IMHO.
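
As a very rough sketch of what that guard in n-api could look like (VOLUME_ATTACHING is a made-up task state and the Instance class below is just a stand-in for nova.objects.Instance):

    from dataclasses import dataclass

    VOLUME_ATTACHING = 'volume_attaching'  # hypothetical, not an existing task state

    class InstanceInvalidState(Exception):
        pass

    @dataclass
    class Instance:                 # stand-in for nova.objects.Instance
        uuid: str
        task_state: str = None

        def save(self):
            # Real Nova would save with expected_task_state=[None] so two
            # racing API workers can't both claim the instance.
            pass

    def attach_volume_api(instance, volume_id):
        if instance.task_state is not None:
            # A second attach fails fast (e.g. a 409) instead of queuing
            # behind the instance.uuid lock on the compute until the RPC
            # call times out.
            raise InstanceInvalidState(
                'instance %s is busy (task_state=%s)'
                % (instance.uuid, instance.task_state))
        instance.task_state = VOLUME_ATTACHING
        instance.save()
        # ... cast to the compute; the compute manager would clear
        # task_state once the attach completes or fails.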

An alternative would be to drop the RPC call to reserve_block_device_name entirely, as the returned device name isn't guaranteed to be the device that actually ends up being presented within the guest OS by most hypervisors (namely libvirt) anyway. We could instead create the BDM in the API and update the device name once the volume is attached on the compute.
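
A rough sketch of that second option; none of this is existing Nova code, the helpers and signatures are purely illustrative:

    # Illustrative only: create the BDM in the API with no device name and
    # fill it in on the compute once the hypervisor reports where the
    # volume actually landed.

    # n-api side: no reserve_block_device_name RPC call at all.
    def attach_volume_api(bdm_cls, compute_rpcapi, context, instance, volume_id):
        bdm = bdm_cls(context, instance_uuid=instance.uuid,
                      volume_id=volume_id,
                      device_name=None)          # unknown until attached
        bdm.create()
        compute_rpcapi.attach_volume(context, instance, bdm)  # async cast
        return bdm

    # compute side: update the BDM with the device the driver actually used.
    def attach_volume_compute(driver, context, instance, bdm):
        device_name = driver.attach_volume(context, instance, bdm)
        bdm.device_name = device_name   # e.g. '/dev/vdb' as seen by libvirt
        bdm.save()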

IMHO either approach is going to require a spec to hammer out the details.

[1] https://github.com/openstack/nova/blob/261de76104ca67bed3ea6cdbcaaab0e44030f1e2/nova/compute/manager.py#L6946
[2] https://github.com/openstack/nova/blob/261de76104ca67bed3ea6cdbcaaab0e44030f1e2/nova/compute/manager.py#L6978