Lock wait timeout exceeded while updating status for floatingips
Bug #1330955 reported by Ihar Hrachyshka
This bug affects 4 people
Affects | Status | Importance | Assigned to | Milestone |
---|---|---|---|---|
neutron | Fix Released | Medium | Ihar Hrachyshka | |
Icehouse | Fix Released | Medium | Ihar Hrachyshka | |
Bug Description
Lock timeout occurred when updating floating IP.
2014-06-15 12:50:41.052 15781 TRACE neutron.
This was probably introduced in Icehouse with: https:/
More info at Red Hat bugzilla: https:/
tags: added: db
Changed in neutron:
importance: Undecided → Medium
Changed in neutron:
assignee: nobody → Salvatore Orlando (salvatore-orlando)
Changed in neutron:
milestone: none → juno-2
Changed in neutron:
status: Fix Committed → Fix Released
Changed in neutron:
milestone: juno-2 → 2014.2
One interesting thing about this bug is a potential situation of nested locking.
- delete_port will acquire a lock on a port resource
- this routine will call disassociate_floating_ips, which will probably acquire a lock on a floating IP resource
If the lock on the floating IP is held by some other thread (e.g. an update fip status operation), then everything should be fine as soon as that lock is released. However, we need to rule out that something like the following might happen:
- thread A: update floating IP X, update Port Y
- thread B: delete port Y associated with floating IP X
thread A acquires floating IP X lock
thread B (delete port) acquires port Y lock
thread A waits for Y lock held by thread B, thread B waits for X lock held by thread A: deadlock!!!
And since we don't have any deadlock detection or resolution mechanism, hell will ensue.
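To make the interleaving concrete, here is a minimal sketch of it. It assumes plain `threading.Lock` objects as stand-ins for the locks on floating IP X and port Y; it is not Neutron code, and the names `run_opposite_order`, `run_same_order`, `fip_x` and `port_y` are made up for illustration. The acquire timeout plays the role of the lock wait timeout, so the script reports the failure instead of hanging.

```python
import threading


def run_opposite_order(timeout=0.2):
    """Thread A locks X then Y, thread B locks Y then X (hypothetical stand-ins)."""
    fip_x = threading.Lock()    # stands in for the floating IP X lock
    port_y = threading.Lock()   # stands in for the port Y lock
    barrier = threading.Barrier(2)
    got_second = {}

    def worker(name, first, second):
        with first:
            barrier.wait()  # both threads now hold their first lock
            ok = second.acquire(timeout=timeout)
            got_second[name] = ok
            barrier.wait()  # keep the first lock until both attempts are over
            if ok:
                second.release()

    threads = [
        threading.Thread(target=worker, args=("A", fip_x, port_y)),
        threading.Thread(target=worker, args=("B", port_y, fip_x)),
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return got_second


def run_same_order():
    """Both threads take X first and Y second, so no wait cycle can form."""
    fip_x = threading.Lock()
    port_y = threading.Lock()
    got_second = {}

    def worker(name):
        with fip_x:
            with port_y:
                got_second[name] = True

    threads = [threading.Thread(target=worker, args=(n,)) for n in ("A", "B")]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return got_second


if __name__ == "__main__":
    print("opposite order:", run_opposite_order())
    print("same order:", run_same_order())
```

In the opposite-order run neither thread gets its second lock before the timeout, which is the wait cycle described above. In the same-order run both threads finish. Whether a fixed acquisition order is practical for the port and floating IP code paths is a separate question that this sketch does not answer.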
Bottom line is: let's go on with this pattern of doing resource-level locks, but let's not get carried away by it. Let's keep in mind this is a workaround for an eventlet issue, and not a 'final' solution.