Some investigation in IRC:

(2:26:59 PM) mriedem: here is when update_available_resource starts:
(2:26:59 PM) mriedem: 2014-09-19 10:23:00.539 21095 DEBUG nova.openstack.common.periodic_task [-] Running periodic task ComputeManager.update_available_resource run_periodic_tasks /opt/stack/new/nova/nova/openstack/common/periodic_task.py:193
(2:27:05 PM) mriedem: 2014-09-19 10:23:00.540 21095 DEBUG nova.openstack.common.lockutils [-] Acquired semaphore "compute_resources" lock /opt/stack/new/nova/nova/openstack/common/lockutils.py:229
(2:27:05 PM) alaski: I'm looking about three lines up
(2:27:26 PM) mriedem: then we fail and release that lock
(2:27:26 PM) alaski: 2014-09-19 10:23:07.991 DEBUG nova.openstack.common.lockutils [req-5a467708-5da4-446b-b729-c44d005f84c3 DeleteServersTestJSON-14297441 DeleteServersTestJSON-1574463718] Using existing semaphore "compute_resources" internal_lock /opt/stack/new/nova/nova/openstack/common/lockutils.py:202
(2:27:26 PM) mriedem: 2014-09-19 10:24:31.404 21095 DEBUG nova.openstack.common.lockutils [-] Semaphore / lock released "update_available_resource" inner /opt/stack/new/nova/nova/openstack/common/lockutils.py:275
(2:27:56 PM) mriedem: 10:23:00 to 10:24:31
(2:27:56 PM) alaski: 2014-09-19 10:24:31.623 DEBUG nova.openstack.common.lockutils [req-5a467708-5da4-446b-b729-c44d005f84c3 DeleteServersTestJSON-14297441 DeleteServersTestJSON-1574463718] Acquired semaphore "compute_resources" lock /opt/stack/new/nova/nova/openstack/common/lockutils.py:229
(2:27:56 PM) alaski: req-5a467708-5da4-446b-b729-c44d005f84c3 is what's timing out
(2:28:07 PM) mriedem: yeah so that's your 90 seconds
(2:28:13 PM) mriedem: where we have a 60 second messaging timeout in between
(2:29:16 PM) alaski: yep
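The "90 seconds" figure can be checked directly from the two log timestamps quoted above (when the periodic task acquired "compute_resources" at 10:23:00.540, and when it released at 10:24:31.404). A quick standard-library sketch:

```python
from datetime import datetime

FMT = "%Y-%m-%d %H:%M:%S.%f"

# Timestamps copied from the log lines above.
acquired = datetime.strptime("2014-09-19 10:23:00.540", FMT)
released = datetime.strptime("2014-09-19 10:24:31.404", FMT)

hold = (released - acquired).total_seconds()
print(hold)  # 90.864 -- well past a 60-second messaging timeout
```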
(2:30:40 PM) alaski: it seems that update_available_resources needs to be reworked so that it doesn't hold the lock too long
(2:30:49 PM) mriedem: yeah