Timeout trying to delete overcloud stack in CI
Affects | Status | Importance | Assigned to | Milestone
---|---|---|---|---
tripleo | Expired | Undecided | Unassigned |
Bug Description
Seen here:
http://
2016-01-14 14:23:58.898 | #################
2016-01-14 14:23:58.898 | tripleo.sh -- Overcloud delete
2016-01-14 14:23:58.898 | #################
2016-01-14 14:24:02.016 | +------
2016-01-14 14:24:02.016 | | id | stack_name | stack_status | creation_time | updated_time |
2016-01-14 14:24:02.017 | +------
2016-01-14 14:24:02.017 | | 9a8a8a20-
2016-01-14 14:24:02.017 | +------
2016-01-14 14:29:19.168 | #################
2016-01-14 14:29:19.168 | tripleo.sh -- Overcloud 9a8a8a20-
2016-01-14 14:29:19.168 | #################
2016-01-14 14:29:19.168 | /tmp/tripleo.sh: line 447: Timing: command not found
The "Timing: command not found" error is odd: it does not come from tripleo.sh itself, and we can't see where it actually originated (after the delete timeout, tripleo.sh just runs a heat stack-show).
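For illustration only, the wait-then-inspect pattern described above can be sketched as a poll loop with a deadline. This is not the tripleo.sh implementation: `wait_for_delete` and the `get_status` callable are hypothetical names, and the default interval is an arbitrary assumption.

```python
import time


def wait_for_delete(get_status, timeout, interval=10,
                    clock=time.monotonic, sleep=time.sleep):
    """Poll until the stack is gone or the timeout elapses.

    get_status is a hypothetical callable (an assumption, not a heat API)
    that returns the stack status string, or None once the stack no
    longer exists. Returns True if the stack was deleted in time and
    False if the deadline passed first.
    """
    deadline = clock() + timeout
    while True:
        status = get_status()
        if status is None:
            return True  # stack no longer exists
        if status == "DELETE_FAILED":
            raise RuntimeError("stack delete failed")
        if clock() >= deadline:
            return False  # still DELETE_IN_PROGRESS at the deadline
        sleep(interval)
```

A False return corresponds to the situation in this report: the caller gave up while the stack was still in DELETE_IN_PROGRESS, even though the delete later completed.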
Looking later in the logs we see:
2016-01-14 14:29:36.653 | | stack_status | DELETE_IN_PROGRESS
And the events show:
2016-01-14 14:31:44.207 | | Controller | c54bc713-
This is the last event for this resource, which I think is an OS::Nova::Server resource. We don't list resources recursively in the CI error path, so it's hard to be sure; we should fix that.
Looking at the heat logs:
2016-01-14 14:24:01.861 16286 INFO heat.engine.stack [-] Stack DELETE IN_PROGRESS (overcloud): Stack DELETE started
...
2016-01-14 14:35:00.891 16286 INFO heat.engine.stack [-] Stack DELETE COMPLETE (overcloud): Stack DELETE completed successfully
It appears the delete did actually succeed, but it took too long, so the job timed out.
11 minutes seems excessive: my local 2-node stacks delete in about 40 seconds, so there may be an issue specific to the HA job/stack to diagnose here.
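The 11 minute figure can be checked directly from the two heat engine log lines quoted above. A minimal sketch, assuming the log lines always begin with a 23-character `YYYY-MM-DD HH:MM:SS.mmm` timestamp; the helper names are made up for this example:

```python
from datetime import datetime

# The two heat.engine.stack lines quoted in this report.
LOG_LINES = [
    "2016-01-14 14:24:01.861 16286 INFO heat.engine.stack [-] "
    "Stack DELETE IN_PROGRESS (overcloud): Stack DELETE started",
    "2016-01-14 14:35:00.891 16286 INFO heat.engine.stack [-] "
    "Stack DELETE COMPLETE (overcloud): Stack DELETE completed successfully",
]


def parse_timestamp(line):
    """Parse the leading 'YYYY-MM-DD HH:MM:SS.mmm' timestamp of a log line."""
    return datetime.strptime(line[:23], "%Y-%m-%d %H:%M:%S.%f")


def delete_duration(lines):
    """Return the time between the DELETE started and completed lines."""
    started = next(l for l in lines if "Stack DELETE IN_PROGRESS" in l)
    completed = next(l for l in lines if "Stack DELETE COMPLETE" in l)
    return parse_timestamp(completed) - parse_timestamp(started)


duration = delete_duration(LOG_LINES)
print(duration)  # elapsed time as a timedelta
```

For these two lines the difference is just under 11 minutes (about 659 seconds), which matches the estimate above.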
This bug is > 365 days without activity. We are unsetting assignee and milestone and setting status to Incomplete in order to allow its expiry in 60 days.
If the bug is still valid, then update the bug status.