The undercloud-upgrades job is failing during the upgrade with "error was: 'ironic_api_short_bootstrap_node_name' is undefined"

Bug #1798525 reported by Marios Andreou
This bug affects 1 person
Affects: tripleo
Status: Fix Released
Importance: High
Assigned to: Marios Andreou

Bug Description

The undercloud-upgrades and containerized-undercloud-upgrades jobs are failing during the upgrade with "error was: 'ironic_api_short_bootstrap_node_name' is undefined". The trace looks like:

   2018-10-18 01:20:20 | TASK [set is_bootstrap_node fact] **********************************************
   2018-10-18 01:20:20 | fatal: [centos-7-vexxhost-sjc1-0003163573]: FAILED! => {"msg": "The task includes an option with an undefined variable. The error was: 'ironic_api_short_bootstrap_node_name' is undefined\n\nThe error appears to have been in '/home/zuul/undercloud-ansible-Rjecdu/Undercloud/upgrade_tasks.yaml': line 112, column 5, but may\nbe elsewhere in the file depending on the exact syntax problem.\n\nThe offending line appears to be:\n\n- block:\n - name: set is_bootstrap_node fact\n ^ here\n"}
   2018-10-18 01:20:20 |
   2018-10-18 01:20:20 | PLAY RECAP *********************************************************************
   2018-10-18 01:20:20 | centos-7-vexxhost-sjc1-0003163573 : ok=23 changed=8 unreachable=0 failed=1
   2018-10-18 01:20:20 |
   2018-10-18 01:20:20 | Exception: Upgrade failed
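When triaging many job logs for this kind of failure, it can help to pull the undefined variable name and the reported file location out of the Ansible message automatically. The following is a minimal sketch, not part of any TripleO tooling: the function name `parse_undefined_variable` is hypothetical, the regular expressions are derived only from the message wording in the trace above, and the sample string assumes the JSON-escaped `\n` sequences in the log have already been decoded.

```python
import re

# Wording taken from the failure message in the trace above.
UNDEFINED_VAR = re.compile(r"The error was: '(?P<var>[^']+)' is undefined")
# File and line hint that Ansible appends to the same message.
LOCATION = re.compile(r"have been in '(?P<path>[^']+)': line (?P<line>\d+), column \d+")


def parse_undefined_variable(message):
    """Return (variable, path, line) from an Ansible failure message, or None.

    Hypothetical helper for log triage; path and line are None when the
    message carries no location hint.
    """
    var_match = UNDEFINED_VAR.search(message)
    if var_match is None:
        return None
    loc_match = LOCATION.search(message)
    path = loc_match.group("path") if loc_match else None
    line = int(loc_match.group("line")) if loc_match else None
    return var_match.group("var"), path, line


# Decoded form of the "msg" field from the trace in this report.
sample = (
    "The task includes an option with an undefined variable. "
    "The error was: 'ironic_api_short_bootstrap_node_name' is undefined\n\n"
    "The error appears to have been in "
    "'/home/zuul/undercloud-ansible-Rjecdu/Undercloud/upgrade_tasks.yaml': "
    "line 112, column 5, but may\nbe elsewhere in the file"
)
result = parse_undefined_variable(sample)
print(result)
```

As the message itself warns, the reported line may not be the real source of the problem, so the extracted location is only a starting point.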

Examples at [1,2,3] - error coming from [4] and looks to be related to [5] which merged last night

[1] http://logs.openstack.org/75/610475/4/check/tripleo-ci-centos-7-undercloud-upgrades/e465609/logs/undercloud/home/zuul/undercloud_upgrade.log.txt.gz#_2018-10-17_23_24_58
[2] http://logs.openstack.org/45/560445/166/check/tripleo-ci-centos-7-undercloud-upgrades/496cefb/logs/undercloud/home/zuul/undercloud_upgrade.log.txt.gz
[3] http://logs.openstack.org/18/610018/9/check/tripleo-ci-centos-7-containerized-undercloud-upgrades/2042382/job-output.txt.gz#_2018-10-18_03_13_21_507390
[4] https://github.com/openstack/tripleo-heat-templates/blob/3760c074c2a34fa5a730e9b4f6fffc7aa582348b/docker/services/ironic-api.yaml#L161
[5] https://review.openstack.org/#/c/605430/

Tags: ci
Changed in tripleo:
assignee: nobody → Marios Andreou (marios-b)
description: updated
Quique Llorente (quiquell) wrote :

It was passing in the review. Maybe something is broken with build-test-package, so it didn't actually test the review?
http://logs.openstack.org/30/605430/5/check/tripleo-ci-centos-7-containerized-undercloud-upgrades/3ac93a2/job-output.txt.gz

OpenStack Infra (hudson-openstack) wrote : Related fix proposed to tripleo-heat-templates (master)

Related fix proposed to branch: master
Review: https://review.openstack.org/611518

Marios Andreou (marios-b) wrote :

posted the revert ^^^ but not sure if we need it yet... the error is pretty clear but why did it not fail on the review?

Quique Llorente (quiquell) wrote :
Changed in tripleo:
milestone: none → stein-1
Emilien Macchi (emilienm) wrote :
Marios Andreou (marios-b) wrote :
OpenStack Infra (hudson-openstack) wrote : Related fix merged to tripleo-heat-templates (master)

Reviewed: https://review.openstack.org/611518
Committed: https://git.openstack.org/cgit/openstack/tripleo-heat-templates/commit/?id=3cbaadd09c034629fb20e3c663ad64b3b468f77b
Submitter: Zuul
Branch: master

commit 3cbaadd09c034629fb20e3c663ad64b3b468f77b
Author: Marios Andreou <email address hidden>
Date: Thu Oct 18 07:20:45 2018 +0000

    Revert "Convert *tasks from bootstrap_nodeid to short_bootstrap_node_name"

    This reverts commit 52c1641e2c3ad5caeb70fc8a09f29eba6fe5b53d due to the related bug below

    Change-Id: I3f6d8adae1918d1d55fdecc09fed5e4b45ee46b9
    Related-Bug: 1798525

Sagi (Sergey) Shnaidman (sshnaidm) wrote :

removed alert as it doesn't fail now

tags: removed: alert
Ronelle Landy (rlandy) wrote :
Ronelle Landy (rlandy) wrote :

Possibly the change job is not failing in CI because update_containers is set to false.
Trying a patch out.

Ronelle Landy (rlandy) wrote :

Ignore comment above - ...
 "update_containers": "true",
in the change job

Ronelle Landy (rlandy) wrote :

Looking at the traces after merge that show the errors and the ones in the change job that do not, here is an observation ...

In job tripleo-ci-centos-7-undercloud-upgrades, "containerized_undercloud": false
In job tripleo-ci-centos-7-containerized-undercloud-upgrades, "containerized_undercloud": false.

#container_images_file = <None> in the undercloud.conf file

Looking at how the changes from a gating repo are included, they are added to a containers-prepare-parameter.yaml file as in: http://logs.openstack.org/30/605430/5/check/tripleo-ci-centos-7-undercloud-containers/5471e9d/logs/undercloud/home/zuul/containers-prepare-parameter.yaml.txt.gz.

and they are included in the undercloud.conf:

container_images_file = /home/zuul/containers-prepare-parameter.yaml

In https://github.com/openstack/tripleo-upgrade/blob/master/tasks/upgrade/configure_uc_containers.yml#L28, the container_images_file is only updated if there is a custom one, not the default that would have been included if containerized_undercloud was true, as in:

https://github.com/openstack/tripleo-quickstart-extras/blob/master/roles/undercloud-deploy/templates/undercloud.conf.j2#L294
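The difference between the two undercloud.conf states described above (option commented out versus pointing at containers-prepare-parameter.yaml) can be checked mechanically. This is a rough sketch under assumptions: the helper name `container_images_file` is hypothetical, the option is assumed to live in the `[DEFAULT]` section, and the two sample strings are reduced to the single line quoted in this comment rather than copied from a real job's config.

```python
import configparser


def container_images_file(conf_text):
    """Return the container_images_file value from undercloud.conf text.

    Hypothetical helper; returns None when the option is absent or
    commented out. Assumes the option sits in the [DEFAULT] section.
    """
    # interpolation=None so a literal '%' in any value is not interpreted.
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(conf_text)
    return parser.defaults().get("container_images_file")


# State observed in the upgrade jobs: the option is only a commented default.
commented_out = "[DEFAULT]\n#container_images_file = <None>\n"
# State where the gating changes are picked up via the prepare-parameter file.
explicit = (
    "[DEFAULT]\n"
    "container_images_file = /home/zuul/containers-prepare-parameter.yaml\n"
)

unset_value = container_images_file(commented_out)
set_value = container_images_file(explicit)
print(unset_value, set_value)
```

A check like this only reports whether the file is referenced; it says nothing about whether the images listed in it actually include the change under test.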

Changed in tripleo:
importance: Undecided → High
Changed in tripleo:
milestone: stein-1 → stein-2
Changed in tripleo:
milestone: stein-2 → stein-3
Changed in tripleo:
milestone: stein-3 → stein-rc1
Changed in tripleo:
milestone: stein-rc1 → train-1
Changed in tripleo:
milestone: train-1 → train-2
Changed in tripleo:
milestone: train-2 → train-3
Changed in tripleo:
milestone: train-3 → ussuri-1
Changed in tripleo:
status: Triaged → Fix Released