docker-puppet not always returning proper error code

Bug #1733402 reported by Michele Baldessari
6
This bug affects 1 person
Affects Status Importance Assigned to Milestone
tripleo
Fix Released
High
Michele Baldessari

Bug Description

So I was given an environment for some unrelated issue and I noticed that docker-puppet did not return error even though puppet did indeed fail. This was on a recent master deplyment with all the right --detailed-exitcodes patches.
1) All status_codes are successful:
$ grep status_code var/lib/heat-config/deployed/* |grep ": 0" |wc -l
17
$ grep status_code var/lib/heat-config/deployed/* |wc -l
17

2) But puppet did indeed fail in at least one docker-puppet run:

"Error: Evaluation Error: Error while evaluating a Function Call, wrong number of arguments (0 for 1+) at /etc/puppet/modules/tripleo/manifests/profile/base/rabbitmq.pp:123:8 on node overcloud-controller-0.localdomain",
"+ rc=1",
"+ set -e",
"+ '[' 1 -ne 2 -a 1 -ne 0 ']'",
"+ exit 1",
"2017-11-19 22:36:06,232 INFO: 21121 -- Finished processing puppet configs for rabbitmq",
"2017-11-19 22:36:09,261 INFO: 21122 -- Removing container: docker-puppet-horizon",
"2017-11-19 22:36:09,310 INFO: 21122 -- Finished processing puppet configs for horizon",
"2017-11-19 22:36:09,311 ERROR: 21119 -- ERROR configuring rabbitmq"

Since this seems to be a race, I wonder if the code somewhere in docker-puppet.py is racy when collecting the results from the parallel docker-puppet runs *or* if docker is not always returning the right error code back from the container?

Revision history for this message
Michele Baldessari (michele) wrote :

No this seems just the broken way we are doing stuff with the ansible log collection:
$ ansible --version |grep ^ansible
ansible 2.4.1.0
$ more playbooks/when_test.yaml
---
- hosts: localhost
  gather_facts: no
  tasks:
  - name: return code
    shell: |
      echo "fail 1"
      exit 2
    failed_when: false
    log_when: false
    register: outputs
  - debug:
      msg: "rc: {{ outputs.rc }}"
  - debug: var=(outputs.stderr|default('')).split('\n')|union(outputs.stdout_lines|default([]))
    when: outputs is defined
    failed_when: outputs|failed

Now when we run the playbook:
$ ansible-playbook playbooks/when_test.yaml
PLAY [localhost] *****************************************************************************************************************************************************************************

TASK [return code] ***************************************************************************************************************************************************************************
changed: [localhost]

TASK [debug] *********************************************************************************************************************************************************************************
ok: [localhost] => {
    "msg": "rc: 2"
}

TASK [debug] *********************************************************************************************************************************************************************************
ok: [localhost] => {
    "(outputs.stderr|default('')).split('\n')|union(outputs.stdout_lines|default([]))": [
        "",
        "fail 1"
    ],
    "failed_when_result": false
}

PLAY RECAP ***********************************************************************************************************************************************************************************
localhost : ok=3 changed=1 unreachable=0 failed=0

Notice how failed=0. It seems that our 'outputs|failed' does not work at all in the debug log collecting statements

Changed in tripleo:
assignee: nobody → Michele Baldessari (michele)
status: Triaged → In Progress
Revision history for this message
Michele Baldessari (michele) wrote :
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix merged to tripleo-heat-templates (master)

Reviewed: https://review.openstack.org/521646
Committed: https://git.openstack.org/cgit/openstack/tripleo-heat-templates/commit/?id=ed2b957a4f8682c8696b0e0417aad2b913832e2a
Submitter: Zuul
Branch: master

commit ed2b957a4f8682c8696b0e0417aad2b913832e2a
Author: Michele Baldessari <email address hidden>
Date: Mon Nov 20 20:48:50 2017 +0100

    Fix all outputs|failed and outputs is defined

    The ansible "failed_when" filter that uses a registered output
    of a previous task piped to the '|failed' filter does not work
    as expected. Given the following playbook:

      - name: return code
        shell: |
          echo "fail 2"
          exit 2
        failed_when: false
        log_when: false
        register: outputs
      - debug:
          msg: "rc: {{ outputs.rc }}"
      - debug: msg="Broken (does not fail as expected)"
        when: outputs is defined
        failed_when: outputs|failed
      - debug: msg="Working (fails as expected)"
        when: outputs is defined
        failed_when: outputs.rc != 0

    We obtain the following output:

    TASK [return code] ****
    changed: [localhost]

    TASK [debug] **********
    ok: [localhost] => {
        "msg": "rc: 2"
    }

    TASK [debug] **********
    ok: [localhost] => {
        "failed_when_result": false,
        "msg": "Broken (does not fail as expected)"
    }

    TASK [debug] **********
    fatal: [localhost]: FAILED! => {
        "failed_when_result": true,
        "msg": "Working (fails as expected)"
    }

    This means that the 'outputs|failed' just does not work at all.
    Let's move to a more explicit check on the rc code of the registered
    variable.

    We also need to fix all the "outputs is defined" checks, because
    when a task is skipped the registered outputs variable *is* actually
    defined as the following dictionary:
    {'skip_reason': u'Conditional result was False', 'skipped': True, 'changed': False}

    So we use "outputs.rc is defined" in order to make sure that the
    previous task did indeed run.

    Closes-Bug: #1733402

    Change-Id: I6ef53dc3f9aede42f10c7f110d24722355481261

Changed in tripleo:
status: In Progress → Fix Released
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix proposed to tripleo-heat-templates (stable/pike)

Fix proposed to branch: stable/pike
Review: https://review.openstack.org/521828

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix merged to tripleo-heat-templates (stable/pike)

Reviewed: https://review.openstack.org/521828
Committed: https://git.openstack.org/cgit/openstack/tripleo-heat-templates/commit/?id=b748d68a96f45b657ffc5a0477c374632171f3c9
Submitter: Zuul
Branch: stable/pike

commit b748d68a96f45b657ffc5a0477c374632171f3c9
Author: Michele Baldessari <email address hidden>
Date: Mon Nov 20 20:48:50 2017 +0100

    Fix all outputs|failed and outputs is defined

    The ansible "failed_when" filter that uses a registered output
    of a previous task piped to the '|failed' filter does not work
    as expected. Given the following playbook:

      - name: return code
        shell: |
          echo "fail 2"
          exit 2
        failed_when: false
        log_when: false
        register: outputs
      - debug:
          msg: "rc: {{ outputs.rc }}"
      - debug: msg="Broken (does not fail as expected)"
        when: outputs is defined
        failed_when: outputs|failed
      - debug: msg="Working (fails as expected)"
        when: outputs is defined
        failed_when: outputs.rc != 0

    We obtain the following output:

    TASK [return code] ****
    changed: [localhost]

    TASK [debug] **********
    ok: [localhost] => {
        "msg": "rc: 2"
    }

    TASK [debug] **********
    ok: [localhost] => {
        "failed_when_result": false,
        "msg": "Broken (does not fail as expected)"
    }

    TASK [debug] **********
    fatal: [localhost]: FAILED! => {
        "failed_when_result": true,
        "msg": "Working (fails as expected)"
    }

    This means that the 'outputs|failed' just does not work at all.
    Let's move to a more explicit check on the rc code of the registered
    variable.

    We also need to fix all the "outputs is defined" checks, because
    when a task is skipped the registered outputs variable *is* actually
    defined as the following dictionary:
    {'skip_reason': u'Conditional result was False', 'skipped': True, 'changed': False}

    So we use "outputs.rc is defined" in order to make sure that the
    previous task did indeed run.

    Closes-Bug: #1733402

    NB: Cherry-pick slightly changed due to missing kubernetes yaml file

    Change-Id: I6ef53dc3f9aede42f10c7f110d24722355481261
    (cherry picked from commit ed2b957a4f8682c8696b0e0417aad2b913832e2a)

tags: added: in-stable-pike
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/tripleo-heat-templates 8.0.0.0b2

This issue was fixed in the openstack/tripleo-heat-templates 8.0.0.0b2 development milestone.

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/tripleo-heat-templates 7.0.6

This issue was fixed in the openstack/tripleo-heat-templates 7.0.6 release.

To post a comment you must log in.
This report contains Public information  
Everyone can see this information.

Other bug subscribers

Remote bug watches

Bug watches keep track of this bug in other bug trackers.