gate job fails when unable to start apache service during horizon setup

Bug #1590139 reported by John L. Villalovos
14
This bug affects 3 people
Affects Status Importance Assigned to Milestone
Ironic
Fix Released
Critical
Jay Faulkner
devstack
Invalid
Undecided
Unassigned
ironic-python-agent
Fix Released
Critical
Jay Faulkner

Bug Description

Seeing multiple jobs failing intermittently when horizon is being setup.

2016-06-07 16:12:51.384 | + ./stack.sh:main:1289 : start_horizon
2016-06-07 16:12:51.387 | + lib/horizon:start_horizon:180 : restart_apache_server
2016-06-07 16:12:51.389 | + lib/apache:restart_apache_server:188 : time_start restart_apache_server
2016-06-07 16:12:51.392 | + functions-common:time_start:2358 : local name=restart_apache_server
2016-06-07 16:12:51.395 | + functions-common:time_start:2359 : local start_time=
2016-06-07 16:12:51.398 | + functions-common:time_start:2360 : [[ -n '' ]]
2016-06-07 16:12:51.401 | ++ functions-common:time_start:2363 : date +%s
2016-06-07 16:12:51.405 | + functions-common:time_start:2363 : _TIME_START[$name]=1465315971
2016-06-07 16:12:51.407 | + lib/apache:restart_apache_server:189 : stop_service apache2
2016-06-07 16:12:51.410 | + functions-common:stop_service:2300 : '[' -x /bin/systemctl ']'
2016-06-07 16:12:51.413 | + functions-common:stop_service:2303 : sudo service apache2 stop
2016-06-07 16:12:51.431 | * Stopping web server apache2
2016-06-07 16:12:52.554 | *
2016-06-07 16:12:52.559 | + lib/apache:restart_apache_server:190 : sleep 3
2016-06-07 16:12:55.564 | + lib/apache:restart_apache_server:191 : start_service apache2
2016-06-07 16:12:55.567 | + functions-common:start_service:2290 : '[' -x /bin/systemctl ']'
2016-06-07 16:12:55.569 | + functions-common:start_service:2293 : sudo service apache2 start
2016-06-07 16:12:55.583 | * Starting web server apache2
2016-06-07 16:12:55.681 | AH00558: apache2: Could not reliably determine the server's fully qualified domain name, using 127.0.0.1. Set the 'ServerName' directive globally to suppress this message
2016-06-07 16:12:55.681 | (98)Address already in use: AH00072: make_sock: could not bind to address [::]:8088
2016-06-07 16:12:55.681 | (98)Address already in use: AH00072: make_sock: could not bind to address 0.0.0.0:8088
2016-06-07 16:12:55.681 | no listening sockets available, shutting down
2016-06-07 16:12:55.681 | AH00015: Unable to open logs
2016-06-07 16:12:55.692 | Action 'start' failed.
2016-06-07 16:12:55.692 | The Apache error log may have more information.
2016-06-07 16:13:15.725 | *
2016-06-07 16:13:15.725 | * The apache2 instance did not start within 20 seconds. Please read the log files to discover problems
2016-06-07 16:13:15.729 | + functions-common:start_service:1 : exit_trap
2016-06-07 16:13:15.732 | + ./stack.sh:exit_trap:480 : local r=1
2016-06-07 16:13:15.736 | ++ ./stack.sh:exit_trap:481 : jobs -p
2016-06-07 16:13:15.740 | + ./stack.sh:exit_trap:481 : jobs=18272
2016-06-07 16:13:15.742 | + ./stack.sh:exit_trap:484 : [[ -n 18272 ]]
2016-06-07 16:13:15.746 | + ./stack.sh:exit_trap:484 : [[ -n /opt/stack/new/devstacklog.txt.2016-06-07-155821 ]]
2016-06-07 16:13:15.750 | + ./stack.sh:exit_trap:484 : [[ True == \T\r\u\e ]]
2016-06-07 16:13:15.752 | + ./stack.sh:exit_trap:485 : echo 'exit_trap: cleaning up child processes'
2016-06-07 16:13:15.752 | exit_trap: cleaning up child processes
2016-06-07 16:13:15.755 | + ./stack.sh:exit_trap:486 : kill 18272
2016-06-07 16:13:15.755 | ./stack.sh: line 486: kill: (18272) - No such process

summary: - Gate jobs failing during Horizon setup
+ gate job fails when unable to start apache service during horizon setup
Revision history for this message
Jay Faulkner (jason-oldos) wrote :

https://review.openstack.org/#/c/326739/ proposed elastic-recheck for this bug.

Revision history for this message
Mario Villaplana (mario-villaplana-j) wrote :

This run failed, and it looks like the same problem described above, if an extra example is needed

http://logs.openstack.org/61/298461/12/check/gate-tempest-dsvm-ironic-agent_ssh/2aeea91/

Dmitry Tantsur (divius)
Changed in ironic:
status: New → Triaged
importance: Undecided → Critical
Revision history for this message
David Lyle (david-lyle) wrote :

The log indicates that port 8088 is unavailable, that is an ironic port. My assumption would be that an ironic thread did not exit cleanly when apache was stopped and thus the port is unavailable. I think the Horizon reference is a red herring, in that Horizon's config restarts apache too. I think any apache restart has the potential for causing this error. There are no instances of this in the the horizon jobs that I'm aware of.

Revision history for this message
John L. Villalovos (happycamp) wrote :

Seems somewhat related to this old bug:
https://bugs.launchpad.net/devstack/+bug/1342256

Makes me wonder if we need to increase our time between start and stop.

Revision history for this message
Jay Faulkner (jason-oldos) wrote :

After talking to the infra team, new zuul launchers use 8088 to stream as part of the work, after chat in #openstack-infra they'll be changing the port in use so our job will work unchanged.

Revision history for this message
Jay Faulkner (jason-oldos) wrote :
Revision history for this message
Jay Faulkner (jason-oldos) wrote :

The infra fix will take multiple days to take effect. I'm also pushing up a change to our devstack configuration in order to unbreak our gate sooner.

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix proposed to ironic (master)

Fix proposed to branch: master
Review: https://review.openstack.org/330832

Changed in ironic:
assignee: nobody → Jay Faulkner (jason-oldos)
status: Triaged → In Progress
Changed in ironic-python-agent:
status: New → In Progress
importance: Undecided → Critical
assignee: nobody → Jay Faulkner (jason-oldos)
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix proposed to ironic (stable/mitaka)

Fix proposed to branch: stable/mitaka
Review: https://review.openstack.org/330839

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix proposed to devstack (stable/liberty)

Fix proposed to branch: stable/liberty
Review: https://review.openstack.org/330846

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix merged to ironic (master)

Reviewed: https://review.openstack.org/330832
Committed: https://git.openstack.org/cgit/openstack/ironic/commit/?id=2898b622f7072995b0c778d2a762adba5afe75aa
Submitter: Jenkins
Branch: master

commit 2898b622f7072995b0c778d2a762adba5afe75aa
Author: Jay Faulkner <email address hidden>
Date: Thu Jun 16 17:35:40 2016 -0700

    Change port used for Ironic static http to 3928

    Due to recent infrastructure changes, infra began using port 8088. This
    caused a conflict with our apache config, causing services to not start.
    Infra is changing this port (Ie67bbba02dbf61a481f66001de3e0dede9448316)
    but it may take up to a week to take effect.

    Instead of having our gate broken for a week, I'm changing the default
    port. 3928 was not chosen at random; it's listed as the netboot-pxe
    service port in /etc/services.

    Change-Id: Ifdcf3c82fd9ee64c1548e47dba4c78c2347959e0
    Closes-bug: 1590139

Changed in ironic:
status: In Progress → Fix Released
Changed in devstack:
status: New → Invalid
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Change abandoned on devstack (master)

Change abandoned by Vasyl Saienko (<email address hidden>) on branch: master
Review: https://review.openstack.org/329908

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix merged to ironic (stable/mitaka)

Reviewed: https://review.openstack.org/330839
Committed: https://git.openstack.org/cgit/openstack/ironic/commit/?id=f66a437acaf8674df77fa5ddeb8f547ba2638c5d
Submitter: Jenkins
Branch: stable/mitaka

commit f66a437acaf8674df77fa5ddeb8f547ba2638c5d
Author: Jay Faulkner <email address hidden>
Date: Thu Jun 16 17:35:40 2016 -0700

    Change port used for Ironic static http to 3928

    Due to recent infrastructure changes, infra began using port 8088. This
    caused a conflict with our apache config, causing services to not start.
    Infra is changing this port (Ie67bbba02dbf61a481f66001de3e0dede9448316)
    but it may take up to a week to take effect.

    Instead of having our gate broken for a week, I'm changing the default
    port. 3928 was not chosen at random; it's listed as the netboot-pxe
    service port in /etc/services.

    Change-Id: Ifdcf3c82fd9ee64c1548e47dba4c78c2347959e0
    Co-Authored-By: John L. Villalovos <email address hidden>
    Closes-bug: 1590139
    (cherry picked from commit 2898b622f7072995b0c778d2a762adba5afe75aa)

tags: added: in-stable-mitaka
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix merged to devstack (stable/liberty)

Reviewed: https://review.openstack.org/330846
Committed: https://git.openstack.org/cgit/openstack-dev/devstack/commit/?id=a943831c8dacbd84dc97d1baa5859d71be713458
Submitter: Jenkins
Branch: stable/liberty

commit a943831c8dacbd84dc97d1baa5859d71be713458
Author: Jay Faulkner <email address hidden>
Date: Thu Jun 16 18:40:24 2016 -0700

    Change port used for Ironic static http to 3928

    Due to recent infrastructure changes, infra began using port 8088. This
    caused a conflict with our apache config, causing services to not start.
    Infra is changing this port (Ie67bbba02dbf61a481f66001de3e0dede9448316)
    but it may take up to a week to take effect.

    Instead of having our gate broken for a week, I'm changing the default
    port. 3928 was not chosen at random; it's listed as the netboot-pxe
    service port in /etc/services.

    Change-Id: Ibb635555dc516e54f8d46e2dba0f9cecfa5dd1b3
    Closes-bug: 1590139
    Co-Authored-By: John L. Villalovos <email address hidden>
    (cherry picked from openstack/ironic commit
    2898b622f7072995b0c778d2a762adba5afe75aa)

tags: added: in-stable-liberty
Revision history for this message
Doug Hellmann (doug-hellmann) wrote : Fix included in openstack/ironic 5.1.2

This issue was fixed in the openstack/ironic 5.1.2 release.

Changed in ironic-python-agent:
status: In Progress → Fix Released
Revision history for this message
Thierry Carrez (ttx) wrote : Fix included in openstack/ironic 6.0.0

This issue was fixed in the openstack/ironic 6.0.0 release.

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/ironic 5.1.2

This issue was fixed in the openstack/ironic 5.1.2 release.

To post a comment you must log in.
This report contains Public information  
Everyone can see this information.

Other bug subscribers

Remote bug watches

Bug watches keep track of this bug in other bug trackers.