Loadbalancers should be rescheduled when a LBaaS agent goes offline

Bug #1565511 reported by Nir Magnezi
Affects: octavia
Status: Fix Released
Importance: Wishlist
Assigned to: Nir Magnezi

Bug Description

Currently, when an LBaaS agent goes offline, its loadbalancers remain assigned to that agent.
Following logic similar to 'allow_automatic_l3agent_failover', the neutron server should reschedule loadbalancers away from dead LBaaS agents.

This behaviour should likewise be controlled by a configuration option, e.g. allow_automatic_lbaas_agent_failover.
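
For illustration, here is a minimal sketch (an assumption, not code from the proposal) of how such a toggle could be registered with oslo.config, mirroring allow_automatic_l3agent_failover and defaulting to off:

    # Hypothetical registration sketch: option name taken from this RFE, default kept False.
    from oslo_config import cfg

    OPTS = [
        cfg.BoolOpt('allow_automatic_lbaas_agent_failover',
                    default=False,
                    help='Automatically reschedule loadbalancers from dead '
                         'LBaaS agents to alive ones.'),
    ]

    cfg.CONF.register_opts(OPTS)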

Tags: lbaas
Changed in neutron:
assignee: nobody → Nir Magnezi (nmagnezi)
status: New → In Progress
Assaf Muller (amuller)
tags: added: lbaas
Nir Magnezi (nmagnezi)
description: updated
Nir Magnezi (nmagnezi)
summary: - Loadbalancers should be rescheduled when a LBaaS agent goes offline
+ [RFE] Loadbalancers should be rescheduled when a LBaaS agent goes
+ offline
Revision history for this message
Nir Magnezi (nmagnezi) wrote : Re: [RFE] Loadbalancers should be rescheduled when a LBaaS agent goes offline

Reverting the "status" to New as per the devref detailing new RFEs (to make clear this has not been decided upon yet).

Changed in neutron:
status: In Progress → New
Nir Magnezi (nmagnezi)
tags: added: rfe
Changed in neutron:
status: New → In Progress
Nir Magnezi (nmagnezi)
Changed in neutron:
status: In Progress → New
Akihiro Motoki (amotoki)
Changed in neutron:
importance: Undecided → Wishlist
status: New → Confirmed
Changed in neutron:
status: Confirmed → In Progress
Revision history for this message
Armando Migliaccio (armando-migliaccio) wrote :

This is pretty straightforward. Since we have similar patterns for routers and DHCP services, and so long as the same precautions are used to implement this enhancement (keep it switched off by default), this can be treated as a regular bug fix IMO.

summary: - [RFE] Loadbalancers should be rescheduled when a LBaaS agent goes
- offline
+ Loadbalancers should be rescheduled when a LBaaS agent goes offline
tags: removed: rfe
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Related fix proposed to neutron (master)

Related fix proposed to branch: master
Review: https://review.openstack.org/315074

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Related fix merged to neutron (master)

Reviewed: https://review.openstack.org/315074
Committed: https://git.openstack.org/cgit/openstack/neutron/commit/?id=d9d3c1c6b9e720c85649c10509971fb59f99dab9
Submitter: Jenkins
Branch: master

commit d9d3c1c6b9e720c85649c10509971fb59f99dab9
Author: Nir Magnezi <email address hidden>
Date: Wed May 11 16:22:43 2016 +0300

    Generalise the logic of resource auto rescheduling

    The logic of reschedule_routers_from_down_agents() can be useful for
    additional types of resource rescheduling such as loadbalancers, as described at
    Id8d3218bf1e52722cc10ddcd34e3e734eef90658

    Related-Bug: #1565511

    Change-Id: I7871d68246d82f343e99730c09f81bcc7800bcce
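
As a rough illustration only (hypothetical helper names, not the actual neutron implementation), the generalised pattern amounts to: find agents whose heartbeat is older than agent_down_time, then move each of their resources to a live agent:

    import time

    AGENT_DOWN_TIME = 75  # seconds; mirrors the semantics of neutron's agent_down_time option

    def reschedule_resources_from_down_agents(agents, get_resources, reschedule):
        """Move resources away from agents with stale heartbeats.

        agents:        iterable of dicts with 'id' and 'heartbeat_timestamp' (epoch seconds)
        get_resources: callable(agent_id) -> resource ids currently bound to that agent
        reschedule:    callable(resource_id) -> rebinds the resource to a live agent
        """
        cutoff = time.time() - AGENT_DOWN_TIME
        for agent in agents:
            if agent['heartbeat_timestamp'] >= cutoff:
                continue  # heartbeat is recent enough; the agent is considered alive
            for resource_id in get_resources(agent['id']):
                reschedule(resource_id)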

Revision history for this message
Nir Magnezi (nmagnezi) wrote :

This is no longer WIP, ready for reviews: https://review.openstack.org/#/c/299998/

Changed in neutron:
assignee: Nir Magnezi (nmagnezi) → Assaf Muller (amuller)
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Related fix merged to neutron-lbaas (master)

Reviewed: https://review.openstack.org/366690
Committed: https://git.openstack.org/cgit/openstack/neutron-lbaas/commit/?id=7cd34433057897219b01ea2871ed7eba9a034d27
Submitter: Jenkins
Branch: master

commit 7cd34433057897219b01ea2871ed7eba9a034d27
Author: Nir Magnezi <email address hidden>
Date: Wed Sep 7 07:12:27 2016 -0400

    Adds neutron_lbaas.conf and services_lbaas.conf to q-svc command line

    When q-lbaasv2 is enabled in local.conf, the neutron-lbaas plugin.sh
    script creates new services_lbaas.conf and neutron_lbaas.conf files
    with some config parameters.

    Under several circumstances, some of the options in those files are
    needed by other neutron daemons, such as the q-svc service.

    This patch modifies the neutron-lbaas devstack plugin to include the
    above-mentioned config files in the q-svc command line, by adding those
    files to Q_PLUGIN_EXTRA_CONF_FILES.

    Since both config files are shipped with neutron-lbaas, both should be
    included. Starting from Ocata, the service provider option won't be
    loaded by q-svc automatically, so that is another good incentive to
    pass it with --config-file.

    Closes-Bug: #1619466
    Related-Bug: #1565511

    Change-Id: I652ab029b7427c8783e4b2f0443a89ee884bf064
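
To see why those files must be on the q-svc command line, here is a small standalone sketch (not neutron code; it assumes the option lives in [DEFAULT]) showing that oslo.config only picks up values from files explicitly passed via --config-file:

    import tempfile

    from oslo_config import cfg

    opt = cfg.BoolOpt('allow_automatic_lbaas_agent_failover', default=False)

    # Stand-in for /etc/neutron/neutron_lbaas.conf
    with tempfile.NamedTemporaryFile('w', suffix='.conf', delete=False) as f:
        f.write('[DEFAULT]\nallow_automatic_lbaas_agent_failover = True\n')
        lbaas_conf = f.name

    # Started without the extra --config-file: the option keeps its default.
    without = cfg.ConfigOpts()
    without.register_opts([opt])
    without(args=[])
    print(without.allow_automatic_lbaas_agent_failover)  # False

    # Started with the extra --config-file: the configured value is honoured.
    with_conf = cfg.ConfigOpts()
    with_conf.register_opts([opt])
    with_conf(args=['--config-file', lbaas_conf])
    print(with_conf.allow_automatic_lbaas_agent_failover)  # True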

Changed in neutron:
assignee: Assaf Muller (amuller) → Nir Magnezi (nmagnezi)
affects: neutron → octavia
Revision history for this message
Nir Magnezi (nmagnezi) wrote :

How to test (or, how I tested this)

First, in patchset 47 I changed[1] the allow_automatic_lbaas_agent_failover default to 'True' to verify that CI does not find anything wrong with it.
CI passed, and I then switched the default value back to False.

Second, for testing locally with devstack (as I did), you'll need at least two devstack nodes:
1. Main node, with all neutron services including the LBaaSv2 agent. (find local.conf here[2])
2. Secondary node, with ovs and LBaaSv2 agents. (local.conf here[3])

In addition, for successful stacking of the secondary node, you'll have to comment out the following lines:
https://github.com/openstack/neutron-lbaas/blob/master/devstack/plugin.sh#L114-L116

Lastly, once stacked, set allow_automatic_lbaas_agent_failover=True in /etc/neutron/neutron_lbaas.conf on both nodes.
Restart q-svc (neutron-server) and start the lbaasv2-agent:
$ /usr/local/bin/neutron-lbaasv2-agent --config-file /etc/neutron/neutron.conf --config-file /etc/neutron/neutron_lbaas.conf --config-file /etc/neutron/services/loadbalancer/haproxy/lbaas_agent.ini & echo $! >/opt/stack/status/stack/q-lbaasv2.pid; fg || echo "q-lbaasv2 failed to start" | tee "/opt/stack/status/stack/q-lbaasv2.failure"

[1] https://review.openstack.org/#/c/299998/47/neutron_lbaas/services/loadbalancer/plugin.py@58
[2] https://github.com/nmagnezi/nmagnezi_devstack/blob/master/lbaas_haproxy_namespace_main.local.conf
[3] https://github.com/nmagnezi/nmagnezi_devstack/blob/master/lbaas_haproxy_namespace_secondary.local.conf

The end result should look like this:

$ neutron agent-list
+--------------------------------------+----------------------+-------------------+-------------------+-------+----------------+---------------------------+
| id                                   | agent_type           | host              | availability_zone | alive | admin_state_up | binary                    |
+--------------------------------------+----------------------+-------------------+-------------------+-------+----------------+---------------------------+
| 2af49b85-7a55-4420-97e0-186c233cce08 | Open vSwitch agent   | haproxy-devstack1 |                   | :-)   | True           | neutron-openvswitch-agent |
| 2d81c836-2f85-47c2-9cdc-665aa796e977 | DHCP agent           | haproxy-devstack1 | nova              | :-)   | True           | neutron-dhcp-agent        |
| 58fa7369-ea35-4663-ae34-97518e847741 | Open vSwitch agent   | haproxy-devstack2 |                   | :-)   | True           | neutron-openvswitch-agent |
| 7b665b9d-4c7e-4da1-a37a-1007af6444fc | Loadbalancerv2 agent | haproxy-devstack1 |                   | :-)   | True           | neutron-lbaasv2-agent     |
| 88f4c436-7152-4d30-a9e8-a793750bcbba | Loadbalancerv2 agent | haproxy-devstack2 |                   | :-)   | True           | neutron-lbaasv2-agent     |
| de6640a1-17a7-4ceb-986a-3b0de3b8845e | Metadata agent       | haproxy-devstack1 |                   | :-)   | True           | neutron-metadata-agent    |
| e4f77843-48e9-43af-a1af-884c07714416 | L3 agent             | haproxy-devstack1 | nova              | :-)   | True           | neutron-l3-agent          |
+--------------------------------------+----------------------+-------------------+-------------------+-------+----------------+---------------------------+

Revision history for this message
Nir Magnezi (nmagnezi) wrote :

Easier to view the last comment in: http://paste.openstack.org/show/591851/

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/neutron-lbaas 10.0.0.0b3

This issue was fixed in the openstack/neutron-lbaas 10.0.0.0b3 development milestone.

Nir Magnezi (nmagnezi)
Changed in octavia:
status: In Progress → Fix Released