Comment 14 for bug 1888546

OpenStack Infra (hudson-openstack) wrote on 2020-08-27: Fix merged to config (r/stx.3.0)

Reviewed: https://review.opendev.org/747124
Committed: https://git.openstack.org/cgit/starlingx/config/commit/?id=245023894acfb163b4ed73ccded72914550d982c
Submitter: Zuul
Branch: r/stx.3.0

commit 245023894acfb163b4ed73ccded72914550d982c
Author: Martin, Chen <email address hidden>
Date: Thu Aug 20 16:26:50 2020 +0800

Update mariadb-server suspect_timeout to default value to align
with garbd's suspect_timeout

    In openstack-helm-infra, the mariadb-etc configmap launches
    mariadb-server with evs.suspect_timeout=PT30S. That setting was
    intended for a deployment of three mariadb-server pods, each
    using the same suspect_timeout of 30s. After the change to two
    mariadb-server pods and one garbd arbitrator, the
    evs.suspect_timeout=PT30S setting in the mariadb-etc configmap
    only takes effect for the two mariadb-server pods; the garbd
    arbitrator uses the Galera default, evs.suspect_timeout=PT5S.
    If mariadb-server-1 exits abnormally, the garbd arbitrator
    suspects after 5s that mariadb-server-1 is dead, but because
    30s have not yet elapsed, mariadb-server-0 still considers
    mariadb-server-1 alive. In this state quorum fails, the garbd
    arbitrator and mariadb-server-0 both become non-primary
    components, and the service goes down.
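
The mismatch described above can be illustrated with a deliberately simplified sketch. It is a toy model, not Galera's actual EVS algorithm: it only assumes that each surviving member starts suspecting a silent peer once its own evs.suspect_timeout has elapsed. The function names and the dictionary layout are hypothetical; the timeout values are the ones quoted in this commit message.

```python
from datetime import timedelta

# Per-member evs.suspect_timeout values from the scenario in the commit message.
SUSPECT_TIMEOUT = {
    "mariadb-server-0": timedelta(seconds=30),  # PT30S from the mariadb-etc configmap
    "garbd": timedelta(seconds=5),              # Galera default, PT5S
}


def suspects_peer(member: str, elapsed: timedelta) -> bool:
    """Toy rule: a member suspects a silent peer once its own timeout has elapsed."""
    return elapsed >= SUSPECT_TIMEOUT[member]


def views_agree(elapsed: timedelta) -> bool:
    """Return True when all surviving members hold the same opinion of the failed peer."""
    opinions = {suspects_peer(member, elapsed) for member in SUSPECT_TIMEOUT}
    return len(opinions) == 1


def disagreement_window() -> timedelta:
    """Length of the interval during which the surviving members disagree."""
    timeouts = list(SUSPECT_TIMEOUT.values())
    return max(timeouts) - min(timeouts)


if __name__ == "__main__":
    for seconds in (2, 10, 30):
        elapsed = timedelta(seconds=seconds)
        print(seconds, "s after failure, views agree:", views_agree(elapsed))
    print("disagreement window:", disagreement_window())
```

Under this model the two survivors disagree between 5s and 30s after the failure. Setting both timeouts to the same value shrinks that window to zero, which is the effect the fix aims for.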
    To fix this, set value.conf.data.config_override to override
    wsrep_provider_options in the mariadb helm chart, so that the
    garbd arbitrator and mariadb-server both launch with the same
    setting, "evs.suspect_timeout=PT5S", which is the default value.
    This also improves mariadb-server recovery time. To change
    "evs.suspect_timeout" later, update the override in both the
    mariadb and garbd helm charts.
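
Since the timeout has to be kept in step across two charts, a small consistency check may help. The following is a minimal sketch, assuming the provider options are available as semicolon-separated "key=value" strings; the helper names are hypothetical and are not part of the mariadb or garbd helm charts. A missing key falls back to PT5S, the Galera default stated above.

```python
def parse_provider_options(options: str) -> dict:
    """Parse a semicolon-separated 'key=value' provider options string."""
    parsed = {}
    for item in options.split(";"):
        item = item.strip()
        if not item:
            continue  # skip empty segments such as a trailing ';'
        key, _, value = item.partition("=")  # split on the first '=' only
        parsed[key.strip()] = value.strip()
    return parsed


def timeouts_aligned(server_options: str, garbd_options: str,
                     key: str = "evs.suspect_timeout",
                     default: str = "PT5S") -> bool:
    """Return True when both members would run with the same value for `key`.

    A member that does not set `key` is assumed to use `default`.
    """
    server_value = parse_provider_options(server_options).get(key, default)
    garbd_value = parse_provider_options(garbd_options).get(key, default)
    return server_value == garbd_value


if __name__ == "__main__":
    # '<port>' is left as the placeholder used in the commit message.
    before = "gmcast.listen_addr=tcp://0.0.0.0:<port>; evs.suspect_timeout=PT30S"
    after = "gmcast.listen_addr=tcp://0.0.0.0:<port>; evs.suspect_timeout=PT5S"
    print("before fix aligned:", timeouts_aligned(before, ""))
    print("after fix aligned:", timeouts_aligned(after, ""))
```

The check compares plain strings, so "PT5S" and an equivalent duration written differently would be reported as a mismatch; that strictness is a design choice of this sketch.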

    The "gmcast.listen_addr=tcp://0.0.0.0:<port>" setting takes
    effect for both IPv4 and IPv6, so it is kept unchanged.

    Reference links for wsrep provider options and Galera Cluster quorum:
    https://mariadb.com/kb/en/wsrep_provider_options/
    https://galeracluster.com/library/documentation/weighted-quorum.html

Closes-Bug: 1888546

Change-Id: I92af77fab929c9f598b7dc41543db6ad6238f812
Signed-off-by: Martin, Chen <email address hidden>
