kolla-ansible

Comment 2 for bug 1834191

Mark Goddard (mgoddard) wrote on 2019-06-26:

Another issue was found. During the upgrade, both Rocky and Stein MariaDB containers can be running at the same time. In Stein we switched from xtrabackup to mariabackup for the Galera state snapshot transfer (SST), which means that Stein and Rocky containers cannot sync with each other. I didn't hit this locally, but it was seen in CI. Here are the relevant error messages from the primary node at that time; note that the donor fails because the wsrep_sst_mariabackup script is not found:

2019-06-25 19:05:50 140049019632384 [Note] WSREP: sst_donor_thread signaled with 0
2019-06-25 19:05:50 140044555761408 [Note] WSREP: async IST sender starting to serve tcp://10.209.96.149:4568 sending 8524-8557
sh: wsrep_sst_mariabackup: command not found
2019-06-25 19:05:50 140044564154112 [ERROR] WSREP: Failed to read from: wsrep_sst_mariabackup --role 'donor' --address '10.209.96.149:4444/xtrabackup_sst//1' --socket '/var/lib/mysql/mysql.sock' --datadir '/var/lib/mysql/' --binlog 'mysql-bin' --gtid 'b99680cf-9773-11e9-b90b-e6ca413f0ef1:8523' --gtid-domain-id '0' --bypass
2019-06-25 19:05:50 140044564154112 [ERROR] WSREP: Process completed with error: wsrep_sst_mariabackup --role 'donor' --address '10.209.96.149:4444/xtrabackup_sst//1' --socket '/var/lib/mysql/mysql.sock' --datadir '/var/lib/mysql/' --binlog 'mysql-bin' --gtid 'b99680cf-9773-11e9-b90b-e6ca413f0ef1:8523' --gtid-domain-id '0' --bypass: 2 (No such file or directory)
2019-06-25 19:05:50 140044564154112 [ERROR] WSREP: Command did not run: wsrep_sst_mariabackup --role 'donor' --address '10.209.96.149:4444/xtrabackup_sst//1' --socket '/var/lib/mysql/mysql.sock' --datadir '/var/lib/mysql/' --binlog 'mysql-bin' --gtid 'b99680cf-9773-11e9-b90b-e6ca413f0ef1:8523' --gtid-domain-id '0' --bypass
2019-06-25 19:05:50 140049068836608 [Warning] WSREP: 1.0 (secondary1): State transfer to 0.0 (secondary2) failed: -2 (No such file or directory)

http://logs.openstack.org/63/667363/3/check/kolla-ansible-centos-source-upgrade-ceph/479cd15/secondary1/logs/kolla/mariadb/mariadb.txt.gz#_2019-06-25_19_05_50
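
To spot this failure in other CI runs, the two telltale lines from the log above can be picked out mechanically. The following is a minimal sketch, not part of kolla-ansible: the function name find_sst_failures and the two regular expressions are assumptions modelled only on the messages quoted in this comment.

```python
import re

# Shell error printed when an SST script is missing (format taken from the log above).
MISSING_SCRIPT = re.compile(r"^sh: (wsrep_sst_\w+): command not found$")
# Galera's summary line for a failed state transfer to a joiner.
FAILED_TRANSFER = re.compile(
    r"State transfer to \S+ \((?P<joiner>[^)]+)\) failed: (?P<code>-?\d+)"
)


def find_sst_failures(log_text):
    """Return (missing_scripts, failed_transfers) found in a mariadb log.

    Hypothetical helper for illustration; it only recognises the two
    message shapes quoted in this bug comment.
    """
    missing = []
    failed = []
    for line in log_text.splitlines():
        line = line.strip()
        match = MISSING_SCRIPT.match(line)
        if match:
            missing.append(match.group(1))
            continue
        match = FAILED_TRANSFER.search(line)
        if match:
            failed.append((match.group("joiner"), int(match.group("code"))))
    return missing, failed


sample = """\
sh: wsrep_sst_mariabackup: command not found
2019-06-25 19:05:50 140049068836608 [Warning] WSREP: 1.0 (secondary1): State transfer to 0.0 (secondary2) failed: -2 (No such file or directory)
"""
missing, failed = find_sst_failures(sample)
print(missing)  # ['wsrep_sst_mariabackup']
print(failed)   # [('secondary2', -2)]
```

A missing wsrep_sst_mariabackup script together with a failed transfer is the pattern described here; either line alone may have other causes.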

I think we need to shut down all nodes and perform a recovery in this case.
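
The comment does not spell out the recovery procedure. As a rough illustration of the usual Galera approach (bootstrap from the node with the most advanced committed state), here is a minimal sketch that compares the seqno field of each node's grastate.dat. The helper names, node names and sequence numbers are hypothetical, and this is not a description of what kolla-ansible itself does.

```python
def parse_grastate(text):
    """Parse the 'key: value' lines of a grastate.dat file into a dict."""
    state = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and the '# GALERA saved state' header
        key, _, value = line.partition(":")
        state[key.strip()] = value.strip()
    return state


def pick_bootstrap_node(grastates):
    """Return the node name with the highest seqno.

    grastates maps a node name to the contents of its grastate.dat.
    A seqno of -1 means the position was not saved (for example after an
    unclean shutdown) and has to be recovered by other means first.
    """
    seqnos = {}
    for node, text in grastates.items():
        seqno = int(parse_grastate(text).get("seqno", "-1"))
        if seqno < 0:
            raise ValueError(f"{node}: seqno unknown, recover it first")
        seqnos[node] = seqno
    return max(seqnos, key=seqnos.get)


# Made-up example data, for illustration only.
example = {
    "node-a": "# GALERA saved state\nversion: 2.1\nseqno:   120\n",
    "node-b": "# GALERA saved state\nversion: 2.1\nseqno:   118\n",
}
print(pick_bootstrap_node(example))  # node-a
```

The sketch raises an error instead of guessing when a node reports seqno -1, since choosing the wrong bootstrap node can discard committed transactions.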
