Another issue was found. During the upgrade, Rocky and Stein MariaDB containers can be running at the same time. In Stein we switched from xtrabackup to mariabackup for the Galera state sync, which means that Stein and Rocky containers cannot sync with each other. I didn't hit this locally, but it was seen in CI. Here are the relevant error messages from the primary node at that time:
2019-06-25 19:05:50 140049019632384 [Note] WSREP: sst_donor_thread signaled with 0
2019-06-25 19:05:50 140044555761408 [Note] WSREP: async IST sender starting to serve tcp://10.209.96.149:4568 sending 8524-8557
sh: wsrep_sst_mariabackup: command not found
2019-06-25 19:05:50 140044564154112 [ERROR] WSREP: Failed to read from: wsrep_sst_mariabackup --role 'donor' --address '10.209.96.149:4444/xtrabackup_sst//1' --socket '/var/lib/mysql/mysql.sock' --datadir '/var/lib/mysql/' --binlog 'mysql-bin' --gtid 'b99680cf-9773-11e9-b90b-e6ca413f0ef1:8523' --gtid-domain-id '0' --bypass
2019-06-25 19:05:50 140044564154112 [ERROR] WSREP: Process completed with error: wsrep_sst_mariabackup --role 'donor' --address '10.209.96.149:4444/xtrabackup_sst//1' --socket '/var/lib/mysql/mysql.sock' --datadir '/var/lib/mysql/' --binlog 'mysql-bin' --gtid 'b99680cf-9773-11e9-b90b-e6ca413f0ef1:8523' --gtid-domain-id '0' --bypass: 2 (No such file or directory)
2019-06-25 19:05:50 140044564154112 [ERROR] WSREP: Command did not run: wsrep_sst_mariabackup --role 'donor' --address '10.209.96.149:4444/xtrabackup_sst//1' --socket '/var/lib/mysql/mysql.sock' --datadir '/var/lib/mysql/' --binlog 'mysql-bin' --gtid 'b99680cf-9773-11e9-b90b-e6ca413f0ef1:8523' --gtid-domain-id '0' --bypass
2019-06-25 19:05:50 140049068836608 [Warning] WSREP: 1.0 (secondary1): State transfer to 0.0 (secondary2) failed: -2 (No such file or directory)
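To check other CI logs for the same failure, a rough sketch like the following could be used. It only looks for the shell's "command not found" line for an SST script, as seen in the log above; the function name and the sample lines are made up for illustration and are not part of kolla-ansible.

```python
import re

# Matches the shell error printed when the donor cannot find its SST script,
# e.g. "sh: wsrep_sst_mariabackup: command not found".
MISSING_SST = re.compile(r"^sh: (wsrep_sst_\w+): command not found$")


def find_missing_sst_scripts(log_lines):
    """Return the set of SST script names the log reports as missing.

    Hypothetical helper for illustration only.
    """
    missing = set()
    for line in log_lines:
        match = MISSING_SST.match(line.strip())
        if match:
            missing.add(match.group(1))
    return missing


if __name__ == "__main__":
    # Sample lines copied from the log excerpt above.
    sample = [
        "2019-06-25 19:05:50 140049019632384 [Note] WSREP: sst_donor_thread signaled with 0",
        "sh: wsrep_sst_mariabackup: command not found",
    ]
    print(find_missing_sst_scripts(sample))
```

An empty result only means this particular message is absent; it does not prove that the state transfer succeeded.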
http://logs.openstack.org/63/667363/3/check/kolla-ansible-centos-source-upgrade-ceph/479cd15/secondary1/logs/kolla/mariadb/mariadb.txt.gz#_2019-06-25_19_05_50
I think we need to shut down all nodes and perform a recovery in this case.