Revisions after 3.9/stable 177 and 3.8/stable 176 have the rabbitmq-server perpetually stuck in waiting state

Bug #2030524 reported by Tim Andersson
Affects: OpenStack RabbitMQ Server Charm
Status: Expired
Importance: Undecided
Assigned to: Unassigned

Bug Description

Environment defined by this:
https://code.launchpad.net/~ubuntu-release/autopkgtest-cloud/+git/autopkgtest-cloud/+ref/master

Deployment etc described here:
https://autopkgtest-cloud.readthedocs.io/en/latest/

With revisions after what's described above, a mojo run in the environment leaves Juju timing out in the waiting state. juju status shows:

```
rabbitmq-server/8* waiting idle 62 185.125.191.10 5672/tcp,15672/tcp Not reached target cluster-partition-handling mode
```

So I dug a little deeper, and the target cluster-partition-handling mode is actually reached:
```
sudo rabbitmqctl eval 'application:get_all_env(rabbit).' | grep cluster_partition_handling
 {cluster_partition_handling,ignore},
```

Which matches the conf file:
```
$ cat /etc/rabbitmq/rabbitmq.conf
collect_statistics_interval = 30000
mnesia_table_loading_retry_timeout = 30000
mnesia_table_loading_retry_limit = 10
cluster_partition_handling = ignore
```
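Since the charm's status message disagrees with the runtime value reported by rabbitmqctl, one sanity check is to compare the configured value against the runtime one directly. A minimal sketch of that comparison (the `read_conf_value` helper is illustrative, not the charm's actual check):

```python
# Hypothetical helper to read a key from RabbitMQ's sysctl-style conf file
# ("key = value" lines); an illustration, not the charm's own code.
def read_conf_value(conf_text, key):
    for line in conf_text.splitlines():
        line = line.strip()
        if line.startswith("#") or "=" not in line:
            continue
        k, _, v = line.partition("=")
        if k.strip() == key:
            return v.strip()
    return None

# Contents of /etc/rabbitmq/rabbitmq.conf as shown in this report:
conf = """\
collect_statistics_interval = 30000
mnesia_table_loading_retry_timeout = 30000
mnesia_table_loading_retry_limit = 10
cluster_partition_handling = ignore
"""

# The value rabbitmqctl eval reported at runtime in this bug:
runtime_mode = "ignore"

configured_mode = read_conf_value(conf, "cluster_partition_handling")
print(configured_mode)                  # -> ignore
print(configured_mode == runtime_mode)  # -> True
```

Both sides agree here, which is what makes the charm's "Not reached target" message look like a bug in the charm's own check rather than in the broker's configuration.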

For reproduction, try a mojo run with a revision after those detailed above, with cluster_partition_handling set to ignore and just one rabbitmq-server unit.
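Outside of mojo, a roughly equivalent single-unit deployment might look like the following sketch (the charm option name `cluster-partition-handling` is inferred from the status message above; verify it against the charm's config before relying on it):

```shell
# Hedged repro sketch: one rabbitmq-server unit from the 3.9/stable channel
# with partition handling set to ignore, mirroring the environment above.
juju deploy --series focal ch:rabbitmq-server --channel 3.9/stable -n 1 \
    --config cluster-partition-handling=ignore
```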

Openstack version:
openstack 0.3.0
Juju version:
2.9.42-ubuntu-amd64
Ubuntu release:
14.04 Trusty


Revision history for this message
Brian Murray (brian-murray) wrote :

The charm is running on the following server:

```
ubuntu@juju-806ee7-stg-proposed-migration-62:~$ apt-cache policy rabbitmq-server
rabbitmq-server:
  Installed: 3.8.2-0ubuntu1.4
  Candidate: 3.8.2-0ubuntu1.4
  Version table:
 *** 3.8.2-0ubuntu1.4 500
        500 http://prodstack-zone-1.clouds.archive.ubuntu.com/ubuntu focal-updates/main amd64 Packages
        100 /var/lib/dpkg/status
     3.8.2-0ubuntu1.3 500
        500 http://security.ubuntu.com/ubuntu focal-security/main amd64 Packages
     3.8.2-0ubuntu1 500
        500 http://prodstack-zone-1.clouds.archive.ubuntu.com/ubuntu focal/main amd64 Packages
ubuntu@juju-806ee7-stg-proposed-migration-62:~$ cat /etc/lsb-release
DISTRIB_ID=Ubuntu
DISTRIB_RELEASE=20.04
DISTRIB_CODENAME=focal
DISTRIB_DESCRIPTION="Ubuntu 20.04.6 LTS"
```

Revision history for this message
Felipe Reyes (freyes) wrote : Re: [Bug 2030524] [NEW] Revisions after 3.9/stable 177 and 3.8/stable 176 have the rabbitmqserver perpetually stuck in waiting state

On Mon, 2023-08-07 at 15:59 +0000, Tim Andersson wrote:
>
> ```
> rabbitmq-server/8*            waiting   idle   62       185.125.191.10   5672/tcp,15672/tcp
> Not reached target cluster-partition-handling mode
> ```

The bundle can be found at [0]; it uses "cs:rabbitmq-server" (charmstore). There was a recent change[1] that made "3.9" the default track[2].

When using "cs:rabbitmq-server" I get revision 118:

```
$ juju deploy --series focal cs:rabbitmq-server
Located charm "rabbitmq-server" in charm-store, revision 118
Deploying "rabbitmq-server" from charm-store charm "rabbitmq-server", revision 118 in channel stable on focal
```

Now, if we take the subject of this bug and use revision 177 (via Charmhub), a single-node rabbitmq-server gets deployed:

```
$ juju deploy --series focal ch:rabbitmq-server rabbitmq-server-ch
Located charm "rabbitmq-server" in charm-hub, revision 177
Deploying "rabbitmq-server-ch" from charm-hub charm "rabbitmq-server", revision 177 in channel 3.9/stable on focal

$ juju status rabbitmq-server-ch
Model   Controller   Cloud/Region             Version  SLA          Timestamp
rabbit  serverstack  serverstack/serverstack  2.9.44   unsupported  18:24:21-04:00

App                 Version  Status  Scale  Charm            Channel     Rev  Exposed  Message
rabbitmq-server-ch  3.8.2    active      1  rabbitmq-server  3.9/stable  177  no       Unit is ready

Unit                   Workload  Agent  Machine  Public address  Ports               Message
rabbitmq-server-ch/0*  active    idle   1        10.5.3.135      5672/tcp,15672/tcp  Unit is ready

Machine  State    Address     Inst id                               Series  AZ    Message
1        started  10.5.3.135  7b2c9957-acd9-455c-8dfe-fafde0ccbba1  focal   nova  ACTIVE
```

I wonder if there is a hidden race condition. If you see this issue again, please make sure to capture the /var/log/juju/ directory and attach it to this bug.

[0] https://git.launchpad.net/autopkgtest-cloud/tree/mojo/service-bundle#n252
[1] https://discourse.charmhub.io/t/request-please-change-the-default-channel-for-the-following-openstack-ovn-and-ceph-charms/11245
[2] https://charmhub.io/rabbitmq-server?channel=3.9/stable

Changed in charm-rabbitmq-server:
status: New → Incomplete
Revision history for this message
Launchpad Janitor (janitor) wrote :

[Expired for OpenStack RabbitMQ Server Charm because there has been no activity for 60 days.]

Changed in charm-rabbitmq-server:
status: Incomplete → Expired