regression: rabbitmq cluster is broken with Juju 1.22

Bug #1483949 reported by Nobuto Murata
This bug report is a duplicate of: Bug #1486177: 3-node native rabbitmq cluster race.
This bug affects 1 person
Affects | Status | Importance | Assigned to | Milestone
rabbitmq-server (Juju Charms Collection) | New | Undecided | Unassigned |

Bug Description

[the latest revision charm-store.33]
$ juju deploy -n2 cs:trusty/rabbitmq-server
Added charm "cs:trusty/rabbitmq-server-33" to the environment.

$ juju run --service rabbitmq-server 'sudo rabbitmqctl cluster_status'
- MachineId: "1"
  Stdout: |
    Cluster status of node 'rabbit@juju-nobuto-machine-1' ...
    [{nodes,[{disc,['rabbit@juju-nobuto-machine-1']}]},
     {running_nodes,['rabbit@juju-nobuto-machine-1']},
     {partitions,[]}]
    ...done.
  UnitId: rabbitmq-server/0
- MachineId: "2"
  Stdout: |
    Cluster status of node 'rabbit@juju-nobuto-machine-2' ...
    [{nodes,[{disc,['rabbit@juju-nobuto-machine-2']}]},
     {running_nodes,['rabbit@juju-nobuto-machine-2']},
     {partitions,[]}]
    ...done.
  UnitId: rabbitmq-server/1

^^^ Not clustered properly: the two nodes don't know about each other.

[the previous revision charm-store.32]
$ juju deploy -n2 cs:trusty/rabbitmq-server-32 rabbitmq-server-previous
Added charm "cs:trusty/rabbitmq-server-32" to the environment.

$ juju run --service rabbitmq-server-previous 'sudo rabbitmqctl cluster_status'
- MachineId: "3"
  Stdout: |
    Cluster status of node 'rabbit@juju-nobuto-machine-3' ...
    [{nodes,[{disc,['rabbit@juju-nobuto-machine-3',
                    'rabbit@juju-nobuto-machine-4']}]},
     {running_nodes,['rabbit@juju-nobuto-machine-4',
                     'rabbit@juju-nobuto-machine-3']},
     {partitions,[]}]
    ...done.
  UnitId: rabbitmq-server-previous/0
- MachineId: "4"
  Stdout: |
    Cluster status of node 'rabbit@juju-nobuto-machine-4' ...
    [{nodes,[{disc,['rabbit@juju-nobuto-machine-3',
                    'rabbit@juju-nobuto-machine-4']}]},
     {running_nodes,['rabbit@juju-nobuto-machine-3',
                     'rabbit@juju-nobuto-machine-4']},
     {partitions,[]}]
    ...done.
  UnitId: rabbitmq-server-previous/1

^^^ properly clustered.
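
The comparison above can also be checked mechanically. The following is a minimal sketch, assuming `rabbitmqctl cluster_status` prints the Erlang-term format shown in this report (other RabbitMQ versions may format it differently). The helpers `running_nodes` and `is_clustered` are hypothetical names used for illustration only and are not part of the charm.

```python
import re


def running_nodes(status_output):
    """Return the node names listed in the running_nodes tuple."""
    match = re.search(r"\{running_nodes,\[(.*?)\]\}", status_output, re.DOTALL)
    if match is None:
        return []
    return re.findall(r"'([^']+)'", match.group(1))


def is_clustered(status_output, expected_size):
    """True if at least expected_size distinct nodes are running together."""
    return len(set(running_nodes(status_output))) >= expected_size


# Sample outputs copied from the report above.
BROKEN = """Cluster status of node 'rabbit@juju-nobuto-machine-1' ...
[{nodes,[{disc,['rabbit@juju-nobuto-machine-1']}]},
 {running_nodes,['rabbit@juju-nobuto-machine-1']},
 {partitions,[]}]
...done."""

WORKING = """Cluster status of node 'rabbit@juju-nobuto-machine-3' ...
[{nodes,[{disc,['rabbit@juju-nobuto-machine-3',
                'rabbit@juju-nobuto-machine-4']}]},
 {running_nodes,['rabbit@juju-nobuto-machine-4',
                 'rabbit@juju-nobuto-machine-3']},
 {partitions,[]}]
...done."""

print(is_clustered(BROKEN, 2))   # False: only one running node
print(is_clustered(WORKING, 2))  # True: both nodes are running
```

With `-n2`, each unit should report two running nodes; revision 33 reports one per unit.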

Tags: cpec
Nobuto Murata (nobuto) wrote :

Attached all-machines.log.gz.

I can reproduce this issue with both the OpenStack provider and the MAAS provider, using Juju 1.22.6-0ubuntu1~14.04.1.

Nobuto Murata (nobuto)
tags: added: cpec
Ante Karamatić (ivoks) wrote :

Have you configured min-cluster-size?

Nobuto Murata (nobuto) wrote :

> Have you configured min-cluster-size?

Not in the example above, but yes in my previous deployment that was affected.

My gut feeling is that some of the logic is wrong when leader election is unavailable: Juju 1.24 has leader election, 1.22 does not. According to all-machines.log with charm store revision 33, the cookie is never synced between nodes.

[hooks/rabbitmq_server_relations.py, lines 306-308]
    if is_elected_leader('res_rabbitmq_vip'):
        cookie = open(rabbit.COOKIE_PATH, 'r').read().strip()
        peer_store('cookie', cookie)

I did not test it intensively, but the logic above might be related.
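
To illustrate the suspicion: if no unit ever considers itself the leader, nobody publishes the cookie and the peers cannot join each other. Below is a minimal sketch of a deterministic fallback for environments without leader election, assuming unit names of the form `service/N`. The helpers `unit_number` and `is_fallback_leader` are hypothetical; this is neither the charm's actual code nor the fix in any proposed branch.

```python
def unit_number(unit_name):
    """Return the numeric suffix of a Juju unit name such as 'rabbitmq-server/0'."""
    return int(unit_name.split('/')[1])


def is_fallback_leader(local_unit, peer_units):
    """Treat the unit with the lowest unit number as the leader.

    Every unit evaluates the same rule on the same set of names, so exactly
    one unit (the lowest-numbered) ends up publishing the cookie.
    """
    local = unit_number(local_unit)
    return all(local < unit_number(peer) for peer in peer_units)


# Hypothetical two-unit deployment, as in the reproduction above.
print(is_fallback_leader('rabbitmq-server/0', ['rabbitmq-server/1']))  # True
print(is_fallback_leader('rabbitmq-server/1', ['rabbitmq-server/0']))  # False
```

A rule like this only helps if every unit sees the same peer list when it runs; if hooks fire before the peer relation is fully populated, more than one unit could briefly consider itself the leader.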

Nobuto Murata (nobuto) wrote :

It can be reproduced reliably even with min-cluster-size set. Attaching a Juju bundle for it.

Nobuto Murata (nobuto) wrote :

Ah, this scenario could be covered by the improved amulet test in the future:
https://code.launchpad.net/~1chb1n/charms/trusty/rabbitmq-server/next.amulet-fix-20-delay/+merge/266958

Nobuto Murata (nobuto) wrote :

lp:~thedac/charms/trusty/rabbitmq-server/native-cluster-race-fixes revno.107 is likely to fix this issue. Marking as a duplicate.
