[EDGY] rgmanager does not switch services correctly when failover domain is configured

Bug #61854 reported by Fabio Massimo Di Nitto
Affects                        Status        Importance  Assigned to             Milestone
Red Hat Cluster                Won't Fix     Medium
redhat-cluster-suite (Ubuntu)  Fix Released  High        Fabio Massimo Di Nitto

Bug Description

This is the last regression I can see when upgrading from Dapper, and it is also tracked upstream.

Fabio

In , Fabio (fabio-redhat-bugs) wrote :

Version-Release number of selected component (if applicable):

latest CVS checkout as of 2006/08/23 at 10am UTC

How reproducible:

99.9% of the time.

Steps to Reproduce:
1. set up a 2-node cluster
2. configure one failover domain with one shared IP service
3. start the cluster; make sure the cluster has formed and is quorate
4. start rgmanager on both nodes
5. make sure the shared IP service is started and both rgmanager instances are running properly
6. stop rgmanager on the node that owns the service
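
For readers trying to reproduce step 2, here is a minimal sketch of what the resource manager section of cluster.conf might look like for one failover domain holding one shared IP service. It is not the reporter's actual configuration. The node names, domain name, service name and address are hypothetical placeholders, and the element and attribute names assume the usual rgmanager layout (`rm`, `failoverdomains`, `service`, `ip`), so check them against your installed version.

```python
import xml.etree.ElementTree as ET


def build_rm_section(nodes, domain="fd1", service="sharedip",
                     address="192.168.0.100"):
    """Build an <rm> element with one failover domain and one IP service.

    All names and the address are hypothetical placeholders, not values
    taken from the bug report.
    """
    rm = ET.Element("rm")

    # One unordered, unrestricted failover domain containing every node.
    domains = ET.SubElement(rm, "failoverdomains")
    fd = ET.SubElement(domains, "failoverdomain",
                       name=domain, ordered="0", restricted="0")
    for node in nodes:
        ET.SubElement(fd, "failoverdomainnode", name=node, priority="1")

    # One service bound to that domain, holding a single shared IP address.
    svc = ET.SubElement(rm, "service",
                        name=service, domain=domain, autostart="1")
    ET.SubElement(svc, "ip", address=address, monitor_link="1")
    return rm


rm = build_rm_section(["node1", "node2"])
print(ET.tostring(rm, encoding="unicode"))
```

The printed fragment would go inside the `<cluster>` element of cluster.conf on both nodes. With a setup along these lines, stopping rgmanager on the owning node (step 6) is the point at which the service is expected to move to the other member of the domain.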

Actual results:

the service is not migrated to the other node

Expected results:

the service should migrate to the other node

Additional info:

Only once did the service actually migrate,
although nothing was done differently from the other times.

No special messages are left in daemons.log.

Lon, if you need anything like logs or configs, please let me know.
The cluster is running ipv4 all over. No ipv6 involved yet.

Fabio

In , Lon (lon-redhat-bugs) wrote :

After being unable to reproduce this, I just noticed that I need a failover
domain. ;)

Could you post your config when you get a chance?

In , Fabio (fabio-redhat-bugs) wrote :
Changed in redhat-cluster-suite:
assignee: nobody → fabbione
importance: Untriaged → High
status: Unconfirmed → Confirmed
Changed in redhatcluster:
status: Unknown → In Progress
Fabio Massimo Di Nitto (fabbione) wrote :

This fix unfortunately requires a UVF exception for openais and the redhat-cluster-suite. The root cause is a set of bugs in the messaging system that are fixed in recent versions. Backporting the changes would be more complicated (and more error-prone) than breaking UVF. Note that the only openais client at the moment is the cluster suite, so there is no danger of breaking other packages.

Fabio

Changed in redhat-cluster-suite:
status: Confirmed → Fix Committed
Fabio Massimo Di Nitto (fabbione) wrote :

Comments on the commits are in the changelog header.

Fabio Massimo Di Nitto (fabbione) wrote :

These bugs were fixed by Lon and me during the BSP we held together. All of them have been fixed and tested on 3 clusters (one of mine and 2 of Lon's). The patches were also reviewed by other Red Hat people who were participating with us.

There are other small changes in cman and groupd to accommodate the new openais, but they are only 2 or 3 one-liners to make everybody happy.

Fabio

Matt Zimmerman (mdz) wrote : Re: [Bug 61854] Re: [EDGY] rgmanager does not switch services correctly when failover domain is configured

On Thu, Sep 28, 2006 at 05:16:09AM -0000, Fabio Massimo Di Nitto wrote:
> These bugs have been fixed by me and Lon in the BSP we had together. All
> of them have been fixed and tested on 3 clusters (one mine and 2 from
> Lon). Patches eyeballed also by other RH's guys that were participating
> with us.
>
> There are other small changes in cman and groupd to accommodate the new
> openais but they are only 2/3 one liners to make everybody happy.

Looks fine.

--
 - mdz

Fabio Massimo Di Nitto (fabbione) wrote :

Fix uploaded.

Changed in redhat-cluster-suite:
status: Fix Committed → Fix Released
Changed in redhatcluster:
status: In Progress → Rejected
In , Nate (nate-redhat-bugs) wrote :

Moving all RHCS ver 5 bugs to RHEL 5 so we can remove RHCS v5, which never existed.

Changed in redhatcluster:
status: Invalid → Fix Released
Changed in redhatcluster:
importance: Unknown → Medium
status: Fix Released → Won't Fix