[HEAT] Support anti-affinity for node processes in many node groups during scaling

Bug #1268610 reported by Alexander Ignatov
This bug affects 1 person
Affects   Status         Importance   Assigned to      Milestone
Sahara    Fix Released   Medium       Andrew Lazarev

Bug Description

The Heat engine does not allow users to define clusters with anti-affinity enabled for a node process that is shared across multiple node groups.
For example, a cluster with the following structure will either scale incorrectly or fail with an error during scaling:

{
    "name": "cluster",
    "plugin_name": "vanilla",
    "hadoop_version": "1.2.1",

    "anti_affinity": ["datanode"],

    "node_groups": [
        {
            "name": "master",
            "count": 1,
            "node_processes": ["namenode"],
             ...skipped...
        },
        {
            "name": "worker1",
            "node_processes": ["datanode"],
            "count": 2
        },
        {
            "name": "worker2",
            "node_processes": ["datanode"],
            "count": 2
        }
    ]
}

As a result, in the cluster structure above, "datanode" processes will not be scaled properly (up or down) after the cluster is provisioned.
The cause is the nova scheduler hint definitions in the heat templates, which have to be updated after each scaling operation. The existing heat template already contains scheduler hints for the provisioned instances of a cluster with anti-affinity enabled, so the template newly generated during scaling causes heat to remove and re-create instances that were not meant to be removed or added.
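
The effect described above can be illustrated with a small toy model. This is only a sketch, not Sahara's actual template generator: the function `build_hints`, the instance naming scheme, and the rule "each instance must avoid every instance listed before it" are all assumptions made for illustration. It uses two hypothetical worker groups, `worker1` and `worker2`, and shows that scaling the first group changes the hints of already provisioned instances in the second group, which is the kind of change that makes heat replace them.

```python
def build_hints(node_groups, process):
    """Toy model (assumption): map each instance name to the list of
    instances it must not share a host with, i.e. every instance that
    appears before it in template order."""
    hints = {}
    seen = []
    for group in node_groups:
        if process not in group["node_processes"]:
            continue
        for index in range(1, group["count"] + 1):
            name = f"{group['name']}-{index:03d}"  # hypothetical naming
            hints[name] = list(seen)
            seen.append(name)
    return hints


groups = [
    {"name": "worker1", "node_processes": ["datanode"], "count": 2},
    {"name": "worker2", "node_processes": ["datanode"], "count": 2},
]

before = build_hints(groups, "datanode")

# Scale worker1 from 2 to 3 instances and regenerate the hints.
groups[0]["count"] = 3
after = build_hints(groups, "datanode")

# Existing instances whose hints differ between the two templates.
changed = sorted(name for name in before if before[name] != after[name])
print(changed)
```

In this model none of the `worker2` instances were touched by the scaling request, yet their hints now reference the new `worker1` instance, so a template diff treats them as modified.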

Tags: engine.heat
Changed in savanna:
importance: High → Medium
Changed in savanna:
milestone: icehouse-3 → next
summary: - Support anti-affinity for node processes in many node groups during
- scaling
+ [HEAT]Support anti-affinity for node processes in many node groups
+ during scaling
summary: - [HEAT]Support anti-affinity for node processes in many node groups
+ [HEAT] Support anti-affinity for node processes in many node groups
during scaling
Changed in sahara:
milestone: next → juno-1
Changed in sahara:
milestone: juno-1 → juno-2
Changed in sahara:
milestone: juno-2 → juno-3
Changed in sahara:
assignee: nobody → Andrew Lazarev (alazarev)
OpenStack Infra (hudson-openstack) wrote : Fix proposed to sahara (master)

Fix proposed to branch: master
Review: https://review.openstack.org/112159

Changed in sahara:
status: Confirmed → In Progress
OpenStack Infra (hudson-openstack) wrote : Fix merged to sahara (master)

Reviewed: https://review.openstack.org/112159
Committed: https://git.openstack.org/cgit/openstack/sahara/commit/?id=4b3910f8c58f4b4d9714c70cadfd4c35040382a9
Submitter: Jenkins
Branch: master

commit 4b3910f8c58f4b4d9714c70cadfd4c35040382a9
Author: Andrew Lazarev <email address hidden>
Date: Mon Aug 25 17:48:20 2014 -0700

    Switched anti-affinity feature to server groups

    Switched implementation of anti affinity to server groups for both
    direct and heat engine.

    Direct engine change is backward compatible. Sahara will detect if
    old logic is used and will use the same logic for cluster scaling.

    Note, behavior of anti-affinity changed. Now Sahara will create one
    server group for cluster and will assign all affected instances to it.
    So, if anti-affinity enabled for datanode (`dn`) and tasktracker (`tt`)
    Sahara will not assign node with `dn` and node with `tt` to the same
    compute.

    Also note, that server group support will be added to heat only in
    juno-3. So, environment with up-to-date heat is required.

    Closes-Bug: #1268610
    Implements: blueprint anti-affinity-via-server-groups
    Change-Id: I501438d84f3a486dad30081b05933f59ebab4858
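
The server group approach described in the commit message can be sketched roughly as follows. This is a minimal illustration, not the code from the commit: the function `build_resources` and the resource name suffix `-aa-group` are hypothetical, and the server properties are reduced to the ones relevant here (a real `OS::Nova::Server` also needs a flavor, an image, and so on). It relies on the heat resource type `OS::Nova::ServerGroup` with an `anti-affinity` policy and on the nova `group` scheduler hint.

```python
def build_resources(cluster_name, instance_names):
    """Sketch (assumption): one server group per cluster, with every
    affected instance pointing at it through the same scheduler hint."""
    group_resource = f"{cluster_name}-aa-group"  # hypothetical name
    resources = {
        group_resource: {
            "type": "OS::Nova::ServerGroup",
            "properties": {"policies": ["anti-affinity"]},
        }
    }
    for name in instance_names:
        resources[name] = {
            "type": "OS::Nova::Server",
            "properties": {
                "name": name,
                # The hint is identical for every instance, so it does
                # not depend on which other instances exist.
                "scheduler_hints": {"group": {"get_resource": group_resource}},
            },
        }
    return resources


before = build_resources("cluster", ["worker1-001", "worker1-002", "worker2-001"])
after = build_resources(
    "cluster", ["worker1-001", "worker1-002", "worker1-003", "worker2-001"]
)

# Resources that existed before scaling are unchanged afterwards.
unchanged = all(before[name] == after[name] for name in before)
print(unchanged)
```

Because each instance refers only to the shared group and not to other instances, adding or removing an instance leaves the definitions of the remaining instances untouched.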

Changed in sahara:
status: In Progress → Fix Committed
Thierry Carrez (ttx)
Changed in sahara:
status: Fix Committed → Fix Released
Thierry Carrez (ttx)
Changed in sahara:
milestone: juno-3 → 2014.2