maas provider releases all nodes it did not allocate [does not play well with others]

Bug #1081247 reported by Scott Moser
This bug affects 2 people
Affects             Status        Importance  Assigned to      Milestone
MAAS                Invalid       Undecided   Unassigned
juju-core           Fix Released  Low         Julian Edwards
  1.16              Fix Released  High        Roger Peppe
pyjuju              Invalid       Wishlist    Unassigned
juju-core (Ubuntu)  Fix Released  Undecided   Unassigned
  Saucy             Fix Released  High        Unassigned
  Trusty            Fix Released  Undecided   Unassigned

Bug Description

[Impact]
juju destroy-environment destroys all machines allocated to the MAAS user being used in the environment, not just the ones owned by Juju.

[Test Case]
Allocate machines directly using maas-cli
juju bootstrap
juju destroy-environment
(before the fix, all machines allocated to the MAAS user, including those acquired outside Juju, are terminated and powered off)

[Regression Potential]
The fix is confined to the MAAS provider code, so any regression potential is limited in scope to that provider.

[Original Bug Report]
juju/agents/provision.py process_machines says:

 | """Ensure the currently running machines correspond to state.
 |
 | At the end of each process_machines execution, verify that all
 | running machines within the provider correspond to machine_ids within
 | the topology. If they don't then shut them down.
 |
 | Utilizes concurrent execution guard, to ensure that this is only being
 | executed at most once per process
 ...
 | # Terminate all unused juju machines running within the cluster.

This logic/description is clearly fundamentally flawed: it means that a given MAAS user cannot have more than one Juju environment on the same MAAS cluster.

It also means that if a user is using Juju, then they cannot deploy a node in *any* other way, or the Juju bootstrap node will kill it for them.

I did not explicitly check, but I would suspect/hope that this behavior is not the same as the EC2 provider; i.e., I do not expect that Juju kills all my running EC2 instances if I choose to type 'juju bootstrap'.
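To make the flaw concrete, here is a minimal sketch of the reconciliation behavior described in the quoted process_machines docstring. The function and field names here are illustrative, not the actual pyjuju code: everything the provider reports that the topology does not know about gets selected for shutdown.

```python
# Minimal sketch of the "terminate all unused machines" reconciliation
# quoted above. Names are hypothetical; this is not the real pyjuju code.

def unused_machines(provider_machines, topology_machine_ids):
    """Machines the provisioning agent would shut down: everything the
    provider reports that is not present in the topology."""
    return [m for m in provider_machines
            if m["instance_id"] not in topology_machine_ids]

running = [
    {"instance_id": "node-juju-0"},   # bootstrap node, in the topology
    {"instance_id": "node-manual"},   # allocated by hand via maas-cli
]

# Because the MAAS provider reports *all* nodes allocated to the user,
# the hand-allocated node is selected for shutdown too.
doomed = unused_machines(running, {"node-juju-0"})
```

The problem is not the reconciliation loop itself but its input: it is only safe when the provider listing is already scoped to the environment.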

Related bugs:
 * bug 1237398: "You'll need a separate MAAS key for each Juju environment" is wrong.
 * bug 1229275: juju destroy-environment also destroys nodes that are not controlled by juju
 * bug 1239488: Juju api client cannot distinguish between environments


Revision history for this message
Marco Ceppi (marcoceppi) wrote :

I can confirm this is not the case for EC2, as I'm able to run multiple Juju environments and non-Juju machines in the same zone.

Revision history for this message
Scott Moser (smoser) wrote :

Yeah, so on EC2 the provider's 'get_machines' uses 'group_name' to limit its search, and ignores instances not in the specific provisioning agent's group. In MAAS, it does not limit the result in any way, and just returns all ALLOCATED machines.

This is likely because MAAS provides no way to store a bit of information like 'group_name' that can later be checked.
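The difference Scott describes can be sketched as two listing functions, one per provider. These names and record fields are hypothetical illustrations, not the real pyjuju provider APIs: EC2 filters the listing by the environment's security group before the agent ever sees it, while the MAAS provider of the time returned every allocated node.

```python
# Hypothetical contrast between the two providers' machine listings.
# Function names and dict fields are illustrative only.

def ec2_get_machines(all_instances, group_name):
    # EC2 provider: restrict the listing to this environment's security
    # group, so instances outside the environment are invisible.
    return [i for i in all_instances if i["group"] == group_name]

def maas_get_machines(all_nodes):
    # MAAS provider at the time: no namespacing; every ALLOCATED node
    # owned by the MAAS user is returned.
    return [n for n in all_nodes if n["status"] == "ALLOCATED"]

nodes = [
    {"id": "juju-node", "group": "juju-envA", "status": "ALLOCATED"},
    {"id": "by-hand",   "group": "manual",    "status": "ALLOCATED"},
]

visible_ec2 = ec2_get_machines(nodes, "juju-envA")   # only the juju node
visible_maas = maas_get_machines(nodes)              # everything
```

Fed into the same reconciliation loop, the EC2-style listing spares the hand-allocated node; the MAAS-style listing dooms it.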

Revision history for this message
Kapil Thangavelu (hazmat) wrote : Re: [Bug 1081247] [NEW] maas provider terminates all unused systems

Juju on most providers tags the instances in an environment-specific manner;
i.e., for EC2 or OpenStack it uses a security group attached to all machines
of the environment. It only kills a launched machine that 'belongs' to
the environment but is not known to it. In the case of MAAS, I suspect if
this is an issue, it's because there is no namespacing of the instances to
the environment; MAAS only knows the global pool, segmented possibly by user.

-k


Scott Moser (smoser)
summary: - maas provider terminates all unused systems
+ maas provider unallocates all nodes it did not allocate [does not play
+ well with others]
summary: - maas provider unallocates all nodes it did not allocate [does not play
- well with others]
+ maas provider releases all nodes it did not allocate [does not play well
+ with others]
Revision history for this message
Scott Moser (smoser) wrote :

Added a MAAS task. It really seems that MAAS's allocate needs to return an id or take one.

Revision history for this message
Julian Edwards (julian-edwards) wrote :

MAAS's acquire() can already do that via constraints.

Changed in maas:
status: New → Invalid
Changed in juju:
importance: Undecided → Medium
Revision history for this message
Scott Moser (smoser) wrote :

Hm..
  I just noticed in the MAAS UI that it says:
"You'll need a separate MAAS key for each Juju environment."

Since you can have multiple keys, and multiple keys registered with the same user in the CLI, that is actually a reasonable solution.

I've tested just now, deploying with maas-cli using a second set of keys for the same MAAS user while a juju bootstrap node was running, and it has not terminated the instance yet.

Changed in juju:
importance: Medium → Low
Revision history for this message
Scott Moser (smoser) wrote :

I've marked this as "wishlist", as it really seems that Juju or MAAS do the right thing (I'm not sure how) if you use the same MAAS user with different keys.

Changed in juju:
importance: Low → Wishlist
Scott Moser (smoser)
description: updated
Curtis Hovey (sinzui)
Changed in juju:
status: New → Triaged
Curtis Hovey (sinzui)
tags: added: maas
Changed in juju-core:
status: New → Triaged
importance: Undecided → Low
Scott Moser (smoser)
description: updated
Changed in juju-core:
assignee: nobody → Julian Edwards (julian-edwards)
status: Triaged → In Progress
Curtis Hovey (sinzui)
Changed in juju-core:
milestone: none → 1.17.0
Curtis Hovey (sinzui)
Changed in juju-core:
status: In Progress → Fix Committed
Curtis Hovey (sinzui)
tags: added: maas-provider
removed: maas
Changed in juju:
status: Triaged → Invalid
Revision history for this message
Launchpad Janitor (janitor) wrote :

This bug was fixed in the package juju-core - 1.16.2-0ubuntu1

---------------
juju-core (1.16.2-0ubuntu1) trusty; urgency=low

  * New upstream point release.
    (LP: #1240709, #1240927, #1246320, #1246556, #1245004)
    (LP: #1081247, #1229275, #1239508, #1240423, #1241666, #1243861).
 -- James Page <email address hidden> Thu, 31 Oct 2013 21:22:45 +0000

Changed in juju-core (Ubuntu Trusty):
status: New → Fix Released
James Page (james-page)
description: updated
James Page (james-page)
Changed in juju-core (Ubuntu Saucy):
importance: Undecided → High
Revision history for this message
Brian Murray (brian-murray) wrote : Please test proposed package

Hello Scott, or anyone else affected,

Accepted juju-core into saucy-proposed. The package will build now and be available at http://launchpad.net/ubuntu/+source/juju-core/1.16.3-0ubuntu0.13.10.1 in a few hours, and then in the -proposed repository.

Please help us by testing this new package. See https://wiki.ubuntu.com/Testing/EnableProposed for documentation on how to enable and use -proposed. Your feedback will aid us in getting this update out to other Ubuntu users.

If this package fixes the bug for you, please add a comment to this bug, mentioning the version of the package you tested, and change the tag from verification-needed to verification-done. If it does not fix the bug for you, please add a comment stating that, and change the tag to verification-failed. In either case, details of your testing will help us make a better decision.

Further information regarding the verification process can be found at https://wiki.ubuntu.com/QATeam/PerformingSRUVerification . Thank you in advance!

Changed in juju-core (Ubuntu Saucy):
status: New → Fix Committed
tags: added: verification-needed
Revision history for this message
James Page (james-page) wrote :

Confirmed that destroying an environment only terminates machines allocated to that environment; the namespacing looks good.

tags: added: verification-done
removed: verification-needed
Revision history for this message
Launchpad Janitor (janitor) wrote :

This bug was fixed in the package juju-core - 1.16.3-0ubuntu0.13.10.1

---------------
juju-core (1.16.3-0ubuntu0.13.10.1) saucy-proposed; urgency=low

  * New upstream stable point release:
    - MAAS: juju destroy-environment also destroys nodes that are not
      controlled by juju/in different juju environments
      (LP: #1229275, #1081247).
    - MAAS: LXC container provisioning broken due to missing secrets
      in API calls (LP: #1246556).
    - MAAS: disambiguate use of environment uuid between state server
      and environment configuration (LP: #1240423).
    - local: provider fails to start due to missing setup of bootstrap
      storage (LP: #1240709).
    - local: local provider deploys fail due to inclusion of lxc package
      within LXC containers (LP: #1247299).
    - Azure: bootstrap fails due to old API version headers (LP: #1246320).
    - client: os.rename does not work on Windows (LP: #1240927).
    - simplestreams: cannot create simplestreams data for Ubuntu Trusty
      (LP: #1241666).
    - cloud-init: Embed full cloud-archive keyring in metadata to avoid
      calls to keyserver.ubuntu.com which fail in egress restricted
      data center environments (LP: #1243861).
    - core: regression - relation name regex is to restrictive (LP: #1245004).
 -- James Page <email address hidden> Thu, 21 Nov 2013 10:30:39 +0000

Changed in juju-core (Ubuntu Saucy):
status: Fix Committed → Fix Released
Revision history for this message
Stéphane Graber (stgraber) wrote : Update Released

The verification of this Stable Release Update has completed successfully and the package has now been released to -updates. Subsequently, the Ubuntu Stable Release Updates Team is being unsubscribed and will not receive messages about this bug report. In the event that you encounter a regression using the package from -updates, please report a new bug using ubuntu-bug and tag the bug report regression-update so we can easily find any regressions.

Curtis Hovey (sinzui)
Changed in juju-core:
status: Fix Committed → Fix Released