The One And Only network is variously visible

Bug #1327406 reported by Mike Spreitzer
This bug affects 2 people
Affects                   Status        Importance  Assigned to     Milestone
OpenStack Compute (nova)  Fix Released  High        Mike Spreitzer
Icehouse                  Fix Released  High        Matt Riedemann

Bug Description

I am testing with the templates in https://review.openstack.org/#/c/97366/

I can create a stack. I can use `curl` to hit the webhooks to scale up and down the old-style group and to scale down the new-style group; those all work. What fails is hitting the webhook to scale up the new-style group. Here is a typescript showing the failure:

$ curl -X POST 'http://10.10.0.125:8000/v1/signal/arn%3Aopenstack%3Aheat%3A%3A39675672862f4bd08505bfe1283773e0%3Astacks%2Ftest4%2F3cd6160b-d8c5-48f1-a527-4c7df9205fc3%2Fresources%2FNewScaleUpPolicy?Timestamp=2014-06-06T19%3A45%3A27Z&SignatureMethod=HmacSHA256&AWSAccessKeyId=35678396d987432f87cda8e4c6cdbfb5&SignatureVersion=2&Signature=W3aJQ6SR7O5lLOxLEQndbzNB%2FUhefr1W7qO9zNZ%2BHVs%3D'

<ErrorResponse><Error><Message>The request processing has failed due to an internal error:Remote error: ResourceFailure Error: Nested stack UPDATE failed: Error: Resource CREATE failed: NotFound: No Network matching {'label': u'private'}. (HTTP 404)
[u'Traceback (most recent call last):\n', u' File "/opt/stack/heat/heat/engine/service.py", line 61, in wrapped\n return func(self, ctx, *args, **kwargs)\n', u' File "/opt/stack/heat/heat/engine/service.py", line 911, in resource_signal\n stack[resource_name].signal(details)\n', u' File "/opt/stack/heat/heat/engine/resource.py", line 879, in signal\n raise failure\n', u"ResourceFailure: Error: Nested stack UPDATE failed: Error: Resource CREATE failed: NotFound: No Network matching {'label': u'private'}. (HTTP 404)\n"].</Message><Code>InternalFailure</Code><Type>Server</Type></Error></ErrorResponse>

The original sin looks like this in the heat engine log:

2014-06-06 17:39:20.013 28692 DEBUG urllib3.connectionpool [req-2391a9ea-46d6-46f0-9a7b-cf999a8697e9 ] "GET /v2/39675672862f4bd08505bfe1283773e0/os-networks HTTP/1.1" 200 16 _make_request /usr/lib/python2.7/dist-packages/urllib3/connectionpool.py:415
2014-06-06 17:39:20.014 28692 ERROR heat.engine.resource [req-2391a9ea-46d6-46f0-9a7b-cf999a8697e9 None] CREATE : Server "my_instance" Stack "test1-new_style-qidqbd5nrk44-43e7l57kqf5w-4t3xdjrfrr7s" [20523269-0ebb-45b8-ad59-75f55607f3bd]
2014-06-06 17:39:20.014 28692 TRACE heat.engine.resource Traceback (most recent call last):
2014-06-06 17:39:20.014 28692 TRACE heat.engine.resource File "/opt/stack/heat/heat/engine/resource.py", line 383, in _do_action
2014-06-06 17:39:20.014 28692 TRACE heat.engine.resource handle())
2014-06-06 17:39:20.014 28692 TRACE heat.engine.resource File "/opt/stack/heat/heat/engine/resources/server.py", line 493, in handle_create
2014-06-06 17:39:20.014 28692 TRACE heat.engine.resource nics = self._build_nics(self.properties.get(self.NETWORKS))
2014-06-06 17:39:20.014 28692 TRACE heat.engine.resource File "/opt/stack/heat/heat/engine/resources/server.py", line 597, in _build_nics
2014-06-06 17:39:20.014 28692 TRACE heat.engine.resource network = self.nova().networks.find(label=label_or_uuid)
2014-06-06 17:39:20.014 28692 TRACE heat.engine.resource File "/opt/stack/python-novaclient/novaclient/base.py", line 194, in find
2014-06-06 17:39:20.014 28692 TRACE heat.engine.resource raise exceptions.NotFound(msg)
2014-06-06 17:39:20.014 28692 TRACE heat.engine.resource NotFound: No Network matching {'label': u'private'}. (HTTP 404)

Private debug logging reveals that in the scale-up case, the call to "GET /v2/{tenant-id}/os-networks HTTP/1.1" returns with response code 200 and an empty list of networks. Comparing with the corresponding call when the stack is being created shows no difference in the calls --- because the normal logging omits the headers --- even though the results differ (when the stack is being created, the result contains the correct list of networks). Turning on HTTP debug logging in the client reveals that the X-Auth-Token headers differ.

Revision history for this message
Mike Spreitzer (mike-spreitzer) wrote :

The token used to scale up is returned from this call:

2014-06-06 19:46:16.688 16427 DEBUG keystoneclient.session [req-96acd957-0dad-47b4-b0dd-739e83e58933 ] REQ: curl -i -X POST http://10.10.0.125:5000/v3/auth/tokens -H "Content-Type: application/json" -H "Accept: application/json" -H "User-Agent: python-keystoneclient" -d '{"auth": {"scope": {"OS-TRUST:trust": {"id": "f776872213bc4e63946e06db09c8f9fa"}}, "identity": {"password": {"user": {"domain": {"id": "default"}, "password": "<snip/>", "name": "heat"}}, "methods": ["password"]}}}' request /opt/stack/python-keystoneclient/keystoneclient/session.py:252

Revision history for this message
Mike Spreitzer (mike-spreitzer) wrote :

The headers of the response to that request convey the token and little else. The body of the response is as follows.

{"token": {"OS-TRUST:trust": {"impersonation": true, "trustee_user": {"id": "647309399c7b42c3b88f8a78a5d1f38f"}, "id": "f776872213bc4e63946e06db09c8f9fa", "trustor_user": {"id": "ce6f7a9e2ba44728a651a688c8f61049"}}, "methods": ["password"], "roles": [{"id": "d00c466a45c24f4caedeb226e47d7574", "name": "heat_stack_owner"}], "expires_at": "2014-06-06T20:46:16.777062Z", "project": {"domain": {"id": "default", "name": "Default"}, "id": "39675672862f4bd08505bfe1283773e0", "name": "admin"}, "catalog": [{"endpoints": [{"url": "http://10.10.0.125:8000/v1", "region": "RegionOne", "interface": "internal", "id": "5505a74767ee4589a775b86ac3304903"}, {"url": "http://10.10.0.125:8000/v1", "region": "RegionOne", "interface": "public", "id": "e0a506195dc74585b0fdec6fea5f3f50"}, {"url": "http://10.10.0.125:8000/v1", "region": "RegionOne", "interface": "admin", "id": "e304274856ce47bf98f063c2c0cbb61c"}], "type": "cloudformation", "id": "1dd2671af86f42febb3137cbf688537f", "name": "heat"}, {"endpoints": [{"url": "http://10.10.0.125:9292", "region": "RegionOne", "interface": "public", "id": "027bc57c01594b0abe8bc3c1ce8f8a94"}, {"url": "http://10.10.0.125:9292", "region": "RegionOne", "interface": "admin", "id": "53015f6b829b4b6d9da099fcf0db0bf3"}, {"url": "http://10.10.0.125:9292", "region": "RegionOne", "interface": "internal", "id": "647bf22563ac4406a26e2f6cee087548"}], "type": "image", "id": "208219466706419f9d7452e65cdec204", "name": "glance"}, {"endpoints": [{"url": "http://10.10.0.125:3333", "region": "RegionOne", "interface": "admin", "id": "642d6c2483234637bbaf600249bad0a2"}, {"url": "http://10.10.0.125:3333", "region": "RegionOne", "interface": "public", "id": "7e618d37a5054c4f9b108d3bf6cf2038"}, {"url": "http://10.10.0.125:3333", "region": "RegionOne", "interface": "internal", "id": "b389541a905e43e49aedc570c168be2d"}], "type": "s3", "id": "6d8ccc39e4da4673a9c18e7a80af0674", "name": "s3"}, {"endpoints": [{"url": "http://10.10.0.125:8777/", "region": "RegionOne", "interface": "admin", "id": "1564b8ff67fd4335b26d59ff76aad6e7"}, {"url": "http://10.10.0.125:8777/", "region": "RegionOne", "interface": "public", "id": "96258685ea704e6c877a0b940a43798d"}, {"url": "http://10.10.0.125:8777/", "region": "RegionOne", "interface": "internal", "id": "97d13bccb76e44fe89036f40d0038e17"}], "type": "metering", "id": "a19399f0d76f433fb0b66cb83be0e4e3", "name": "ceilometer"}, {"endpoints": [{"url": "http://10.10.0.125:35357/v2.0", "region": "RegionOne", "interface": "admin", "id": "1d4861c564e3472c8d3bcc4fb079f72a"}, {"url": "http://10.10.0.125:5000/v2.0", "region": "RegionOne", "interface": "internal", "id": "5d3ff55db4a04b16b152a89a285f830b"}, {"url": "http://10.10.0.125:5000/v2.0", "region": "RegionOne", "interface": "public", "id": "9e705cee479146188e53a6a6a2743bc0"}], "type": "identity", "id": "b5e280d77b1c4d658284f26bae7ad89f", "name": "keystone"}, {"endpoints": [{"url": "http://10.10.0.125:8774/v2/39675672862f4bd08505bfe1283773e0", "region": "RegionOne", "interface": "public", "id": "406c147497414d769b76e2a6abb01fad"}, {"url": "http://10.10.0.125:877...


Revision history for this message
Mike Spreitzer (mike-spreitzer) wrote :

Let me take it from the top.

Use DevStack to install OpenStack. Use nova networking, not Neutron. Include the line

IMAGE_URLS+=,http://cloud.fedoraproject.org/fedora-20.x86_64.qcow2

in your localrc.

This will produce a system with one network; in the DB this network's "label" is "private" and its project_id is null. There will be an image named fedora-20.x86_64.

Import a key pair, and adjust the default security group to your liking.

Git clone the heat-templates project, then `cd` into it. Then

cd hot
export OS_USERNAME=admin
export OS_PASSWORD=${your chosen admin password}
export HOST_IP=${IPv4 addr of host}
export OS_AUTH_URL=http://${HOST_IP}:5000/v2.0/
export OS_TENANT_NAME=admin
heat stack-create -f asg_sampler.yaml -P key_name=${your keypair} ${some stack name}

For fun (and, possibly, relevant edification) you might compare the output of these three commands:

nova network-show private
nova net private
nova net ${UUID of that network}

I find that `nova network-show private` finds the network and prints a lot of information about it, while `nova net private` fails to find the network; when given the network's UUID, `nova net` finds the network and prints just a little information about it.

Once the stack is created, look at its outputs. You will find one named "old_up_url", whose value is a long URL. Send a POST command to that. E.g.,

curl -X POST "${the value of old_up_url}"

You will find that this succeeds.

You will find another stack output named "new_up_url". POST to it. E.g.,

curl -X POST "${the value of new_up_url}"

Curl will report something like the following.

<ErrorResponse><Error><Message>The request processing has failed due to an internal error:Remote error: ResourceFailure Error: Nested stack UPDATE failed: Error: Resource CREATE failed: NotFound: No Network matching {'label': u'private'}. (HTTP 404)
[u'Traceback (most recent call last):\n', u' File "/opt/stack/heat/heat/engine/service.py", line 61, in wrapped\n return func(self, ctx, *args, **kwargs)\n', u' File "/opt/stack/heat/heat/engine/service.py", line 911, in resource_signal\n stack[resource_name].signal(details)\n', u' File "/opt/stack/heat/heat/engine/resource.py", line 879, in signal\n raise failure\n', u"ResourceFailure: Error: Nested stack UPDATE failed: Error: Resource CREATE failed: NotFound: No Network matching {'label': u'private'}. (HTTP 404)\n"].</Message><Code>InternalFailure</Code><Type>Server</Type></Error></ErrorResponse>

Revision history for this message
Mike Spreitzer (mike-spreitzer) wrote :

Following is how it goes wrong. I will work backwards.

Remember that in the problematic case we have two levels of stack nesting: the ASG sampler is the outermost stack; nested in that is a new-style ASG; each member of that is a nested stack made from vm_with_cinder.yaml. The problem comes when an innermost stack asks Nova to create a new instance (via the OS::Nova::Server resource type).

nova.db.sqlalchemy.api's model_query method interprets the project_only keyword argument ONLY IF nova.context.is_user_context(context) is true. If project_only=True, nova.context.is_user_context(context) is true, and the context's project_id is not null, then the network (whose own project_id is null) is NOT found. If nova.context.is_user_context(context) is false, then the network IS found.
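
To illustrate, here is a simplified sketch of that filtering logic. This is not the actual nova.db.sqlalchemy.api code; the signature and SQLAlchemy details are approximated, but the project_only behavior matches what is described above.

from sqlalchemy import or_
from nova.context import is_user_context


def model_query(context, model, session, project_only=False):
    # Simplified sketch of how project_only narrows the query.
    query = session.query(model)
    if project_only and is_user_context(context):
        if project_only == 'allow_none':
            # Match rows owned by the caller's project OR unassigned rows
            # (project_id is NULL), e.g. the DevStack "private" network.
            # (== None is intentional; it becomes "IS NULL" in SQL.)
            query = query.filter(or_(model.project_id == context.project_id,
                                     model.project_id == None))
        else:
            # project_only=True: only rows owned by the caller's project,
            # so a network whose project_id is NULL is never returned.
            query = query.filter_by(project_id=context.project_id)
    return query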

In the Nova API process, when scaling up the new-style ASG, the call to list all the networks gets a context whose to_dict() returns

{'project_name': u'admin', 'timestamp': '2014-06-10T07:48:34.689693', 'auth_token': '${ATZ}', 'remote_address': '10.10.0.125', 'is_admin': False, 'user': u'ce6f7a9e2ba44728a651a688c8f61049', 'service_catalog': [{u'endpoints': [{u'adminURL': u'http://10.10.0.125:8776/v1/39675672862f4bd08505bfe1283773e0', u'region': u'RegionOne', u'internalURL': u'http://10.10.0.125:8776/v1/39675672862f4bd08505bfe1283773e0', u'publicURL': u'http://10.10.0.125:8776/v1/39675672862f4bd08505bfe1283773e0'}], u'type': u'volume', u'name': u'cinder'}], 'read_deleted': 'no', 'user_id': u'ce6f7a9e2ba44728a651a688c8f61049', 'roles': [u'heat_stack_owner'], 'tenant': u'39675672862f4bd08505bfe1283773e0', 'request_id': 'req-50f06804-f24e-4916-9d63-cdf0834cd412', 'instance_lock_checked': False, 'project_id': u'39675672862f4bd08505bfe1283773e0', 'user_name': u'admin'}

where user ce6f7a9e2ba44728a651a688c8f61049 is admin, and project 39675672862f4bd08505bfe1283773e0 is admin. Note the "'is_admin': False" entry. This causes nova.context.is_user_context(context) to be true. Together with the non-nullity of the context's project_id, this causes the network to not be found.
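
For reference, nova.context.is_user_context looks roughly like this (paraphrased from the Icehouse-era source, so treat it as approximate):

def is_user_context(context):
    """Indicate whether the request context is from a normal (non-admin) user."""
    if not context:
        return False
    if context.is_admin:
        return False
    if not context.user_id or not context.project_id:
        return False
    return True

With 'is_admin': False and both user and project set, as in the dict above, this returns True, so the project_only filtering applies.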

The auth token I called ${ATZ} was put in that network listing request by the heat engine. The heat engine got ${ATZ} by asking Keystone to create an auth token from a trust. Following is some debug logging from the heat engine.

2014-06-10 07:48:33.149 16427 DEBUG keystoneclient.session [req-eebbc879-2667-49dc-b885-5f61f38fc179 ] REQ: curl -i -X POST http://10.10.0.125:5000/v3/auth/tokens -H "Content-Type: application/json" -H "Accept: application/json" -H "User-Agent: python-keystoneclient" -d '{"auth": {"scope": {"OS-TRUST:trust": {"id": "044d71255a8c41258c7986f3c2ebc238"}}, "identity": {"password": {"user": {"domain": {"id": "default"}, "password": "${the password}", "name": "heat"}}, "methods": ["password"]}}}' request /opt/stack/python-keystoneclient/keystoneclient/session.py:252

2014-06-10 07:48:33.150 16427 INFO urllib3.connectionpool [req-eebbc879-2667-49dc-b885-5f61f38fc179 ] Starting new HTTP connection (1): 10.10.0.125

2014-06-10 07:48:33.153 16427 DEBUG urllib3.connectionpool [req-eebbc879-2667-49dc-b885-5f61f38fc179 ] Setting read timeout to None _make_request /usr/lib/python2.7/dist-packages/urllib3/connectionpool.py:375

2014-...

Revision history for this message
Mike Spreitzer (mike-spreitzer) wrote :

For comparison purposes, following are some notes on how the old-style ASG is successfully scaled up.

When the webhook is hit, the heat engine uses the same trust as in the failure case to get a token.
Following is heat engine debug logging about that.

2014-06-10 07:47:01.168 16427 DEBUG keystoneclient.session [req-240539df-e1d6-49a2-bd40-da4e37eeadd7 ] REQ: curl -i -X POST http://10.10.0.125:5000/v3/auth/tokens -H "Content-Type: application/json" -H "Accept: application/json" -H "User-Agent: python-keystoneclient" -d '{"auth": {"scope": {"OS-TRUST:trust": {"id": "044d71255a8c41258c7986f3c2ebc238"}}, "identity": {"password": {"user": {"domain": {"id": "default"}, "password": "tempusFugitive", "name": "heat"}}, "methods": ["password"]}}}' request /opt/stack/python-keystoneclient/keystoneclient/session.py:252
2014-06-10 07:47:01.169 16427 INFO urllib3.connectionpool [req-240539df-e1d6-49a2-bd40-da4e37eeadd7 ] Starting new HTTP connection (1): 10.10.0.125
2014-06-10 07:47:01.169 16427 DEBUG urllib3.connectionpool [req-240539df-e1d6-49a2-bd40-da4e37eeadd7 ] Setting read timeout to None _make_request /usr/lib/python2.7/dist-packages/urllib3/connectionpool.py:375
2014-06-10 07:47:01.287 16427 DEBUG urllib3.connectionpool [req-240539df-e1d6-49a2-bd40-da4e37eeadd7 ] "POST /v3/auth/tokens HTTP/1.1" 201 6351 _make_request /usr/lib/python2.7/dist-packages/urllib3/connectionpool.py:415
2014-06-10 07:47:01.287 16427 DEBUG keystoneclient.session [req-240539df-e1d6-49a2-bd40-da4e37eeadd7 ] RESP: [201] CaseInsensitiveDict({'x-subject-token': '${ATW}', 'vary': 'X-Auth-Token', 'content-length': '6351', 'content-type': 'application/json', 'date': 'Tue, 10 Jun 2014 07:47:01 GMT'})
RESP BODY: {"token": {"OS-TRUST:trust": {"impersonation": true, "trustee_user": {"id": "647309399c7b42c3b88f8a78a5d1f38f"}, "id": "044d71255a8c41258c7986f3c2ebc238", "trustor_user": {"id": "ce6f7a9e2ba44728a651a688c8f61049"}}, "methods": ["password"], "roles": [{"id": "d00c466a45c24f4caedeb226e47d7574", "name": "heat_stack_owner"}], "expires_at": "2014-06-10T08:47:01.256569Z", "project": {"domain": {"id": "default", "name": "Default"}, "id": "39675672862f4bd08505bfe1283773e0", "name": "admin"}, "catalog": [{"endpoints": [{"url": "http://10.10.0.125:8000/v1", "region": "RegionOne", "interface": "internal", "id": "5505a74767ee4589a775b86ac3304903"}, {"url": "http://10.10.0.125:8000/v1", "region": "RegionOne", "interface": "public", "id": "e0a506195dc74585b0fdec6fea5f3f50"}, {"url": "http://10.10.0.125:8000/v1", "region": "RegionOne", "interface": "admin", "id": "e304274856ce47bf98f063c2c0cbb61c"}], "type": "cloudformation", "id": "1dd2671af86f42febb3137cbf688537f", "name": "heat"}, {"endpoints": [{"url": "http://10.10.0.125:9292", "region": "RegionOne", "interface": "public", "id": "027bc57c01594b0abe8bc3c1ce8f8a94"}, {"url": "http://10.10.0.125:9292", "region": "RegionOne", "interface": "admin", "id": "53015f6b829b4b6d9da099fcf0db0bf3"}, {"url": "http://10.10.0.125:9292", "region": "RegionOne", "interface": "internal", "id": "647bf22563ac4406a26e2f6cee087548"}], "type": "image", "id": "208219466706419f9d7452e65cdec204", "name": "glance"}, {"endpoints": [{"url": "http://10.10.0.125:333...


Revision history for this message
Mike Spreitzer (mike-spreitzer) wrote :

Assuming that when a nova network's project_id field in the database is null this means the network is public, I would think the bug is in Nova's nova/network/api.py, where class API has the following method:

    @wrap_check_policy
    def get_all(self, context):
        """Get all the networks.

        If it is an admin user, api will return all the networks,
        if it is a normal user, api will only return the networks which
        belong to the user's project.
        """
        try:
            return self.db.network_get_all(context, project_only=True)
        except exception.NoNetworksFound:
            return

The call on network_get_all should pass project_only="allow_none" instead of True.
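
In other words, the patched method would look roughly like this, within the same API class. This is a sketch of the suggested one-line change, not the actual patch (the review I submitted is linked in a later comment):

    @wrap_check_policy
    def get_all(self, context):
        """Get all the networks visible to this context."""
        try:
            # 'allow_none' also matches networks whose project_id is NULL,
            # i.e. networks not assigned to any tenant.
            return self.db.network_get_all(context, project_only="allow_none")
        except exception.NoNetworksFound:
            return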

Revision history for this message
Mike Spreitzer (mike-spreitzer) wrote :

This filtering according to project was introduced in the fix to https://bugs.launchpad.net/nova/+bug/1186867 --- that is, https://review.openstack.org/#/c/31481/

The discussion in that bug does not consider the case of a network that should be visible to users in all projects.

Revision history for this message
Mike Spreitzer (mike-spreitzer) wrote :

I submitted a fix for review, see https://review.openstack.org/#/c/99068/

I do not know why a comment about that was not automatically added here.

Tracy Jones (tjones-i)
tags: added: network
Matt Riedemann (mriedem)
Changed in nova:
status: New → In Progress
assignee: nobody → Mike Spreitzer (mike-spreitzer)
summary: - OS::Heat::AutoScalingGroup scale up fails to find networks
+ publicly-visible nova network not always visible
no longer affects: heat
summary: - publicly-visible nova network not always visible
+ The One And Only network is variously visible
Revision history for this message
Mike Spreitzer (mike-spreitzer) wrote :

I suppose I am not sure where the problem really is.

Let's start with the intent for the installation done by DevStack. Consider that one and only network created by DevStack for nova networking. Is that network:

(1) initially in a strange mode where it is intended for use only by admin users, awaiting dedication to some ordinary project,

or

(2) initially in a state where it is intended for use by every user?

Put another way, this is a Nova question: when a network's project_id is null in the database, does this mean the network is hiding or does it mean the network should be visible to all tenants?

I guessed the answer is (2), for a number of reasons. One is that the alternative means DevStack has created an installation that is not ready for use by any but administrative users, which seems bad. Another reason is that (2) is supported by experiment. When I set my OS_ environment variables so that both user and project are demo, `nova boot ... --nic net-id=${id of that network}` succeeds in creating a VM. Also, the `nova` commands to list and examine networks mostly expose that network; the one exception is `nova net private` (which is an exception even when authenticating as admin/admin).

Finally, and transitionally, (1) would imply that there is a booby trap in DevStack+Heat. Note that I could create both a new-style and an old-style autoscaling group, and can scale up and down the old-style one, and can scale down the new-style one. In none of that did I get any hint that I had done something wrong. But when I tried to scale up the new-style group, it failed. Should that really be the behavior in the nova networking environment produced by DevStack?

Revision history for this message
Mike Spreitzer (mike-spreitzer) wrote :

When authenticating as user=admin=tenant, I get inconsistent CLI behavior regarding the visibility of the one and only network created by DevStack:

ubuntu@mjs-dstk-531:~$ nova network-show private
+---------------------+--------------------------------------+
| Property            | Value                                |
+---------------------+--------------------------------------+
| bridge              | br100                                |
| bridge_interface    | eth0                                 |
| broadcast           | 10.11.63.255                         |
| cidr                | 10.11.0.0/18                         |
| cidr_v6             | -                                    |
| created_at          | 2014-06-03T23:54:29.000000           |
| deleted             | 0                                    |
| deleted_at          | -                                    |
| dhcp_start          | 10.11.0.2                            |
| dns1                | 8.8.4.4                              |
| dns2                | -                                    |
| gateway             | 10.11.0.1                            |
| gateway_v6          | -                                    |
| host                | mjs-dstk-531                         |
| id                  | a176a5fe-efb5-4ad4-bd8b-831e78d1b957 |
| injected            | False                                |
| label               | private                              |
| multi_host          | False                                |
| netmask             | 255.255.192.0                        |
| netmask_v6          | -                                    |
| priority            | -                                    |
| project_id          | -                                    |
| rxtx_base           | -                                    |
| updated_at          | 2014-06-04T00:14:23.000000           |
| vlan                | -                                    |
| vpn_private_address | -                                    |
| vpn_public_address  | -                                    |
| vpn_public_port     | -                                    |
+---------------------+--------------------------------------+
ubuntu@mjs-dstk-531:~$ nova net private
ERROR (NotFound): Network not found (HTTP 404) (Request-ID: req-97eef676-96e0-448f-8185-319b0cae7471)
ubuntu@mjs-dstk-531:~$ nova net a176a5fe-efb5-4ad4-bd8b-831e78d1b957
+----------+--------------------------------------+
| Property | Value                                |
+----------+--------------------------------------+
| cidr     | 10.11.0.0/18                         |
| id       | a176a5fe-efb5-4ad4-bd8b-831e78d1b957 |
| label    | private                              |
+----------+--------------------------------------+
ubuntu@mjs-dstk-531:~$

Note also that `nova network-associate-project` is not available in this installation:

ubuntu@mjs-dstk-531:~$ nova network-associate-project a176a5fe-efb5-4ad4-bd8b-831e78d1b957
ERROR (HttpNotImplemented): VLAN support must be enabled (HTTP 501) (Request-ID: req-1a30a0cd-6e8f-4e7a-9836-40c8f027fc17)

FYI, in the [DEFAULT] section of my /etc/nova/nova.conf I see

network_manager = no...


Revision history for this message
Mike Spreitzer (mike-spreitzer) wrote :

While using "network_manager = nova.network.manager.FlatDHCPManager", no network ever gets a non-null project_id.

A network with a null project_id is not visible to non-admin users.

For example, consider the one and only network created by DevStack. When you authenticate as admin/admin, the CLI shows you this network:

$ nova net-list
+--------------------------------------+---------+--------------+
| ID                                   | Label   | CIDR         |
+--------------------------------------+---------+--------------+
| 3f539f5c-7cf4-4106-9115-4049fed6d7f4 | private | 10.11.0.0/18 |
+--------------------------------------+---------+--------------+

When you authenticate as demo/demo, the CLI does not show you this network:

$ nova net-list
+----+-------+------+
| ID | Label | Cidr |
+----+-------+------+
+----+-------+------+

(`nova network-list` produces the same result as `nova net-list`; I wonder why they both exist and where, if anywhere, the answer is documented?)

Revision history for this message
Mike Spreitzer (mike-spreitzer) wrote :

https://bugs.launchpad.net/openstack-manuals/+bug/1152862 reports (accurately, as far as I can tell) that those nova CLI commands and the related API calls are not adequately documented (which could make "fixing" them debatable, but I hope the following is uncontroversial).

The fact that a non-admin user sees zero networks in the case of Nova networking without VLANs is clearly a bug.

The fact that scaling up an OS::Heat::AutoScalingGroup created by admin causes Nova to be invoked with a context that says "'is_admin': False" looks suspicious to me.

BTW, I tested with the asg_of_servers.yaml template (which is simpler, only one level of nested template) in https://review.openstack.org/#/c/97366/ and found the same problem there.

Revision history for this message
Mike Spreitzer (mike-spreitzer) wrote :

Here is a simple bad consequence for Heat of the Nova issue. Use the flat DHCP variety of nova networking (e.g., use DevStack's default network config). Consider the simple vm_with_cinder.yaml template of https://review.openstack.org/#/c/97366/ --- or an even simpler one that just has an OS::Nova::Server on a network. You will find that when authenticating as an administrative user you can create a stack from this template --- but when authenticating as a non-administrative user you can NOT create a stack from that template (the stack-create will succeed but the stack will go into a failed state, with a complaint on the OS::Nova::Server about being unable to find the network).
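
For concreteness, such a minimal template might look like the following. This is only a sketch: the flavor is a placeholder, and the image and network names are the ones the DevStack recipe above produces.

heat_template_version: 2013-05-23

parameters:
  key_name:
    type: string

resources:
  server:
    type: OS::Nova::Server
    properties:
      image: fedora-20.x86_64
      flavor: m1.small
      key_name: { get_param: key_name }
      networks:
        - network: private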

Revision history for this message
Steven Hardy (shardy) wrote :

> The fact that scaling up an OS::Heat::AutoScalingGroup created by admin causes Nova to be invoked with a context that says "'is_admin': False" looks suspicious to me.

This is expected, as by default the trust heat creates doesn't delegate the admin role from the stack owner to the heat service user:

trusts_delegated_roles=heat_stack_owner

This is a list, and "heat_stack_owner" is really a placeholder role; in real deployments I'd expect this to be some list of roles which enables access to the various services.

For test environments it could even be "Member"; we just need a role to delegate. When we consume the trust (during the autoscaling action), heat impersonates the stack owner but *only* with the roles in the list (not all the roles of the stack owner, as you've discovered).

You can set that to trusts_delegated_roles=[heat_stack_owner, admin], but then everyone creating a heat stack would need to be admin.
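
For reference, the relevant settings in heat.conf look something like this (option names as I understand them for this era of Heat; adjust to your deployment):

[DEFAULT]
# Defer operations such as autoscaling signals using Keystone trusts.
deferred_auth_method = trusts
# Comma-separated roles the stack owner delegates to the heat service user.
trusts_delegated_roles = heat_stack_owner
# e.g. to also delegate admin (then every stack owner must hold admin):
# trusts_delegated_roles = heat_stack_owner,admin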

Basically, heat doesn't expect stack owners to need to be admin as it's a user-facing service - that's why we don't have e.g. resources for creating keystone users/projects/domains - they'd be useless to everyone except admins and confusing because we don't yet have any mechanism to hide resources based on roles/policy.

So, the bug here appears to be that getting the network details from nova is admin-only AFAICT? Is there some other way we can get that data without needing the admin role, or can the nova default policy be changed to allow non-admin users to see the network?

Revision history for this message
Mike Spreitzer (mike-spreitzer) wrote :

Removed DevStack and Heat from the affected projects list based on Steve's affirmation that things outside Nova are working as expected.

no longer affects: devstack
no longer affects: heat
Revision history for this message
Vish Ishaya (vishvananda) wrote :

So unfortunately there isn't a single right answer for all of nova-network. Vlan mode should return:

self.db.network_get_all(context, project_only=True)

whereas flatdhcp should return

self.db.network_get_all(context, project_only="project_only")

So officially this should probably be an RPC call to nova-network. That solution isn't really backportable, so we might need a hacky check for the network manager setting in the network api.
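
A rough sketch of the kind of check Vish describes, inside nova/network/api.py's API class (illustrative only; the actual proposed patch is linked in the comments below, and CONF here is the usual oslo.config cfg.CONF object, on which nova registers the network_manager option):

    @wrap_check_policy
    def get_all(self, context):
        """Get all the networks visible to this context."""
        try:
            # Flat and FlatDHCP networks are never assigned to a project,
            # so also return project_id=NULL networks for those managers;
            # VLAN mode keeps the strict per-project filtering.
            if 'nova.network.manager.Flat' in CONF.network_manager:
                project_only = 'allow_none'
            else:
                project_only = True
            return self.db.network_get_all(context, project_only=project_only)
        except exception.NoNetworksFound:
            return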

Revision history for this message
Vish Ishaya (vishvananda) wrote :

Officially, flat DHCP should filter out the vlan networks, according to the implementation, although I might argue that the filtering should be removed:

[network for network in networks if not network.vlan]

https://github.com/openstack/nova/blob/33c1a195c8feae141ed09d8a7e2a368c0aa351c8/nova/network/manager.py#L485

I've been discussing ways of uniting the network managers, and aside from the old cloudpipe handling, vlan manager and flatdhcp manager only have two real differences:

1) flatdhcp manager filters out vlan networks when giving the network options (which seems silly)
2) vlan managers hide project_id=None networks and autoallocate them as needed.

Seems like we could deprecate vlan manager pretty easily with a new config option or two.

Revision history for this message
Mike Spreitzer (mike-spreitzer) wrote :

For flat (DHCP or not) nova networking you mean that in nova/network/api.py's API class the get_all method should call

            return self.db.network_get_all(context, project_only="allow_none")

rather than ... "project_only" ... right?

For the VLAN case, it is arguable that project_only="allow_none" is also reasonable; the difference is that it would also include networks that are "available" in the sense of "could be assigned". The positive and negative examples in https://wiki.openstack.org/wiki/APIChangeGuidelines do not fit this question very well.

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix proposed to nova (master)

Fix proposed to branch: master
Review: https://review.openstack.org/101724

Revision history for this message
Mike Spreitzer (mike-spreitzer) wrote :

@Vishy, I just drafted that hacky fix that depends on configuration. Please have a look.
I am going to abandon the unconditional fix in favor of this one.

I have tested this only with flat DHCP nova networking; I am not sure how to make DevStack give me a working setup with another kind of networking. In fact, I am not even sure how to configure the floating IP range for flat DHCP nova networking (see http://lists.openstack.org/pipermail/openstack-dev/2014-June/038303.html).

In short, this needs testing in the other configurations.

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Change abandoned on nova (master)

Change abandoned by Mike Spreitzer (<email address hidden>) on branch: master
Review: https://review.openstack.org/99068
Reason: See https://review.openstack.org/#/c/101724/ for preferred approach.

Tracy Jones (tjones-i)
Changed in nova:
importance: Undecided → High
Revision history for this message
Mike Spreitzer (mike-spreitzer) wrote :

For those who only read the last comment, note that there is a fix in review: https://review.openstack.org/#/c/101724/

Revision history for this message
Mike Spreitzer (mike-spreitzer) wrote :

Note also that there is a change in progress that adds a tempest test that exposes this bug: https://review.openstack.org/#/c/112944/

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix merged to nova (master)

Reviewed: https://review.openstack.org/101724
Committed: https://git.openstack.org/cgit/openstack/nova/commit/?id=46e88320e6e6231550f3e2b40312c51f55e059f5
Submitter: Jenkins
Branch: master

commit 46e88320e6e6231550f3e2b40312c51f55e059f5
Author: Mike Spreitzer <email address hidden>
Date: Sun Jun 22 02:09:17 2014 +0000

    Made unassigned networks visible in flat networking

    This change fixes a bug in Nova's
    GET /v2/{tenant_id}/os-networks

    The doc
    http://docs.openstack.org/api/openstack-compute/2/content/GET_os-networks-v2_ListNetworks__v2__tenant_id__os-networks_ext-os-networks.html
    says that "Lists networks that are available to the tenant".

    When invoked by a non-admin user it was returning only networks
    assigned to the user's tenant. But in flat and flat DHCP nova
    networking, networks CAN NOT be assigned to tenants --- thus a
    non-admin user would get zero networks from this operation.

    The fix was to make this operation conditionally use an option already
    present in the lower-level code to "allow_none" when fetching the list
    of networks, meaning to include networks whose project_id field in the
    DB held "none" (meaning the network is not assigned to a tenant). The
    condition under which this is done is that the Nova configuration
    option named network_manager contains the string
    "'nova.network.manager.Flat" --- which is true for flat and flat DHCP
    nova networking and false for VLAN nova networking.

    Change-Id: I64c5a3f31c912cca6b5b9987152ba7a9b3f5987d
    Closes-Bug: #1327406

Changed in nova:
status: In Progress → Fix Committed
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix proposed to nova (stable/icehouse)

Fix proposed to branch: stable/icehouse
Review: https://review.openstack.org/118748

Thierry Carrez (ttx)
Changed in nova:
milestone: none → juno-3
status: Fix Committed → Fix Released
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix merged to nova (stable/icehouse)

Reviewed: https://review.openstack.org/118748
Committed: https://git.openstack.org/cgit/openstack/nova/commit/?id=f93b8ee75ef457a2843ea6b551cd4c9dd27fbaa8
Submitter: Jenkins
Branch: stable/icehouse

commit f93b8ee75ef457a2843ea6b551cd4c9dd27fbaa8
Author: Mike Spreitzer <email address hidden>
Date: Sun Jun 22 02:09:17 2014 +0000

    Made unassigned networks visible in flat networking

    This change fixes a bug in Nova's
    GET /v2/{tenant_id}/os-networks

    The doc
    http://docs.openstack.org/api/openstack-compute/2/content/GET_os-networks-v2_ListNetworks__v2__tenant_id__os-networks_ext-os-networks.html
    says that "Lists networks that are available to the tenant".

    When invoked by a non-admin user it was returning only networks
    assigned to the user's tenant. But in flat and flat DHCP nova
    networking, networks CAN NOT be assigned to tenants --- thus a
    non-admin user would get zero networks from this operation.

    The fix was to make this operation conditionally use an option already
    present in the lower-level code to "allow_none" when fetching the list
    of networks, meaning to include networks whose project_id field in the
    DB held "none" (meaning the network is not assigned to a tenant). The
    condition under which this is done is that the Nova configuration
    option named network_manager contains the string
    "'nova.network.manager.Flat" --- which is true for flat and flat DHCP
    nova networking and false for VLAN nova networking.

    Conflicts:
            nova/network/api.py
            nova/tests/network/test_api.py

    NOTE(mriedem): The conflicts are due to the objects conversion
    in Juno. This cherry pick adds the related network_get_all unit
    tests from Juno and makes them work with the DB API in Icehouse.
    Also note the oslo.config import in test_api is removed since it
    was erroneously merged into master and fixed with commit
    0e88148907e1db5218b96e2fa54bf9fee1cba74f but rather than backport
    that and squash it with this, I've just removed it.

    Change-Id: I64c5a3f31c912cca6b5b9987152ba7a9b3f5987d
    Closes-Bug: #1327406
    (cherry picked from commit 46e88320e6e6231550f3e2b40312c51f55e059f5)

tags: added: in-stable-icehouse
Thierry Carrez (ttx)
Changed in nova:
milestone: juno-3 → 2014.2