Resource ids are not unique per tenant

Bug #1617918 reported by Jake Yip
This bug affects 2 people
Affects: Gnocchi
Status: Fix Released
Importance: Medium
Assigned to: Julien Danjou
Milestone: none listed

Bug Description

Dear Gnocchi Devs,

I've been trying out gnocchi and came to realise that resource ids are not unique per project.

Also, the resource's uuid seems to be generated from https://github.com/openstack/gnocchi/blob/stable/2.2/gnocchi/utils.py#L25-L43, which means that gnocchi resource names (e.g. original_resource_id) are in the same namespace across projects too.

This means that

1) When a user tries to create a resource with the same name as an existing resource (in another project), they get an HTTP 409. However, they can't see the resource using a `gnocchi resource list`. E.g.

  $ gnocchi resource create test
  Resource 7fd19145-920f-5b9c-be0a-2146b0c39949 already exists (HTTP 409)
  $ gnocchi resource list

  $

2) Given that there are no quotas (that I know of), it might be trivial to DoS gnocchi resources across the whole system.

(2) is not so much of a problem, but I can see (1) being kinda annoying to users when gnocchi is rolled out. Everyone likes to do a `gnocchi resource create test`.

I wonder if there is a reason behind this design decision? Please let me know if this is a known bug or a WONTFIX.
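
To make the collision in (1) concrete, here is a minimal sketch of a name-to-UUID conversion of the kind linked above. The namespace value and the function name are placeholders picked for illustration, not the ones in gnocchi/utils.py; the only point is that the project never enters the computation.

```python
import uuid

# Placeholder namespace for illustration only; not Gnocchi's real constant.
EXAMPLE_NAMESPACE = uuid.UUID("12345678-1234-5678-1234-567812345678")


def name_to_uuid(value):
    """Return value as a UUID, deriving one with uuid5 if it is a plain name."""
    try:
        # A well-formed UUID string is used as is.
        return uuid.UUID(value)
    except ValueError:
        # Anything else is hashed into the single, global namespace.
        return uuid.uuid5(EXAMPLE_NAMESPACE, value)


# Two users in different projects both run `gnocchi resource create test`.
id_from_project_a = name_to_uuid("test")
id_from_project_b = name_to_uuid("test")

# Same name, same UUID: the second create collides with the first one.
print(id_from_project_a == id_from_project_b)
```

Because the derived id is identical for both callers, the second create would conflict with a resource the caller cannot list.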

Julien Danjou (jdanjou)
Changed in gnocchi:
status: New → Triaged
importance: Undecided → Medium
Julien Danjou (jdanjou) wrote :

This is more historical than a thought-out design decision. Resources are supposed to always be created using a UUID, not any name. We later added the name-to-uuid transformation to simplify some use cases (for OpenStack). But it means a name can only be used once across a whole Gnocchi deployment. I think Amazon S3 has the same kind of limitation for bucket names, no? (just trying to justify that this may not be a problem :-)

Resource IDs are supposed to be unique across a Gnocchi deployment and I don't think it's a problem, as there should not be any collision there: it's a UUID, there's a large space for everyone.

But the name-to-uuid transformation indeed poses some kind of silliness in this regard.

I'm open to any suggestion.

Oh, and I think the lack of quota is completely orthogonal anyway. It's a missing feature, definitely (probably worth a different bug).

Jake Yip (waipengyip) wrote :

Thanks Julien for replying! Hopefully this discussion can help other gnocchi users with the same questions.

I agree that resource ids should be unique across the system. Maybe the name should not be? Just thinking along the lines of nova names, cinder names and uuid...

Maybe the name-to-uuid transformation can be better. Off the top of my head, a possible improvement would be to use the tenant uuid to generate the resource uuid. What do you think?

Of course we have to take care of existing resources, which can be handled via the upgrade script.

Re: quota, we can handle this separately. I was just making a point that dictionary collisions occur much more easily/frequently than UUID collisions.

xiaozhuangqing (xiaozhuangqing) wrote :

Hi Julien:
   Is there no name field? Can I add a name field to the gnocchi resource?
I hope I can do it ^^

Julien Danjou (jdanjou) wrote :

I thought about using the name-to-uuid transformation based on another UUID: the problem is that the project_id is not always an UUID, so that is not possible. :(

gordon chung (chungg) wrote :

name-to-uuid transform both resource and project?

hmm.. yeah. i think the issue is that `gnocchi resource create <value>` is actually taking in the identifier, which should be unique, rather than a name.

gordon chung (chungg) wrote :

although, now that i think about it, i don't think we can change name-to-uuid logic to incorporate project-id or else it completely destroys the current name-to-uuid mapping?

Jake Yip (waipengyip) wrote :

@gordon, do you think we can migrate the uuid/names to the new schema during an upgrade by running the gnocchi-upgrade script?

gordon chung (chungg) wrote :

yes, we probably could. that doesn't fix the fact that when we run 'gnocchi resource show <uuid>' it would again fail probably because uuid is now no longer unique.

gordon chung (chungg) wrote :

here's a proposal. how about we just allow the ability to create a resource and have id optional. if you don't submit one, we just generate one. this way we don't need any migrations and all data is still valid.

Jake Yip (waipengyip) wrote :

I wonder if we could make 'gnocchi resource show <name>' use the name-to-uuid mapping and 'gnocchi resource show <uuid>' use the uuid directly? I was under the impression that this is how gnocchi does things now, i.e. do a name-to-uuid conversion only if the provided param is not already a uuid.

summary: - Resource UUID are not unique per tenant
+ Resource names are not unique per tenant
gordon chung (chungg) wrote : Re: Resource names are not unique per tenant

that is already how it happens. if you pass in a name aka. non-uuid, it will run through a translation method to switch it to uuid. the problem is that translation does not account for project which causes your conflict. if we take project into account, it will require us to migrate all the previous data as that data didn't use project when it did translation. maybe we do need migration, but i'm trying to think of a solution which doesn't :)

i guess my solution doesn't let you query by name which might be required?

i'm good with either way. don't have time to work on it currently though.

Jake Yip (waipengyip) wrote :

Yes that's exactly what I meant: use a new name-to-uuid algorithm that takes into account project uuid. And also migrate data using the upgrade script at the same time.

It is not terribly urgent for us at this point in time, but might be when we get more users storing their own data in gnocchi. I will see if I can fix it when I get some time.

I hope that having a bug report will help point users in the correct direction when they encounter this problem. And please leave comments / upvote if you would like this change.

Sam Morrison (sorrison) wrote :

Where this caught me is with the resource type of swift_account. I had created a custom resource type called "project" with the idea of storing some metrics against openstack projects. This however failed as the swift_account resource type already had resources with the same project IDs (swift_account is really a project, it is poorly named).

My fix was to change the gnocchi_resources.yaml file in ceilometer to store the swift metrics under my "project" resource type as opposed to the "swift_account" resource type

Julien Danjou (jdanjou)
summary: - Resource names are not unique per tenant
+ Resource ids are not unique per tenant
gordon chung (chungg) wrote :

i think we should do what Jake suggests... any other options?

gordon chung (chungg) wrote :

man this sucks. i attempted to implement this and i realise it looks very strange if we implement project-scoped ids using the current mechanism we have. specifically, with the rest interface, if i ask for /v1/resources/blah/metrics i will get something completely different for the same url depending on what project i'm authenticated with. the alternative is to create a true id column and treat everything we've done so far as a 'name' but this will completely change paths and make the to-uuid logic moot.

Julien Danjou (jdanjou) wrote :

Current thoughts:

* Merge created_by_user/project into one field named "owner" which can also contain domains (e.g. domain:user:project), so we don't care about authentication type anymore
* Each auth_type can map its own headers to the "owner" field for ACL
* Remove user_id,project_id from generic – each auth type module can map this as it wants
* Make (resource_id,owner) unique

Changed in gnocchi:
assignee: nobody → Julien Danjou (jdanjou)
gordon chung (chungg) wrote :

i don't really understand the first 3 bullets...

so if i request /v1/resource/generic/<id>, what will i get back? whatever project i'm part of? how do i get the details for the same id but in another project? i need to pass in a project param and/or re-authenticate?

Julien Danjou (jdanjou) wrote :

If you request /v1/resource/generic/<id> you will get the resource for that id for the project you are authed for.

When you do a request a token is always scoped to user/project/domain with Keystone. So if you want to do the same request for a different project you need a different token.

gordon chung (chungg) wrote :

does that mean role:admin doesn't actually mean much anymore? i believe you previously could see everything? if you have admin now, no matter what you're scoped to a project?

Julien Danjou (jdanjou) wrote :

So each module of auth (keystone, noauth) will be able to implement its own scheme in this regard.

For Keystone, I think that the listing will be able to list everything, that's no problem. For /v1/resource/generic/<id> we will probably need the admin to add a different header than X-Project-Id, such as X-For-Project-Id :)

Sam Morrison (sorrison) wrote :

This is really just an issue when getting data from ceilometer, as gnocchi is using IDs from another system as its own unique ID.

Wouldn't the correct solution be to use a unique uuid for a resource that is generated by gnocchi and storing eg. in the case of a nova instance the uuid as an attribute of that?

gordon chung (chungg) wrote :

@jd, it just seems a little weird that /v1/resource/generic/<id> is a different thing depending on what credentials i've authed with. seems like a redirect that normally a client would hide.

@Sam, i think that's the cleaner solution but it will break all existing queries (against REST at least). not necessarily end of world but it is what it is. it will also probably require the most changes.

Julien Danjou (jdanjou) wrote :

The problem with Sam's solution is that you cannot know the URL of a resource in advance without knowing its random id. Which kinda kills the workflow of "gnocchi resource show <instance-id>" inside OpenStack.

gordon chung (chungg) wrote :

right, i wonder how nova works. it allows you to pass in a name or id when running 'nova show'. my preference is to keep the url plain and let the client handle all the redirect / id-masking.

gordon chung (chungg) wrote :

i should mention i don't have any code so if you have code for your path i don't want to stop you (not that i can :P)

Julien Danjou (jdanjou) wrote :

I dug into the Nova client source code, and what it does is a detailed listing/search to get the right resource, trying to guess whether the <instance-id> is an id or a human name (because it also works with the display_name of an instance).

Julien Danjou (jdanjou) wrote :

I've started working on that, and I've almost finished a PoC of a fix. It changes the indexer so the resources in the database no longer use (id) as primary key, but (id, creator). This works fine, except for the one case, introduced by Ceilometer, where you want to access a resource that you did not create, e.g.:

GET /v1/resource/instance/<someuuid>

If the resource <someuuid> has been created by Ceilometer but your project_id is in it, the current policy allows you to get the resource and its information. With the new scheme, this is not possible anymore because <someuuid> can exist 10 times in the database (e.g. 1 created by Ceilometer and 9 others created by random people). So it'd be very costly to check which of the 10 you have access to, and if 2 are accessible, that's a problem as the API was not designed to return more than one resource on that URL.

The solution to that problem is either:
1. Break that use case, so there's no sharing of resources/data anymore between tenants using Gnocchi. So in the case of Ceilometer, no user will be able to access data pushed by it anymore.
2. Add a header called X-Created-By which, if provided when requesting "GET /v1/resource/instance/<someuuid>", would specify which resource is being accessed. In the Ceilometer use case, that'd be set to <ceilometer_user_id>:<ceilometer_project_id> and that'll retrieve the right resource.
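
The ambiguity that an (id, creator) primary key introduces can be sketched with an in-memory SQLite table. The table name, column names and sample values below are made up for illustration and do not reflect the real indexer schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical table: the primary key is the pair (id, creator).
conn.execute(
    "CREATE TABLE resource ("
    " id TEXT NOT NULL,"
    " creator TEXT NOT NULL,"
    " project_id TEXT,"
    " PRIMARY KEY (id, creator))"
)

some_uuid = "11111111-2222-3333-4444-555555555555"
rows = [
    (some_uuid, "ceilometer:service", "project1"),
    (some_uuid, "alice:project1", "project1"),
    (some_uuid, "bob:project2", "project2"),
]
# The same id can now be stored once per creator.
conn.executemany("INSERT INTO resource VALUES (?, ?, ?)", rows)

# A member of project1 asks for the resource by id alone.
matches = conn.execute(
    "SELECT creator FROM resource"
    " WHERE id = ? AND project_id = ? ORDER BY creator",
    (some_uuid, "project1"),
).fetchall()

# More than one row is visible, but the URL is meant to return one resource.
print(matches)
```

A creator hint such as the X-Created-By header from option 2 would narrow the query back down to a single row.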

Another totally different solution that I just thought about is to keep the current resource id as the primary key, so id is still unique. However, since this bug is mainly about being able to do "gnocchi resource create foobar" with foobar meaning 2 different ids for different users, we could change the UUID5 generation mechanism from:

  uuid.uuid5(RESOURCE_ID_NAMESPACE, resource_id_as_string_or_whatever)

to

  uuid.uuid5(RESOURCE_ID_NAMESPACE, resource_id_as_string_or_whatever + ":" + creator)

which would solve that problem quite easily. There *might* be some breakage for people having hardcoded those generated IDs somewhere, but if they're not hardcoded, it should be pretty easy to make work and to upgrade to this new scheme.
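
As a rough, runnable illustration of the two formulas above (the namespace value and the function names are placeholders, not Gnocchi's actual code):

```python
import uuid

# Placeholder namespace for illustration only; not Gnocchi's real constant.
RESOURCE_ID_NAMESPACE = uuid.UUID("12345678-1234-5678-1234-567812345678")


def old_scheme(name):
    """Current behaviour: the id depends on the name only."""
    return uuid.uuid5(RESOURCE_ID_NAMESPACE, name)


def new_scheme(name, creator):
    """Proposed behaviour: the creator is mixed into the hashed string."""
    return uuid.uuid5(RESOURCE_ID_NAMESPACE, name + ":" + creator)


jd_id = new_scheme("foobar", "jd")
gordc_id = new_scheme("foobar", "gordc")

# Each creator gets a distinct id for "foobar".
print(jd_id != gordc_id)
# Ids generated under the old scheme no longer match: the possible breakage.
print(old_scheme("foobar") != jd_id)
```

The id stays a deterministic function of its inputs, so anyone who knows both the name and the creator can recompute it without a lookup.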

gordon chung (chungg) wrote :

the last scenario will still make GET /v1/resources/uuid return different resources depending auth headers right? i think that's kind of weird to be honest.

my use case is using some 'admin' id, i want to be able to work against all metrics of a project regardless of who/what pushed it into Gnocchi. it seems like that could be handled by first solution?

Julien Danjou (jdanjou) wrote :

"the last scenario will still make GET /v1/resources/uuid return different resources depending auth headers right? i think that's kind of weird to be honest."

No, because the ID is a primary key, so all resources have a unique id.

The thing is that if you do:
$ gnocchi resource create foobar
as user "jd", the resource id will be computed based on (foobar, jd) and let's say abc123.

If you, gordc, create a resource with:
$ gnocchi resource create foobar
your resource id will be (foobar, gordc) and it'll be let's say def456.

If I want to access your resource because e.g. you set a project_id to something that we both share <3 I can do:

$ gnocchi resource show foobar --created-by gordc
and in this case the client will compute the UUID with (foobar, gordc), which will show me the resource you created.

So all the ID are computed client side so that works.

The only case where what you describe happens is if I use the UUID computing on the server side, e.g. GET /v1/resource/instance/foobar, i.e. using "foobar" rather than the computed id. In this case the server will compute the uuid to get in the indexer using (foobar, jd).

Is this clearer?

gordon chung (chungg) wrote :

sure. as long as it's not on server side i think we're good :)

gordon chung (chungg) wrote :

oh, one question. if it's server side, what if i actually put:

gnocchi resource show foobar vs gnocchi resource show <actual primary key id>?

Julien Danjou (jdanjou) wrote :

Well gnocchiclient will compute uuid5(foobar, gordc) and that'll return the UUID primary key.

And if you go on HTTP and ask /v1/resource/generic/foobar as X-User-Id gordc, the server will do the same computing.

and /v1/resource/generic/<uuid of foobar, gordc> will also return the same thing since it's the pkey.

Clearer? :)

gordon chung (chungg) wrote :

sure. but how does it know that foobar in /v1/resource/generic/foobar is something that needs to be translated or is the actual primary key?

i was hoping that via REST foobar in /v1/resource/generic/foobar is always primary uuid

if you use client gnocchi resource show foobar, it will assume uuid first and if nothing, try uuid5(foobar, project-id) also, do we want it project scoped or user scoped? if we make it user scoped doesn't that mean i can't get everything from my project even if i'm admin?

Julien Danjou (jdanjou) wrote :

> "sure. but how does it know that foobar in /v1/resource/generic/foobar is something that needs to be translated or is the actual primary key?"

This is already in the API. It checks if foobar is an uuid or not. If it's not it converts it.

The UUID translation is not user nor project scoped, it's "creator" scoped (see the recent patch branch). Which in the Keystone case is user_id:project_id.
So if you don't know user/project (but you can know it by listing the resources) you can't guess the uuid of a named resource, sure.

> "if you use client gnocchi resource show foobar, it will assume uuid first and if nothing, try uuid5(foobar, project-id) also, do we want it project scoped or user scoped? if we make it user scoped doesn't that mean i can't get everything from my project even if i'm admin?"

"foobar" is not an uuid so it will never try "foobar". Even if it does, the server will translate it to an uuid5(foobar, creator).

Same answer for the "scope", it's creator (which is user:project).
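
The resolution rule described in this comment can be sketched as follows. The namespace value, the function name and the exact way the creator string is joined to the name are assumptions made for illustration; only the "UUID as is, otherwise uuid5 scoped to user_id:project_id" logic comes from the discussion.

```python
import uuid

# Placeholder namespace for illustration only; not Gnocchi's real constant.
RESOURCE_ID_NAMESPACE = uuid.UUID("12345678-1234-5678-1234-567812345678")


def resolve_resource_id(value, user_id, project_id):
    """Return the primary-key UUID for a value taken from the URL."""
    try:
        # A well-formed UUID is the primary key, whoever is asking.
        return uuid.UUID(value)
    except ValueError:
        # Anything else is a name, scoped to the caller as creator.
        creator = user_id + ":" + project_id
        return uuid.uuid5(RESOURCE_ID_NAMESPACE, value + ":" + creator)


gordc_id = resolve_resource_id("foobar", "gordc", "project1")
jd_id = resolve_resource_id("foobar", "jd", "project1")

# The same name resolves to different resources for different creators...
print(gordc_id != jd_id)
# ...while an explicit UUID resolves to the same resource for everyone.
print(resolve_resource_id(str(gordc_id), "jd", "project1") == gordc_id)
```

Whether a caller is then allowed to read the resolved resource is a separate policy question that this function does not address.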

gordon chung (chungg) wrote :

"""The UUID translation is not user nor project scoped, it's "creator" scoped (see the recent patch branch). Which in the Keystone case is user_id:project_id.
So if you don't know user/project (but you can know it by listing the resources) you can't guess the uuid of a named resource, sure."""

iirc, the original reason we had this translation was because we had some cases where our 'resource_id' had a '/' in it which broke api. the more we expand that, it seems like the translations are very hidden unless you know about it... even now i'm not entirely sure what i'd get if i use /v1/resource/generic/foobar. i feel like it should be simple from REST pov (if you ask for v1/resource/generic/<blah> you should get whatever resource has primary key <blah>).

""""foobar" is not an uuid so it will never try "foobar". Even if it does, the server will translate it to an uuid5(foobar, creator).

Same answer for the "scope", it's creator (which is user:project)."""

so the scope is what my original concern was. this seems to break (my) use case of i am admin user and i want to grab everything from my project.

Julien Danjou (jdanjou) wrote :

"""iirc, the original reason we had this translation was because we had some cases where our 'resource_id' had a '/' in it which broke api. the more we expand that, it seems like the translations are very hidden unless you know about it..."""

Resource IDs are and always have been UUID. This won't change.
In Ceilometer we realized we had resource IDs which were not UUIDs (and some with / in them) so we added a translation layer str->uuid a while back, based on uuid5(id-that-is-not-an-uuid-and-that-we-call-original-resource-id).

"""even now i'm not entirely sure what i'd get if i use /v1/resource/generic/foobar. i feel like it should be simple from REST pov (if you ask for v1/resource/generic/<blah> you should get whatever resource has primary key <blah>)."""

If you use /v1/resource/generic/<blah>, you'll get <blah> (after the server or client translates it to a uuid5 _if_ <blah> is not a UUID).

The problem this bug is about is that when <blah> is a string "blah" it's converted to a uuid5("blah") and that is always the same, whoever you are. That's cool, but you cannot have several resources with "blah" as "original-resource-id". Which is what the bug is about.

(or you think we should not fix the bug? :))

"""so the scope is what my original concern was. this seems to break (my) use case of i am admin user and i want to grab everything from my project."""

It actually does not. With Keystone auth, if you're admin, you can list all resources:

GET /v1/resources/instance

You'll get *all* instances whatever the "creator" field is, since you're admin. That will return the data with all details about resources. If you want to access any resource directly, let's say one that has original_resource_id set to "foobar", well, if it's a "foobar" from jd its ID is currently uuid5(foobar). Which means that there's only one uuid5("foobar") in Gnocchi. But you can access it (you have both resource.id and resource.original_id with that info) since you're admin.

The proposal here is to make "foobar" translate to uuid5("foobar", "jd") rather than just uuid5("foobar"). Then everything still works as just described, but this time you can have more than one resource whose original_resource_id is "foobar". You can have one whose creator is sileht and its resource_id will be uuid5("foobar", sileht) which is a computed id, but is still unique. And original_resource_id will still be "foobar".

gordon chung (chungg) wrote :

"""
If you use /v1/resource/generic/<blah>, you'll get <blah> (after the server or client translates it to a uuid5 _if_ <blah> is not a UUID).

The problem this bug is about is that when <blah> is a string "blah" it's converted to a uuid5("blah") and that is always the same, whoever you are. That's cool, but you cannot have several resources with "blah" as "original-resource-id". Which is what the bug is about.
"""

so is this still the same assumption then? if you provide a uuid, then it automatically becomes primary resource_id and therefore no 2 projects may pass in same uuid?

Julien Danjou (jdanjou) wrote :

No… I hope my patch will clear the misunderstanding. :)

OpenStack Infra (hudson-openstack) wrote : Fix proposed to gnocchi (master)

Fix proposed to branch: master
Review: https://review.openstack.org/413017

Changed in gnocchi:
status: Triaged → In Progress
Jake Yip (waipengyip) wrote :

Hi Julien,

Thanks for the patch! I just reviewed it and have some thoughts about it.

> def ResourceUUID(value, creator):

Do you think it might be better to compute ResourceUUID(value, project_id) instead?

Most of OpenStack operates on the premise that resources belong to a project, not a user. A user can access the resources created by other users in the same project. E.g. a user can do a `nova show <instance_name>` on an instance someone else created.

So, I think in gnocchi case it will be good to have the same behaviour.

Julien Danjou (jdanjou) wrote :

Hi Jake,

The user/project is really tied to the Keystone approach, and with Gnocchi trying to be generic, it's not a good idea to bake those notions deeply into it.

That being said, it is not a problem to use value+creator as the key here. This is just about encoding the resource id, so it is unique for a creator (user+project in Keystone case). This does not imply who gets access to the resource: the Keystone auth mode still allows access to resources that belong to a project_id that the user is authenticated with.

So that means you won't be able to do directly "gnocchi resource show foobar" as user2/project1 if it has been created by user1/project1 because that will be a different UUID. But you will be able to do "gnocchi resource show <uuid-of-foobar-created-by-user1/project1>" as the Keystone auth mode will allow you. It's just a matter of knowing the UUID, which is easily possible by listing or searching the resources.

The trade-off of fixing this bug is that: everyone can create a "foobar" resource (it's being translated into an UUID) but it's gonna be unique per creator (which is user+project in Gnocchi).

Jake Yip (waipengyip) wrote :

Hi Julien,

> So that means you won't be able to do directly "gnocchi resource show foobar" as user2/project1 if it has been created by user1/project1 because that will be a different UUID. But you will be able to do "gnocchi resource show <uuid-of-foobar-created-by-user1/project1>" as the Keystone auth mode will allow you.

Yes I understand this. I thought it might be a good idea to have similar behaviour to nova, where you can do a `nova show foobar` as any user in the project. Although nova does it another way, not by translating a name -> uuid.

But you might have a point about it being generic. I'm fine either way, as long as it fixes the original bug.

By the way, do you have any experience on use cases that are doing create/show resources using strings instead of uuids? Perhaps with use cases we might figure which is better.

Julien Danjou (jdanjou) wrote :

Nova does that "show instance" by doing a *search*. Which can be very slow, especially in the case of Gnocchi where it has a lot more resources in its index. Here the name->uuid transformation is O(1), compared to O(log n) in Nova (at best if indexed, or O(n) otherwise).

But in the end you can still implement a "gnocchi search-and-show resource foobar" that works the same way Nova does.

On the use cases, a lot of the non-OpenStack users have systems to monitor where resources have names (e.g. hostnames) but no UUID at all. So computing those UUIDs without having to do a search is a big win. :)
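
The difference between the two lookup styles can be sketched with a toy in-memory index. Everything below (the namespace value, the dict standing in for the indexer, the function names) is hypothetical and only mirrors the reasoning above.

```python
import uuid

# Placeholder namespace for illustration only; not Gnocchi's real constant.
RESOURCE_ID_NAMESPACE = uuid.UUID("12345678-1234-5678-1234-567812345678")


def computed_id(name, creator):
    return uuid.uuid5(RESOURCE_ID_NAMESPACE, name + ":" + creator)


# Toy index keyed by primary key, holding two resources named "foobar".
index = {}
for creator in ("user1:project1", "user2:project1"):
    rid = computed_id("foobar", creator)
    index[rid] = {
        "id": rid,
        "original_resource_id": "foobar",
        "creator": creator,
    }


def show(name, creator):
    """Direct lookup: compute the id, then fetch by primary key."""
    return index.get(computed_id(name, creator))


def search_and_show(name):
    """Nova-style lookup: scan every resource for a matching name."""
    return [r for r in index.values() if r["original_resource_id"] == name]


# One key lookup, exactly one result for a known creator.
print(show("foobar", "user1:project1")["creator"])
# A full scan, possibly returning several candidates to choose from.
print(len(search_and_show("foobar")))
```

The direct lookup never touches the other entries, whereas the search has to visit every resource (or rely on an index on the name) and may still need to disambiguate the results.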

OpenStack Infra (hudson-openstack) wrote : Fix merged to gnocchi (master)

Reviewed: https://review.openstack.org/413017
Committed: https://git.openstack.org/cgit/openstack/gnocchi/commit/?id=ad4b851c7fbf4b914d1c2b6c55957d4315423326
Submitter: Jenkins
Branch: master

commit ad4b851c7fbf4b914d1c2b6c55957d4315423326
Author: Julien Danjou <email address hidden>
Date: Mon Dec 19 12:18:09 2016 +0100

    rest: string → UUID conversion for resource.id to be unique per user

    This changes the UUID5 based mechanism so it depends on the user trying
    to CRUD the resource. This makes sure that when using this kind of
    transformation, the resource id is converted to a unique id for the
    user, while preventing conflicting if every user wants to create a
    "foobar" resource.

    Change-Id: Iebaf3b9f8e0a198af0156008710e0c1253dc5f9d
    Closes-Bug: #1617918

Changed in gnocchi:
status: In Progress → Fix Released
OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/gnocchi 3.1.0

This issue was fixed in the openstack/gnocchi 3.1.0 release.
