DoS: API_RESULT_LIMIT does not work for Swift objects

Bug #1724598 reported by Yves-Gwenael Bourhis
This bug affects 3 people
Affects                        Status       Importance  Assigned to  Milestone
OpenStack Dashboard (Horizon)  In Progress  Undecided   Unassigned
OpenStack Security Advisory    Won't Fix    Undecided   Unassigned

Bug Description

A user can make the Horizon Apache process crash.

Indeed, API_RESULT_LIMIT does not work when `full_listing=True` is passed as an argument to swiftclient.client.Connection.get_account or to swiftclient.client.Connection.get_container.

Therefore, when a customer has a very large number of objects, the full production server crashes and stops responding.

To reproduce: slowly upload a million small objects into one container, then view this container: the server crashes.
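
For illustration, a minimal sketch of the swiftclient behaviour behind this (the endpoint, credentials, and container name are hypothetical): with full_listing=True, python-swiftclient keeps re-requesting with a marker until the container is exhausted, so the limit caps each page but not the total.

    # Minimal sketch; endpoint, credentials and container name are hypothetical.
    from swiftclient.client import Connection

    conn = Connection(
        authurl="http://swift.example.com/auth/v1.0",  # hypothetical auth URL
        user="demo:demo",
        key="secret",
    )

    # full_listing=False honors the limit: one request, at most 1000 objects.
    _, first_page = conn.get_container("bigcontainer", limit=1000,
                                       full_listing=False)

    # full_listing=True pages through the whole container regardless of the
    # limit, buffering every object record in memory; with a million objects
    # this is what exhausts the Horizon worker process.
    _, everything = conn.get_container("bigcontainer", limit=1000,
                                       full_listing=True)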

Tags: security
Revision history for this message
Yves-Gwenael Bourhis (yves-gwenael-bourhis) wrote :
Revision history for this message
Rob Cresswell (robcresswell-deactivatedaccount) wrote :

Is it intended that the result limit doesn't work when full_listing is passed? It's not entirely clear to me why there are two different parameters controlling API result quantity. It seems like a valid bug, however.

Revision history for this message
Yves-Gwenael Bourhis (yves-gwenael-bourhis) wrote :

NOTE: all active branches are affected.

Revision history for this message
Jeremy Stanley (fungi) wrote :

Since this report concerns a possible security risk, an incomplete security advisory task has been added while the core security reviewers for the affected project or projects confirm the bug and discuss the scope of any vulnerability along with potential solutions.

Changed in ossa:
status: New → Incomplete
description: updated
Revision history for this message
Jeremy Stanley (fungi) wrote :

So, just to clarify the claimed exploit scenario: an authenticated normal user can upload an excessive number of objects into a Swift container, then request a full listing of that container through Horizon and effect a denial of service against Horizon itself even if the administrator of the deployment has intentionally set API_RESULT_LIMIT to mitigate such attacks?

Revision history for this message
Yves-Gwenael Bourhis (yves-gwenael-bourhis) wrote :

@Jeremy Stanley Yes, exactly this scenario.

Even if the administrator of the deployment has intentionally set API_RESULT_LIMIT to mitigate such attacks, the limit does not work: an authenticated normal user can upload an excessive number of objects into a Swift container and then crash the Horizon server by viewing that container.

Revision history for this message
Yves-Gwenael Bourhis (yves-gwenael-bourhis) wrote :

More info: here is the Apache log from when the crash occurred:

    [Mon Oct 16 15:41:28.695638 2017] [core:error] [pid 21470:tid 139843967366912] [client 10.12.0.39:46851] End of script output before headers: horizon.wsgi, referer: https://<Sorry_if_I_hide_the_affected_domain>/project/containers/container/<Sorry_if_I_hide_the_customer_container_name>
    [Mon Oct 16 15:41:55.935525 2017] [:error] [pid 24160:tid 139844023068416] [remote 10.12.0.39:28931] mod_wsgi (pid=24160): Exception occurred processing WSGI script '/opt/console-cw/lib/python2.7/site-packages/openstack_dashboard/wsgi/horizon.wsgi'.
    [Mon Oct 16 15:41:55.937407 2017] [:error] [pid 24160:tid 139844023068416] [remote 10.12.0.39:28931] IOError: failed to write data

Revision history for this message
Yves-Gwenael Bourhis (yves-gwenael-bourhis) wrote :

The crash led to a DoS and was triggered by a legitimate customer who uploaded a million objects into a container.

We have multiple load-balanced Horizon instances, but when the crash occurred, the customer hit "F5" and crashed all load-balanced nodes... and there was no more service (DoS).

The provided patch solved the issue.
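
The attached patch is not reproduced in this thread; as a hedged illustration of the kind of change being discussed (the helper name and constant below are stand-ins, not Horizon's exact code, which lives in openstack_dashboard/api/swift.py), the core idea is to stop requesting a full listing so the configured limit takes effect:

    # Sketch under assumptions: `conn` is a swiftclient.client.Connection;
    # the helper name and delimiter constant are hypothetical stand-ins.
    FOLDER_DELIMITER = "/"  # Horizon treats "/" as a pseudo-folder delimiter

    def get_objects_page(conn, container_name, limit, marker=None, prefix=None):
        """Fetch a single page of objects so a configured limit takes effect."""
        _, objects = conn.get_container(
            container_name, limit=limit + 1, marker=marker, prefix=prefix,
            delimiter=FOLDER_DELIMITER,
            full_listing=False)  # was True, which defeated the limit
        # Requesting limit + 1 lets the caller detect whether more pages exist.
        return objects[:limit], len(objects) > limit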

Revision history for this message
Yves-Gwenael Bourhis (yves-gwenael-bourhis) wrote :

We deploy Horizon from source "almost" this way (updated a bit since):
https://dev.cloudwatt.com/en/blog/deploy-horizon-from-source-with-apache-and-ssl.html

So it's not a distro packaging issue; however, packaged versions are theoretically impacted too.

Revision history for this message
Jeremy Stanley (fungi) wrote :

Thanks for the updated details. We're still waiting for Horizon security reviewers to confirm their bugtask, evaluate the provided patch in comment #1 or suggest an alternate fix, and determine which stable branches will require corresponding backports.

Revision history for this message
Rob Cresswell (robcresswell-deactivatedaccount) wrote :

The fix is valid; it seems like some overly ambitious caching on Horizon's side. I think the patch will cause a behaviour regression without further work, however. Since security is more important, I think the only way forward is to fix it and add a release note documenting this, unless someone volunteers their time for further investigation.

Revision history for this message
Jeremy Stanley (fungi) wrote :

Does the issue apply to all currently supported stable branches, and is the proposed patch backportable to them (what sort of regression are you anticipating)?

Revision history for this message
Rob Cresswell (robcresswell-deactivatedaccount) wrote :

It's valid for every active branch. My concern is that the Swift interface was written expecting all containers to be available for the client to page through; so without the full_listing parameter, it will only return the first 10,000 and stop there.
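
(For context: Swift itself caps a single listing request at 10,000 entries by default, which is where that figure comes from.) A hedged sketch of the marker-based paging the Swift API supports, which lets a client walk an arbitrarily large container without full_listing=True; the function name is illustrative:

    def iter_container_objects(conn, container, page_size=1000):
        """Yield objects one page at a time instead of buffering the listing.

        Assumes `conn` is a swiftclient.client.Connection.
        """
        marker = None
        while True:
            _, page = conn.get_container(container, marker=marker,
                                         limit=page_size, full_listing=False)
            if not page:
                return
            for obj in page:
                yield obj
            marker = page[-1]["name"]  # resume after the last object returned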

Revision history for this message
Tristan Cacqueray (tristan-cacqueray) wrote :

This seems like an acceptable regression; a user with that many containers shouldn't be using a web interface to manage them individually anyway, right?

Yves-Gwenael, would you mind attaching a properly formatted patch as explained here: https://security.openstack.org/#how-to-propose-and-review-a-security-patch .

Revision history for this message
Yves-Gwenael Bourhis (yves-gwenael-bourhis) wrote :

Hello Tristan,

Here is the patch.

Revision history for this message
Tristan Cacqueray (tristan-cacqueray) wrote :

Thanks Yves-Gwenael,

@horizon-coresec, so is this an acceptable regression that could be backported down to stable/ocata?

Revision history for this message
Jeremy Stanley (fungi) wrote :

In keeping with recent OpenStack vulnerability management policy changes, no report should remain under private embargo for more than 90 days. Because this report predates the change in policy, the deadline for public disclosure is being set to 90 days from today. If the report is not resolved within the next 90 days, it will revert to our public workflow as of 2020-05-27. Please see http://lists.openstack.org/pipermail/openstack-discuss/2020-February/012721.html for further details.

description: updated
Revision history for this message
Jeremy Stanley (fungi) wrote :

It doesn't look like this report has seen any activity since my update two months ago, so consider this a friendly reminder:

The embargo for this report is due to expire one month from today, on May 27, and will be switched public on or shortly after that day if it is not already resolved sooner.

Thanks!

Jeremy Stanley (fungi)
description: updated
Revision history for this message
Jeremy Stanley (fungi) wrote :

The embargo for this report has expired and is now lifted, so it's acceptable to discuss further in public.

description: updated
information type: Private Security → Public Security
Revision history for this message
Gage Hugo (gagehugo) wrote :

Was the patch from comment #15 an acceptable fix? If so, I can push it to Gerrit for review.

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix proposed to horizon (master)

Fix proposed to branch: master
Review: https://review.opendev.org/c/openstack/horizon/+/790504

Changed in horizon:
status: New → In Progress
Revision history for this message
Vishal Manchanda (vishalmanchanda) wrote :

Hi, I am unable to reproduce this bug in my devstack env, but I have proposed
a patch [1] as @Yves-Gwenael Bourhis suggested in comment 15.
Let's discuss it in more detail while reviewing it.

[1] https://review.opendev.org/c/openstack/horizon/+/790504

Revision history for this message
Jeremy Stanley (fungi) wrote :

The lack of priority on this over the past 6 years seems to indicate it's not a severe enough risk to warrant a widely published advisory even if a fix ever does merge. The VMT and other OpenStack Security SIG members agreed during the 2023.1 cycle that this should be considered class B2 per our report taxonomy: https://security.openstack.org/vmt-process.html#report-taxonomy

Changed in ossa:
status: Incomplete → Won't Fix
information type: Public Security → Public
tags: added: security