OpenStack Compute (nova)

NUMATopologyFilter doesn't account for CPU/RAM overcommit

Bug #1484742 reported by Chris Friesen on 2015-08-14

This bug affects 1 person

Affects	Status	Importance	Assigned to	Milestone
OpenStack Compute (nova)	Invalid	Undecided	Chris Friesen

Bug Description

There seems to be a bug in the NUMATopologyFilter where it doesn't properly account for cpu_allocation_ratio or ram_allocation_ratio. (Detected on stable/kilo, not sure if it applies to current master.)

To reproduce:

1) Create a flavor with a moderate number of CPUs (5, for example) and enable hugepages by setting "hw:mem_page_size=2048" in the flavor extra specs. Do not specify dedicated CPUs on the flavor.

2) Ensure that the available compute nodes have fewer CPUs free than the number of CPUs in the flavor above.

3) Ensure that the "cpu_allocation_ratio" is big enough that "num_free_cpus * cpu_allocation_ratio" is more than the number of CPUs in the flavor above.

4) Enable the NUMATopologyFilter for the nova filter scheduler.

5) Try to boot an instance with the specified flavor.

This should pass, because we're not using dedicated CPUs and so the "cpu_allocation_ratio" should apply. However, the NUMATopologyFilter returns 0 hosts.

It seems like the NUMATopologyFilter is failing to properly account for the cpu_allocation_ratio when checking whether an instance can fit onto a given host.
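
The expectation behind the steps above can be written down as a small check. This is a minimal sketch of the reporter's arithmetic only, not nova's scheduler code; the function name `fits_with_cpu_overcommit` and the sample numbers (4 free CPUs, a ratio of 16.0) are assumptions made for illustration.

```python
def fits_with_cpu_overcommit(num_free_cpus, cpu_allocation_ratio, flavor_vcpus):
    """Return True if the flavor fits once overcommit is applied.

    Hypothetical helper mirroring the expression from step 3:
    num_free_cpus * cpu_allocation_ratio must be more than the
    number of CPUs in the flavor.
    """
    return num_free_cpus * cpu_allocation_ratio > flavor_vcpus


# Step 1: a flavor with 5 CPUs (no dedicated CPUs requested).
flavor_vcpus = 5

# Step 2: the host has fewer free CPUs than the flavor asks for (4 < 5).
# Step 3: the ratio makes up the difference (4 * 16.0 = 64.0 > 5).
print(fits_with_cpu_overcommit(4, 16.0, flavor_vcpus))  # True

# Without overcommit the same host would be rejected (4 * 1.0 = 4.0).
print(fits_with_cpu_overcommit(4, 1.0, flavor_vcpus))   # False
```

Under this reading the host should be accepted in the first case, which is why a result of 0 hosts from the filter looked like a bug.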

Chris Friesen (cbf123) on 2015-08-14

description:	updated

Chris Friesen (cbf123) on 2015-08-14

summary:	- NUMATopologyFilter doesn't account for cpu_allocation_ratio
	+ NUMATopologyFilter doesn't account for CPU/RAM overcommit
description:	updated

OpenStack Infra (hudson-openstack) wrote on 2015-08-14: Fix proposed to nova (master)

#1

Fix proposed to branch: master
Review: https://review.openstack.org/213268

Changed in nova:
assignee:	nobody → Chris Friesen (cbf123)
status:	New → In Progress

Nikola Đipanov (ndipanov) wrote on 2015-08-15:

#2

As commented on the code review:

The idea was that it's OK to have overcommit, but an instance larger than a NUMA node should _never_ land on that NUMA node, as it would effectively be overcommitting against itself.

This is not how overcommit on host level works - but it should probably get fixed there as it is questionable whether overcommitting an instance against itself makes sense. So maybe we want to have a new bug for that and close this one?

If you are seeing the opposite, that the instance is not larger than the whole NUMA node itself but still won't get considered for CPU overcommit with non-pinned NUMA requested, then that's a different bug, your patch won't fix it, and we should investigate more.
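
The rule described in this comment can be modelled roughly as follows. This is a simplified sketch, not the filter's actual code: the function names, the single-NUMA-node host, and the sample numbers are assumptions, and only CPUs are modelled (RAM would follow the same pattern).

```python
def host_level_fits(total_cpus, used_cpus, ratio, instance_cpus):
    """Host-level style check (hypothetical): only the overcommitted
    capacity is compared against the request."""
    return used_cpus + instance_cpus <= total_cpus * ratio


def numa_cell_fits(cell_cpus, cell_used_cpus, ratio, instance_cpus):
    """Per-NUMA-node style check (hypothetical) with the extra rule
    from the comment above."""
    # An instance larger than the node itself never fits, whatever the
    # ratio, because it would be overcommitting against itself.
    if instance_cpus > cell_cpus:
        return False
    # Otherwise overcommit applies as usual.
    return cell_used_cpus + instance_cpus <= cell_cpus * ratio


# Idle host with one NUMA node of 4 logical CPUs and a ratio of 16.0.
print(host_level_fits(4, 0, 16.0, 5))  # True: 5 <= 64.0
print(numa_cell_fits(4, 0, 16.0, 5))   # False: 5 > 4 CPUs in the node

# An instance no larger than the node can still be overcommitted.
print(numa_cell_fits(4, 2, 16.0, 4))   # True: 2 + 4 <= 64.0
```

With these numbers the 5-CPU flavor passes the host-level check but is rejected per node, which is the difference in behaviour this comment points out.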

Chris Friesen (cbf123) wrote on 2015-08-17:

#3

I've been testing the case where a single instance is larger than the number of host logical CPUs, so that would fit with your explanation. I can see why one might choose to implement that, though as you say it's not great to have different overcommit behaviour depending on whether or not the NUMA filter is involved.

I may do as you suggest and open up a separate bug specifically addressing the behaviour difference.

Chris Friesen (cbf123) on 2015-08-17

Changed in nova:
status:	In Progress → Invalid

Chris Friesen (cbf123) wrote on 2015-08-17:

#4

Closing as "invalid" based on Nikola's comments above. Bug 1485631 has been opened to unify the logic between the two cases.

OpenStack Infra (hudson-openstack) wrote on 2015-08-17: Change abandoned on nova (master)

#5

Change abandoned by Chris Friesen (<email address hidden>) on branch: master
Review: https://review.openstack.org/213268
Reason: Abandoning change based on Nikola's comments. Bug 1485631 has been opened to unify the logic between the NUMA-topology case and the no-NUMA-topology case.
