Filter - aggregate flavour extra spec affinity filter

Registered by sean mooney on 2015-05-16

This blueprint introduces a new filter to the nova filter scheduler.
It does not modify or define any new flavour extra spec key/value
pairs. Instead, it introduces a new flavor_extra_spec host aggregate
metadata key/value pair. This metadata pair will be consumed by the
AggregateTypeExtraSpecsAffinityFilter to allow operators to define a set of
extra spec key/value pairs that are required to schedule to the aggregate.
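The core check the proposed filter performs can be sketched as follows. This is a minimal sketch of the matching semantics described above; the function and parameter names are illustrative and are not taken from the actual nfv-filters implementation:

```python
# Hypothetical sketch of the proposed filter's core check.
# Aggregate metadata carries the proposed flavor_extra_spec key, e.g.
# "hw:mem_page_size=small,hw:mem_page_size=any"; duplicate keys express
# a set of acceptable values for that key.

def parse_flavor_extra_spec(metadata_value):
    """Parse the flavor_extra_spec metadata string into a dict mapping
    each required key to the set of values that satisfy it."""
    required = {}
    for pair in metadata_value.split(','):
        key, _, value = pair.partition('=')
        required.setdefault(key.strip(), set()).add(value.strip())
    return required


def host_passes(aggregate_metadata, flavor_extra_specs):
    """Return True only if every key required by the aggregate appears
    in the instance flavor's extra specs with one of the allowed values."""
    spec = aggregate_metadata.get('flavor_extra_spec')
    if spec is None:
        return True  # the aggregate imposes no restriction
    required = parse_flavor_extra_spec(spec)
    return all(
        flavor_extra_specs.get(key) in values
        for key, values in required.items()
    )
```

Note the direction of the check: the aggregate's requirements must be satisfied by the flavor, which is the inverse of the existing AggregateInstanceExtraSpecsFilter.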

Problem description
===================

At present the filter_scheduler allows operators to associate an instance
type with a host aggregate via the AggregateTypeAffinityFilter [1][2] or to
enforce that an aggregate satisfies any extra specifications associated
with the instance type via the AggregateInstanceExtraSpecsFilter[3], but
does not allow an operator to enforce that instances will only be placed
in the aggregate if all of the extra specs required by the aggregate
are requested by the instance.

The currently available filters do not allow operators to reserve capacity
for VMs that have specific hardware or software requirements.

This blueprint introduces a new filter to address this gap.

Use Cases
----------

hw:mem_page_size is used in the use cases below, but this is equally
applicable to any extra spec key/value pair.

1) An operator has many flavours which define multiple tiers of memory backings
as well as other extra specs. The operator wants to optimise the scheduler by
first filtering on the host aggregate flavor_extra_spec and then filtering the
remaining hosts with the NUMATopologyFilter.

This use case can be supported by placing the
AggregateTypeExtraSpecsAffinityFilter before the NUMATopologyFilter in
scheduler_default_filters and defining appropriate host aggregates, e.g.:

standard memory backing aggregate:
flavor_extra_spec: "hw:mem_page_size=small,hw:mem_page_size=any"

high bandwidth memory backing aggregate:
flavor_extra_spec: "hw:mem_page_size=large"
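The aggregates above could be set up with the standard aggregate metadata API. A sketch using the OpenStack CLI, where the aggregate names are illustrative:

```shell
# Create the standard memory backing aggregate and attach the proposed
# flavor_extra_spec metadata key; duplicate hw:mem_page_size entries
# express the set of acceptable values.
openstack aggregate create standard-memory
openstack aggregate set \
    --property flavor_extra_spec="hw:mem_page_size=small,hw:mem_page_size=any" \
    standard-memory

# Create the high bandwidth memory backing aggregate.
openstack aggregate create high-bandwidth-memory
openstack aggregate set \
    --property flavor_extra_spec="hw:mem_page_size=large" \
    high-bandwidth-memory
```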

2) An operator has multiple page sizes allocated on each node.
They have deployed OpenStack in an HA environment and co-located some
OpenStack services with nova-compute to evenly distribute network, storage
and CPU load across their cloud infrastructure. To optimise this deployment
the operator wants to define host aggregates that reflect how
services are co-located and to limit what resources can be
requested when scheduling.

NFV aggregate:
flavor_extra_spec: "hw:mem_page_size=2M"

Hosts in this aggregate are configured with 4k, 2M and 1G hugepages.
Each node is deployed with an NFV-optimised vswitch that utilises 1G hugepages
for maximum throughput. The nova vcpu_pin_set and kernel isolcpus parameters
are used to isolate instances to cores not used by co-located services.
The nova-compute service is co-located with the neutron l3 agent with dvr
enabled to further optimise tenant networking in this aggregate.

With the flavor_extra_spec definition above, only VMs that request 2M
hugepages will be scheduled to hosts in this aggregate. The 1G hugepages will
be reserved for the vswitch and the standard 4k pages will be reserved for
OpenStack and host OS services.

API aggregate:
flavor_extra_spec: "hw:mem_page_size=small,hw:mem_page_size=any"

Hosts in this aggregate are configured with standard 4k pages and 1G hugepages.
The nova-compute service is co-located with RabbitMQ, MySQL and the OpenStack
API endpoints. API endpoints, RabbitMQ and MySQL are co-located to reduce API
latency. To optimise database performance MySQL is configured to use
hugepages[4]. The nova vcpu_pin_set and kernel isolcpus parameters
are used to isolate instances to cores not used by co-located services.

With the flavor_extra_spec definition above, only VMs that request small
or any memory backing are scheduled to this aggregate.
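On the flavor side, matching extra specs are requested in the usual way. For example, a flavor targeting the NFV aggregate above might look like the following, where the flavor name and sizing are illustrative:

```shell
# Create a flavor and request 2M hugepages via the hw:mem_page_size
# extra spec; with the proposed filter enabled, instances of this flavor
# can only land on aggregates whose flavor_extra_spec allows 2M pages.
openstack flavor create --ram 4096 --vcpus 2 --disk 20 nfv.small
openstack flavor set --property hw:mem_page_size=2M nfv.small
```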

Blueprint information

Status:
Complete
Approver:
John Garbutt
Priority:
Low
Drafter:
sean mooney
Direction:
Needs approval
Assignee:
sean mooney
Definition:
Obsolete
Series goal:
None
Implementation:
Beta Available
Milestone target:
None
Started by
John Garbutt on 2015-06-22
Completed by
sean mooney on 2019-10-09

Related branches

Sprints

Whiteboard

Implemented in https://github.com/openstack/nfv-filters

https://review.openstack.org/#/c/183876/

Gerrit topic: https://review.openstack.org/#q,topic:bp/aggregate-extra-specs-filter,n,z

Addressed by: https://review.openstack.org/189279
    Added new scheduler filter: AggregateTypeExtraSpecsAffinityFilter

Please note, this blueprint still needs to be approved for liberty, even if it doesn’t need a spec. That is done by adding it into the weekly meeting agenda to the list of spec-less blueprint that need approving. --johnthetubaguy 22nd June 2015

Sorry, we have now hit the non-priority feature freeze for Liberty. You will need to resubmit this blueprint for Mitaka or apply for an exception. For more details on why this is happening, and the rest of the process details, please see: https://wiki.openstack.org/wiki/Nova/Liberty_Release_Schedule
--johnthetubaguy 4th August 2015

Addressed by: https://review.openstack.org/274725
    Revert "Added new scheduler filter: AggregateTypeExtraSpecsAffinityFilter"

Addressed by: https://review.openstack.org/275128
    Revert "Revert "Added new scheduler filter: AggregateTypeExtraSpecsAffinityFilter""

Sorry, we have now hit the Non-Priority Feature Freeze for Mitaka. For more details please see: http://docs.openstack.org/releases/schedules/mitaka.html#m-nova-npff and http://docs.openstack.org/developer/nova/process.html#non-priority-feature-freeze
--johnthetubaguy 2016.02.11


Work Items
