Network Aware Scheduler

Registered by Debo~ Dutta

This proposal is about leveraging network information to make better scheduling decisions. We assume that network information is available through an API. Our proposed plugin uses this API to keep track of changes in the network state. When a VM needs to be placed, our scheduler plugin uses the obtained network state to make provisioning decisions. It will apply network constraints when placing VMs in order to achieve different objectives.

Blueprint information

Status:
Complete
Approver:
Rick Clark
Priority:
Undefined
Drafter:
None
Direction:
Needs approval
Assignee:
None
Definition:
Obsolete
Series goal:
None
Implementation:
Deferred
Milestone target:
None
Completed by
Vish Ishaya

Whiteboard

Problem

The OpenStack compute scheduler is network agnostic. It considers server 'load' when making scheduling decisions, which helps assure the compute performance of a customer's project or virtual machines. However, for some applications, such as media streaming, and for communication-intensive workloads such as Hadoop, the network performance of the compute project also has to be assured. To make more intelligent scheduling decisions in these cases, the scheduler needs to consider information about the network.
Today, the scheduler is simple: it picks a physical server on which to instantiate a VM without applying any network constraints. Now assume that the OpenStack scheduler has network visibility, i.e. an API to query the network for information such as topology, available bandwidth on the links in the topology, locality or rack data, etc. It could then place VMs based on a joint optimization of any of those network metrics, in addition to other constraints from the compute or storage side.

Possible solutions

In our proposal, we intend to write a scheduler plugin that has access to a network API. The scheduler plugin will use this API to keep track of changes in the network state in quasi-real time. The plugin can be designed either to fetch state information using a regular poll-based mechanism or to receive it automatically using a publish-subscribe mechanism.
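
The two ways of keeping network state current could be sketched roughly as follows. This is only an illustration under stated assumptions: the proposal does not define the network API, so NetworkStateCache, fetch, the event keys "link" and "available", and the representation of state as a mapping from link id to available bandwidth fraction are all hypothetical.

```python
import threading
import time
from typing import Callable, Dict, Optional

# Assumed (hypothetical) state shape: link id -> fraction of bandwidth available.
NetworkState = Dict[str, float]


class NetworkStateCache:
    """Thread-safe local copy of the network state used by the scheduler."""

    def __init__(self) -> None:
        self._lock = threading.Lock()
        self._state: NetworkState = {}
        self.updated_at: Optional[float] = None

    def update(self, changes: NetworkState) -> None:
        # Merge new values into the cached state and record the update time.
        with self._lock:
            self._state.update(changes)
            self.updated_at = time.monotonic()

    def snapshot(self) -> NetworkState:
        # Return a copy so callers cannot mutate the cache.
        with self._lock:
            return dict(self._state)

    def poll_once(self, fetch: Callable[[], NetworkState]) -> None:
        # Poll-based mechanism: ask the (hypothetical) network API for state.
        self.update(fetch())

    def on_event(self, event: Dict[str, object]) -> None:
        # Publish-subscribe mechanism: the network pushes a single change.
        self.update({str(event["link"]): float(event["available"])})


def poll_forever(cache: NetworkStateCache,
                 fetch: Callable[[], NetworkState],
                 interval_s: float,
                 stop: threading.Event) -> None:
    # Poll at a regular interval until asked to stop.
    while not stop.is_set():
        cache.poll_once(fetch)
        stop.wait(interval_s)
```

In either mode the scheduler would read only from snapshot(), so placement decisions do not block on the network API.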

When a VM needs to be placed, our scheduler plugin uses the obtained network state to make provisioning decisions. It will apply network constraints when placing VMs in order to achieve different objectives. For example:
i) Pick a server only if the available bandwidth on its uplink is >20%. Objective: Assure bandwidth for the VM.
ii) Minimize the maximum link utilization of the network upon VM placement for a project. Objective: Load-balance the network.
iii) Place VMs that talk to each other on the same physical server, or on another server in the same rack. Objective: Maximize performance / conserve bandwidth.
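
The three example objectives above could be expressed as simple filter and ranking functions, as in the sketch below. It is not tied to any existing scheduler interface: the Host record, the function names, and the per-link dictionaries (available bandwidth and utilization as fractions between 0 and 1) are assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass(frozen=True)
class Host:
    # Hypothetical host record: name, rack, and the id of its uplink.
    name: str
    rack: str
    uplink: str


def bandwidth_filter(hosts: List[Host], available: Dict[str, float],
                     min_fraction: float = 0.20) -> List[Host]:
    # (i) Keep hosts whose uplink has more than min_fraction bandwidth free.
    # Hosts with unknown uplink state are treated as having none available.
    return [h for h in hosts if available.get(h.uplink, 0.0) > min_fraction]


def min_max_utilization(hosts: List[Host], utilization: Dict[str, float],
                        vm_demand: float) -> Optional[Host]:
    # (ii) Choose the host whose selection gives the lowest maximum link
    # utilization across the network after adding the VM's demand.
    def max_after(h: Host) -> float:
        after = dict(utilization)
        after[h.uplink] = after.get(h.uplink, 0.0) + vm_demand
        return max(after.values())

    return min(hosts, key=max_after, default=None)


def affinity_rank(hosts: List[Host], peer: Host) -> List[Host]:
    # (iii) Order hosts by closeness to the peer VM's host:
    # same server first, then same rack, then everything else.
    def distance(h: Host) -> int:
        if h.name == peer.name:
            return 0
        if h.rack == peer.rack:
            return 1
        return 2

    return sorted(hosts, key=distance)
```

A plugin could chain these, for instance by filtering on bandwidth first and then ranking the remaining hosts by affinity or by resulting link utilization.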

---
[Ravi Chunduru] The above proposal needs a provision for storing network characteristics such as bandwidth, rack-to-network affinity, etc. Are we fine with using extra specs on the Quantum network for this?
Also, there is a need to group a set of networks with the same characteristics, so that the scheduler knows to select one network out of that aggregate. Thoughts?
