VM Ensembles

Registered by Gary Kotton on 2013-01-14

This document introduces the concept of a VM ensemble, or VM group, into Nova. An ensemble gives the tenant the ability to group together VMs that provide a certain service or are part of the same application. More specifically, it enables scheduling policies to be configured per group, which in turn allows for a more robust and resilient service. In particular, a tenant can deploy a multi-VM application that is designed for VM fault tolerance in such a way that application availability is actually resilient to physical host failure.

https://docs.google.com/document/d/1bAMtkaIFn4ZSMqqsXjs_riXofuRvApa--qo4UTwsmhw/edit

Blueprint information

Status: Started
Approver: Vish Ishaya
Priority: Low
Drafter: Gary Kotton
Direction: Needs approval
Assignee: Gary Kotton
Definition: Drafting
Series goal: None
Implementation: Good progress
Milestone target: None
Started by: Gary Kotton on 2013-01-22

Related branches

Sprints

Whiteboard

Gerrit topic: https://review.openstack.org/#q,topic:bp/vm-ensembles,n,z

Addressed by: https://review.openstack.org/19577
    Extract validation and provision code to separate method

Addressed by: https://review.openstack.org/19906
    Enable passing a list of instances to the scheduler.
    Abandoned.

Would it be possible to provide a tl;dr on how this interacts with multiple instances being created in a single request_spec? It just seems a bit strange to me to have a list of request_specs, which themselves have a list of instances to create. Is there some reason these two concepts of multiple VMs can't be merged?

Sounds like there is still some discussion about what exactly needs to go into nova to make this work, but hopefully we can deal with that on the ML/in review.

We keep getting questions about the need to schedule an entire VM ensemble as a group, so we realized it requires some additional explanation:

The higher the percentage of cloud compute capacity that is allocated, the more likely we are to suffer from false admission control decisions, unless we schedule the entire VM ensemble as a group. Cloud service providers strive to maximize the percentage of compute capacity allocated to cloud tenants. For example, AWS offers EC2 Spot Instances (http://aws.amazon.com/ec2/spot-instances/) to make use of any available capacity unit, so that capacity allocation is maximized. Please note that even when capacity allocation on a given host is maximized, CPU utilization is not necessarily maximized, since not all VMs on that host are likely to use their full vCPU quota at the same time. Therefore, it is safe to drive the capacity allocation percentage to the 90%-100% level.

Conclusion: The Nova scheduler will need to schedule VM ensembles as a group.
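
To make the argument above concrete, here is a minimal sketch of how one-at-a-time placement can reject an ensemble that would fit if it were placed as a group. It is not Nova code: the host names, capacity units, function names, and the "each VM on a distinct host" policy are all assumptions chosen for illustration.

```python
from itertools import permutations


def place_one_at_a_time(free, demands):
    """Greedy placement: each VM takes the emptiest eligible host.

    Hypothetical policy: no two VMs of the ensemble may share a host.
    Returns a {vm: host} mapping, or None if some VM cannot be placed.
    """
    free = dict(free)  # work on a copy so the caller's view is unchanged
    placement = {}
    for vm, need in demands.items():
        candidates = [
            host for host in free
            if host not in placement.values() and free[host] >= need
        ]
        if not candidates:
            return None  # rejected, although a group placement may exist
        host = max(candidates, key=lambda h: free[h])
        placement[vm] = host
        free[host] -= need
    return placement


def place_as_group(free, demands):
    """Exhaustive placement: try every assignment of distinct hosts."""
    vms = list(demands)
    for hosts in permutations(free, len(vms)):
        if all(free[host] >= demands[vm] for vm, host in zip(vms, hosts)):
            return dict(zip(vms, hosts))
    return None


# Two nearly full hosts (free capacity units) and a two-VM ensemble.
free = {"host-a": 4, "host-b": 2}
demands = {"vm-1": 2, "vm-2": 4}

sequential_result = place_one_at_a_time(free, demands)
group_result = place_as_group(free, demands)
```

In this toy case the greedy scheduler puts vm-1 on host-a and then has nowhere to put vm-2, while the group search places vm-1 on host-b and vm-2 on host-a. The exhaustive search is only practical for tiny inputs; it stands in for whatever group-aware algorithm a real scheduler would use.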

Addressed by: https://review.openstack.org/21070
    Support for scheduler hints for VM groups
    (But this still schedules VMs one at a time, right?)
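
As a rough illustration of what a group scheduler hint could look like when VMs are still scheduled one at a time, the sketch below filters candidate hosts for a single VM. It is not the change under review: the "group" hint key, the group_members structure, and the function name are hypothetical.

```python
def hosts_passing_anti_affinity(hosts, group_members, hints):
    """Return the hosts that hold no other member of the hinted group.

    hosts:         list of candidate host names
    group_members: {group_name: {vm_name: host_name}} for placed VMs
    hints:         scheduler hints for the VM being placed (hypothetical)
    """
    group = hints.get("group")
    if group is None:
        return list(hosts)  # no group hint, nothing to filter
    occupied = set(group_members.get(group, {}).values())
    return [host for host in hosts if host not in occupied]


hosts = ["host-a", "host-b", "host-c"]
group_members = {"web": {"web-1": "host-a"}}

eligible = hosts_passing_anti_affinity(hosts, group_members, {"group": "web"})
unhinted = hosts_passing_anti_affinity(hosts, group_members, {})
```

A filter like this only sees the members that have already been placed, so each decision is made without knowledge of the VMs that will follow, which is the limitation the question above points at.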

Unapproved - please re-submit via nova-specs --johnthetubaguy (20th March 2014)

Marking this blueprint as definition: Drafting. If you are still working on this, please re-submit via nova-specs. If not, please mark as obsolete, and add a quick comment to describe why. --johnthetubaguy (20th April 2014)

Work Items

Dependency tree
