KVM host maintenance support

Registered by Oshrit Feder on 2013-01-23

Sometimes a sysadmin needs to reboot a Nova compute node for maintenance, for example for a hardware upgrade or to install patches.

Unlike unexpected node failures, which may take the host down and cause VM downtime, planned maintenance can be prepared carefully ahead of time (e.g. by preventing new VMs from being deployed on the node and evacuating the existing VMs to other compute nodes) in order to minimize the effect on users.

Nova exposes a set_host_maintenance API (host_update --maintenance enable/disable). The current implementation sends the request directly to the compute node that is to be put into maintenance, so the compute node itself is responsible for orchestrating the possibly complicated process of finding new targets for its existing VMs. Finding an appropriate target for a VM is not a foreign task to Nova: similar operations (run_instance and resize) already exist. Neither of those operations is sent to the compute node first. Instead they are orchestrated by nova-scheduler, which in turn directs the request to the relevant compute nodes to perform a single step, such as provisioning on the actual host.
With the current implementation, only hypervisors that are themselves capable of re-scheduling a VM can support the host_maintenance feature, e.g. Xen with VM.pool_migrate. Other hypervisors, such as KVM, cannot, and invoking the host_maintenance API on them results in an error.
In addition, it may be desirable for all policies and constraints that were enforced during the initial placement of a VM (e.g. run_instance) to be applied again whenever the VM is re-scheduled, for example when evacuating VMs from a host that is being put into maintenance.
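
The point about re-applying placement policies can be illustrated with a minimal sketch. It is not Nova code: `Host`, `VM`, `ram_filter`, `maintenance_filter`, `PLACEMENT_FILTERS` and `pick_target` are hypothetical names chosen for illustration, and the two filters are assumed examples of "policies and constraints". The idea shown is only that the same filter list serves both initial placement and evacuation.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class Host:
    # Hypothetical, simplified view of a compute node.
    name: str
    free_ram_mb: int
    in_maintenance: bool = False


@dataclass
class VM:
    name: str
    ram_mb: int


# A filter answers one question: may this VM be placed on this host?
HostFilter = Callable[[Host, VM], bool]


def ram_filter(host: Host, vm: VM) -> bool:
    """Assumed example constraint: the host must have enough free RAM."""
    return host.free_ram_mb >= vm.ram_mb


def maintenance_filter(host: Host, vm: VM) -> bool:
    """Hosts in maintenance must not receive new or evacuated VMs."""
    return not host.in_maintenance


# One filter list, shared by initial placement and by re-scheduling.
PLACEMENT_FILTERS: List[HostFilter] = [ram_filter, maintenance_filter]


def pick_target(vm: VM, hosts: List[Host],
                filters: List[HostFilter] = PLACEMENT_FILTERS) -> Optional[Host]:
    """Return the first host that passes every filter, or None."""
    for host in hosts:
        if all(f(host, vm) for f in filters):
            return host
    return None


hosts = [
    Host("node1", 4096, in_maintenance=True),  # being evacuated
    Host("node2", 1024),                       # too little free RAM
    Host("node3", 8192),
]
target = pick_target(VM("web-1", 2048), hosts)
print(target.name)  # node3
```

Because evacuation goes through the same `pick_target` call as a first placement would, a VM leaving `node1` cannot land on a host that would have been rejected when it was first deployed.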

As a first step, we propose directing maintenance requests to nova-scheduler. For backwards compatibility, the default implementation will for now remain almost as-is and will simply forward the requests to nova-compute. Next, a second patch with a more sophisticated scheduler driver implementation will process the maintenance request to create a host evacuation plan and migrate the affected VMs to other compute nodes. In this way various hypervisors, KVM among them, can be put into maintenance as well.
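
The two steps proposed above might look roughly like the following sketch. It is an illustration under stated assumptions, not the patch under review: `FakeCompute`, `PassThroughDriver`, `EvacuatingDriver` and `HostMaintenanceNotSupported` are hypothetical names, capacity is reduced to a simple VM count, and "migration" is modelled as moving a name between lists.

```python
class HostMaintenanceNotSupported(Exception):
    """Raised when a hypervisor cannot evacuate its own VMs."""


class FakeCompute:
    """Hypothetical stand-in for a nova-compute node."""

    def __init__(self, name, hypervisor, vms, capacity):
        self.name = name
        self.hypervisor = hypervisor
        self.vms = list(vms)
        self.capacity = capacity  # assumed: maximum number of VMs
        self.enabled = True

    def host_maintenance(self, enable):
        # Models the limitation described in the text: only a hypervisor
        # that can re-schedule VMs itself accepts the request.
        if self.hypervisor != "xen":
            raise HostMaintenanceNotSupported(self.hypervisor)
        self.enabled = not enable


class PassThroughDriver:
    """Step 1: the scheduler only forwards the request to the compute node."""

    def set_host_maintenance(self, host, computes, enable=True):
        return computes[host].host_maintenance(enable)


class EvacuatingDriver:
    """Step 2: the scheduler builds and executes an evacuation plan."""

    def plan(self, host, computes):
        # Free slots on every other enabled host.
        free = {n: c.capacity - len(c.vms)
                for n, c in computes.items() if n != host and c.enabled}
        plan = []
        for vm in computes[host].vms:
            target = max(free, key=free.get, default=None)
            if target is None or free[target] <= 0:
                raise RuntimeError(f"no target host for {vm}")
            plan.append((vm, target))
            free[target] -= 1
        return plan

    def set_host_maintenance(self, host, computes, enable=True):
        source = computes[host]
        if not enable:
            source.enabled = True
            return []
        source.enabled = False  # stop new deployments before evacuating
        try:
            plan = self.plan(host, computes)
        except RuntimeError:
            source.enabled = True  # nothing was moved, so roll back
            raise
        for vm, target in plan:
            source.vms.remove(vm)
            computes[target].vms.append(vm)
        return plan


computes = {
    "kvm1": FakeCompute("kvm1", "kvm", ["a", "b", "c"], capacity=4),
    "kvm2": FakeCompute("kvm2", "kvm", ["d"], capacity=3),
    "kvm3": FakeCompute("kvm3", "kvm", [], capacity=2),
}

try:
    PassThroughDriver().set_host_maintenance("kvm1", computes)
    passthrough_ok = True
except HostMaintenanceNotSupported:
    passthrough_ok = False  # KVM cannot orchestrate its own evacuation

plan = EvacuatingDriver().set_host_maintenance("kvm1", computes)
print(passthrough_ok, plan, computes["kvm1"].vms)
```

In this sketch the pass-through driver fails for the KVM host exactly as the current behaviour does, while the evacuating driver needs nothing hypervisor-specific: it disables the host, computes a plan, and only then moves the VMs.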

Blueprint information

Status: Started
Approver: Vish Ishaya
Priority: Undefined
Drafter: None
Direction: Needs approval
Assignee: Oshrit Feder
Definition: Drafting
Series goal: None
Implementation: Started
Milestone target: None
Started by: Oshrit Feder on 2013-01-23

Related branches

Sprints

Whiteboard

Related blueprint: https://blueprints.launchpad.net/nova/+spec/unify-migrate-and-live-migrate

Gerrit topic: https://review.openstack.org/#/c/19636/

Gerrit topic: https://review.openstack.org/#q,topic:bp/host-maintenance,n,z

Addressed by: https://review.openstack.org/19636
    Change Set_host_maintenance to scheduler enabled

Addressed by: https://review.openstack.org/22428
    Set host enabled

Marking this blueprint as definition: Drafting. If you are still working on this, please re-submit via nova-specs. If not, please mark as obsolete, and add a quick comment to describe why. --johnthetubaguy (20th April 2014)


Work Items