Baremetal nodes can be migrated among compute hosts

Registered by aeva black on 2013-04-22

In a baremetal cloud with multiple nova-compute hosts, each nova-compute host is a single point of failure (SPoF) for the baremetal nodes it manages. There is currently no mechanism to move a node from one compute host to another, either manually or automatically; doing so requires deleting the node and adding it again, which invalidates any instance currently deployed to that node.

Note also that if a nova-compute host goes offline, Nova cannot control the baremetal nodes managed by that host, though any existing instances should continue to function as long as they do not restart.

Moving a node to another compute host could be accomplished by:
- adding a new baremetal (bm) state, "migrating"
- adding a method to rebuild the tftp environment for a deployed instance on a new compute host
- finding a way to update the Nova scheduler so that the (host, hypervisor_hostname) pair can change. This would need to be possible regardless of whether an instance is active on that compute node.
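The three steps above could fit together roughly as follows. This is only an illustrative sketch, not Nova code: the `Node` class, the `migrate_node` function, and the `rebuild_tftp` callback are hypothetical names invented for the example, and the scheduler update is reduced to returning the new (host, hypervisor_hostname) pair.

```python
from dataclasses import dataclass
from typing import Callable, Optional, Tuple

MIGRATING = "migrating"  # the new state proposed in this blueprint


@dataclass
class Node:
    """Hypothetical stand-in for a baremetal node record."""
    hypervisor_hostname: str
    host: str                      # compute host that currently owns the node
    state: str
    instance_uuid: Optional[str] = None


def migrate_node(
    node: Node,
    new_host: str,
    rebuild_tftp: Callable[[Node, str], None],
) -> Tuple[str, str]:
    """Move a node to new_host and return the pair the scheduler would need."""
    if node.state == MIGRATING:
        raise RuntimeError("node is already migrating")

    previous_state = node.state
    node.state = MIGRATING
    try:
        # Only a deployed instance has a tftp environment to rebuild.
        if node.instance_uuid is not None:
            rebuild_tftp(node, new_host)
        node.host = new_host
    finally:
        # Restore the original state whether or not the move succeeded.
        node.state = previous_state

    # Assumption: the scheduler is told about this pair by some other means.
    return (node.host, node.hypervisor_hostname)
```

The node's original state is restored in a `finally` block so that a failed tftp rebuild does not leave the node stuck in "migrating"; in that case `host` is left unchanged.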

Additionally, by recording in the nova_bm database, for each node, the status of the compute host that owns it, other compute hosts could "take over" for a dead host. This would require the following changes:
- add a timestamp column to the bm_nodes table
- add a compute host periodic task that updates the timestamp
- add a compute host periodic task that looks for bm_nodes whose compute host has not checked in and initiates take-over, holding a distributed (in other words, db-managed) lock on that node, compute_host, and instance.
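A minimal sketch of the two periodic tasks, using an in-memory SQLite table in place of the nova_bm database. The column names `service_host` and `heartbeat_at`, the `STALE_AFTER` threshold, and all function names are assumptions made for illustration. The "lock" here is only a compare-and-swap UPDATE on the node row; it does not cover the compute_host or instance locking mentioned above.

```python
import sqlite3

STALE_AFTER = 60  # seconds without a heartbeat before a host counts as dead (assumed)


def make_db():
    """Create a toy bm_nodes table with the proposed timestamp column."""
    db = sqlite3.connect(":memory:")
    db.execute(
        "CREATE TABLE bm_nodes ("
        " id INTEGER PRIMARY KEY,"
        " service_host TEXT NOT NULL,"
        " heartbeat_at REAL NOT NULL)"
    )
    return db


def heartbeat(db, host, now):
    """Periodic task 1: refresh the timestamp on every node this host owns."""
    db.execute(
        "UPDATE bm_nodes SET heartbeat_at = ? WHERE service_host = ?",
        (now, host),
    )
    db.commit()


def take_over_stale_nodes(db, host, now):
    """Periodic task 2: claim nodes whose owning host has not checked in."""
    cutoff = now - STALE_AFTER
    stale = db.execute(
        "SELECT id, service_host, heartbeat_at FROM bm_nodes"
        " WHERE service_host != ? AND heartbeat_at < ?",
        (host, cutoff),
    ).fetchall()

    claimed = []
    for node_id, old_host, old_ts in stale:
        # Compare-and-swap: the UPDATE matches only if nobody else has
        # changed the row since we read it, so one host wins the claim.
        cur = db.execute(
            "UPDATE bm_nodes SET service_host = ?, heartbeat_at = ?"
            " WHERE id = ? AND service_host = ? AND heartbeat_at = ?",
            (host, now, node_id, old_host, old_ts),
        )
        if cur.rowcount == 1:
            claimed.append(node_id)
    db.commit()
    return claimed
```

Each compute host would run `heartbeat` and `take_over_stale_nodes` on a timer. A host that loses the race sees `rowcount == 0` for that node and skips it.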

This was discussed during the Havana summit:
  https://etherpad.openstack.org/HavanaBaremetalNextSteps

Blueprint information

Status: Complete
Approver: Russell Bryant
Priority: Undefined
Drafter: None
Direction: Approved
Assignee: aeva black
Definition: Obsolete
Series goal: None
Implementation: Not started
Milestone target: None
Completed by: John Garbutt on 2014-03-05
