Baremetal nodes can be migrated among compute hosts
In a baremetal cloud with multiple nova-compute hosts, each nova-compute host is a single point of failure (SPoF) for the baremetal nodes that it manages. There is currently no mechanism to move a node from one compute host to another, either manually or automatically. Doing so requires deleting the node and adding it again, which invalidates any instance currently deployed to that node.
It is also worth pointing out that, if a nova-compute host goes offline, Nova cannot control the baremetal nodes managed by that host, though any existing instances should continue to function as long as they do not restart.
Moving a node to another compute host could be accomplished by:
- adding a new bm state, "migrating"
- adding a method to rebuild the TFTP environment for a deployed instance on a new compute host
- finding a means to update nova scheduler such that the (host, hypervisor_
Additionally, if the nova_bm database tracked, for each node, the status of the compute host that owns it, other compute hosts could "take over" for a dead host. This would require the following changes:
- add a timestamp column to the bm_nodes table
- add a compute host periodic task that updates the timestamp
- add a compute host periodic task that looks for bm_nodes whose compute host has not checked in and initiates take-over, holding a distributed (in other words, db-managed) lock on that node, compute_host, and instance
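The take-over changes listed above could look roughly like the following. This is only a minimal sketch, not the Nova baremetal driver's code: it uses an in-memory SQLite database, and the column names `service_host` and `heartbeat_at`, the `STALE_AFTER` threshold, and all function names are assumptions made for illustration. The "lock" here is a compare-and-swap UPDATE on the node row alone; locking the compute_host and instance as well is not covered.

```python
import sqlite3

STALE_AFTER = 60  # seconds without a heartbeat before a host counts as dead (assumed value)


def make_db():
    """Create an in-memory stand-in for the bm_nodes table (hypothetical schema)."""
    db = sqlite3.connect(":memory:")
    db.execute(
        "CREATE TABLE bm_nodes ("
        " id INTEGER PRIMARY KEY,"
        " service_host TEXT NOT NULL,"   # compute host that owns the node
        " heartbeat_at INTEGER NOT NULL)"  # the new timestamp column
    )
    return db


def heartbeat(db, host, now):
    """Periodic task 1: refresh the timestamp on every node this host owns."""
    with db:  # commits on success, rolls back on error
        db.execute(
            "UPDATE bm_nodes SET heartbeat_at = ? WHERE service_host = ?",
            (now, host),
        )


def take_over_stale(db, host, now, stale_after=STALE_AFTER):
    """Periodic task 2: claim nodes whose owner has not checked in recently.

    Returns the ids of the nodes this host managed to claim.
    """
    cutoff = now - stale_after
    stale = db.execute(
        "SELECT id, service_host, heartbeat_at FROM bm_nodes"
        " WHERE service_host != ? AND heartbeat_at < ? ORDER BY id",
        (host, cutoff),
    ).fetchall()

    claimed = []
    for node_id, old_host, old_ts in stale:
        with db:
            # Compare-and-swap: the row changes only if nobody else has
            # claimed or refreshed it since we read it, so at most one
            # competing host can win the node.
            cur = db.execute(
                "UPDATE bm_nodes SET service_host = ?, heartbeat_at = ?"
                " WHERE id = ? AND service_host = ? AND heartbeat_at = ?",
                (host, now, node_id, old_host, old_ts),
            )
        if cur.rowcount == 1:
            claimed.append(node_id)
    return claimed
```

After a successful claim, the new owner would still need to rebuild the node's deployment environment locally before it could manage the instance; that step is outside this sketch.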
This was discussed during the Havana summit here:
https:/
Blueprint information
- Status: Complete
- Approver: Russell Bryant
- Priority: Undefined
- Drafter: None
- Direction: Approved
- Assignee: aeva black
- Definition: Obsolete
- Series goal: None
- Implementation: Not started
- Milestone target: None
- Started by:
- Completed by: John Garbutt