All Conductors cache the same files for serving TFTP

Registered by aeva black

Instead of relying on a shared file system to make the PXE boot environment highly available, let's have all conductors build identical TFTP directories, eliminating that single point of failure (SPoF).

Goals:
- any conductor that has the PXE driver can service any PXE boot request from any node.
- avoid any fixed association of conductor<->instance. We don't have this today, and we shouldn't add it.
- new conductors joining the cluster can rebuild all necessary TFTP data.

To do this, some things are needed:

* When isinstance(node.driver.deploy, pxe.PXEDeploy), and node.driver_info contains a deploy kernel & ramdisk, all conductors must pre-cache those kernel & ramdisk images.
* When an instance is created by Nova, the user kernel & ramdisk glance image IDs must be stored in node.properties, and pre-cached by all conductors.

Both of the above require an RPC broadcast that tells the existing conductor services to cache these images before the instance can be spawned. There must also be a periodic_task that maintains the cache directories, so that new conductor services can join the cluster and build their own copy. The same periodic_task should handle cache cleanup.
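The cache maintenance described above can be sketched as a single reconcile step that a periodic_task might call. This is only an illustration under stated assumptions: `sync_tftp_cache` and the `fetch_image(image_id, dest_path)` callable are hypothetical names standing in for whatever Glance download helper the conductor uses, and cached files are assumed to be named by image ID.

```python
import os
import tempfile


def sync_tftp_cache(cache_dir, required_image_ids, fetch_image):
    """Make cache_dir hold exactly the images in required_image_ids.

    fetch_image(image_id, dest_path) is a hypothetical callable that
    downloads one image (e.g. from Glance) to dest_path.
    Returns (fetched, removed) as sorted lists of names.
    """
    os.makedirs(cache_dir, exist_ok=True)
    required = set(required_image_ids)
    # Only regular files count as cache entries.
    present = {
        name for name in os.listdir(cache_dir)
        if os.path.isfile(os.path.join(cache_dir, name))
    }

    fetched = []
    for image_id in sorted(required - present):
        # Download under a temporary name, then rename, so a TFTP client
        # never sees a partially written kernel or ramdisk.
        fd, tmp_path = tempfile.mkstemp(dir=cache_dir, prefix=".tmp-")
        os.close(fd)
        try:
            fetch_image(image_id, tmp_path)
            os.replace(tmp_path, os.path.join(cache_dir, image_id))
        except Exception:
            if os.path.exists(tmp_path):
                os.unlink(tmp_path)
            raise
        fetched.append(image_id)

    removed = []
    for name in sorted(present - required):
        # Anything no longer required is stale and is cleaned up.
        os.unlink(os.path.join(cache_dir, name))
        removed.append(name)

    return fetched, removed
```

Because the step is idempotent, a newly started conductor and a long-running one can run the same code: the first call fills an empty directory, and later calls only fetch or delete the difference.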

* The user image -- which may be very large -- does not need to be pre-cached anywhere. Instead, it should be fetched from Glance on demand during the deploy phase, e.g. within pxe.VendorPassthru:_continue_deploy()

* A new RPC broadcast will need to be created to notify all conductors to change the TFTP "default XXX" config option when a node is deployed. Conductors must loosely synchronize their invocation of deploy_utils.switch_pxe_config().

* For Neutron integration, e.g. nova.virt.driver:dhcp_options_for_instance(), the IP address of any conductor instance that is capable of servicing the relevant node may be used. The ironic-api service can determine this on the fly by querying the ironic.conductors table for a conductor that supports that node's driver. The ironic.conductors table should get a new column to store the IP address of each conductor.
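The conductor lookup in the last item could look roughly like the sketch below. It is not the ironic-api implementation: `ConductorRow`, `pick_tftp_server`, and the field names are hypothetical, the rows stand in for the result of a query against the ironic.conductors table, and `ip_address` represents the proposed new column. The choice is random rather than sticky, in keeping with the goal of avoiding a fixed conductor<->instance association.

```python
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class ConductorRow:
    """Hypothetical view of one row in the ironic.conductors table."""
    hostname: str
    drivers: frozenset   # driver names this conductor has loaded
    ip_address: str      # the proposed new column


def pick_tftp_server(conductors, driver_name, rng=None):
    """Return the IP of any conductor that supports driver_name.

    Raises LookupError if no conductor supports the driver.
    """
    rng = rng or random
    candidates = [
        c.ip_address for c in conductors if driver_name in c.drivers
    ]
    if not candidates:
        raise LookupError("no conductor supports driver %r" % driver_name)
    # Any capable conductor will do, since all of them hold the same
    # TFTP files; pick one at random to spread the load.
    return rng.choice(candidates)
```

The returned address is what would be handed to Neutron as the TFTP server for that node. The driver names used when calling it are whatever each conductor registered; none are assumed here.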

Blueprint information

Status: Complete
Approver: aeva black
Priority: Undefined
Drafter: aeva black
Direction: Needs approval
Assignee: None
Definition: Superseded
Series goal: Accepted for icehouse
Implementation: Unknown
Milestone target: None
Completed by: aeva black
