Cyborg Agent

Registered by Justin Kilpatrick

The Cyborg agent will reside on compute hosts and potentially other hosts that may make use of accelerators.

Agent responsibilities:
- Inspect hardware to locate accelerators
- Manage installation of drivers and dependencies, along with other setup and teardown
- Manage connecting the instance to the accelerator once it has spawned
- Report data about available accelerators, status, and utilization to the Cyborg server

Hardware Discovery:
Every few seconds, the agent scans the instance for accelerators and for the usage levels of existing accelerators. It reports this information in a heartbeat message to the Cyborg server to help manage scheduling and availability.
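A minimal sketch of the scan-and-report loop described above. The message schema, the 5-second default interval, and the names AcceleratorReport, build_heartbeat, and run_heartbeat are assumptions made for illustration; the blueprint defines none of them. The scan and send callables stand in for real hardware inspection and for whatever transport carries messages to the Cyborg server.

```python
import json
import time
from dataclasses import asdict, dataclass
from typing import Callable, List, Optional


@dataclass
class AcceleratorReport:
    """One accelerator as seen by the agent (hypothetical schema)."""
    name: str
    kind: str           # e.g. "gpu" or "fpga"
    status: str         # e.g. "available" or "in-use"
    utilization: float  # fraction in [0.0, 1.0]


def build_heartbeat(host: str, accelerators: List[AcceleratorReport], now: float) -> str:
    """Serialize one heartbeat message as JSON."""
    return json.dumps(
        {
            "host": host,
            "timestamp": now,
            "accelerators": [asdict(a) for a in accelerators],
        },
        sort_keys=True,
    )


def run_heartbeat(
    host: str,
    scan: Callable[[], List[AcceleratorReport]],
    send: Callable[[str], None],
    interval: float = 5.0,             # assumed value for "every few seconds"
    iterations: Optional[int] = None,  # None means loop forever
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.time,
) -> None:
    """Scan the hardware, send a heartbeat, wait, and repeat."""
    count = 0
    while iterations is None or count < iterations:
        send(build_heartbeat(host, scan(), clock()))
        count += 1
        if iterations is None or count < iterations:
            sleep(interval)
```

Passing scan and send as parameters keeps the loop independent of any particular discovery mechanism or messaging layer, which also makes it easy to exercise with stubs.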

Hardware Management:
Ansible will be used to manage configuration files and other setup for each accelerator and its driver. Setup and teardown playbooks will be made for each set of supported hardware. A configuration change on Cyborg-managed hardware will boil down to running the uninstall playbook and then the install playbook with different configuration options.
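The reconfiguration flow could be sketched as follows. The directory layout (one directory per hardware type holding uninstall.yml and install.yml), the function names, and the choice to pass the currently applied options to the uninstall playbook are all assumptions; the blueprint only states that the uninstall and install playbooks are run with different configuration options. The only Ansible feature relied on is the ansible-playbook command with its --extra-vars option, which accepts a JSON string.

```python
import json
import subprocess
from pathlib import PurePosixPath
from typing import Callable, Dict, List


def playbook_command(playbook: str, options: Dict[str, object]) -> List[str]:
    """Build an ansible-playbook argv, passing options as JSON extra vars."""
    return [
        "ansible-playbook",
        playbook,
        "--extra-vars",
        json.dumps(options, sort_keys=True),
    ]


def reconfigure_commands(
    playbook_dir: str,
    hardware: str,
    current_options: Dict[str, object],
    new_options: Dict[str, object],
) -> List[List[str]]:
    """Return the uninstall command followed by the install command."""
    base = PurePosixPath(playbook_dir) / hardware  # hypothetical layout
    return [
        playbook_command(str(base / "uninstall.yml"), current_options),
        playbook_command(str(base / "install.yml"), new_options),
    ]


def reconfigure(
    playbook_dir: str,
    hardware: str,
    current_options: Dict[str, object],
    new_options: Dict[str, object],
    runner: Callable[..., object] = subprocess.run,
) -> None:
    """Run teardown, then setup; stop at the first failing playbook."""
    for command in reconfigure_commands(
        playbook_dir, hardware, current_options, new_options
    ):
        # check=True makes subprocess.run raise if the playbook fails
        runner(command, check=True)
```

Building the command lists separately from running them makes it possible to log or inspect exactly what would be executed before anything touches the host.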

Instance connection:
Once an instance is spawned that requires connecting to a specific accelerator on the host, the Cyborg server will send a message to the Cyborg agent to inform it of the new instance. Since the connection method may change dramatically between different accelerators, the driver should probably provide a connect function for the agent to call out to.
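One way the per-driver connect function might look, as a sketch only. The class names, the message keys instance_id and accelerator_id, and the mapping from accelerator to driver are hypothetical; the blueprint does not specify a message format or a driver interface.

```python
from abc import ABC, abstractmethod
from typing import Dict, List, Tuple


class AcceleratorDriver(ABC):
    """Hypothetical driver interface; each accelerator type implements connect."""

    @abstractmethod
    def connect(self, instance_id: str, accelerator_id: str) -> None:
        """Attach the accelerator to the instance in a device-specific way."""


class RecordingDriver(AcceleratorDriver):
    """Stand-in driver that only records the requested connections."""

    def __init__(self) -> None:
        self.connections: List[Tuple[str, str]] = []

    def connect(self, instance_id: str, accelerator_id: str) -> None:
        self.connections.append((instance_id, accelerator_id))


class Agent:
    """Dispatches instance notifications from the server to the right driver."""

    def __init__(self, drivers: Dict[str, AcceleratorDriver]) -> None:
        self._drivers = drivers  # accelerator id -> driver managing it

    def handle_instance_spawned(self, message: Dict[str, str]) -> None:
        accelerator_id = message["accelerator_id"]
        driver = self._drivers.get(accelerator_id)
        if driver is None:
            raise LookupError(f"no driver registered for {accelerator_id}")
        driver.connect(message["instance_id"], accelerator_id)
```

Keeping the connection logic behind a single method means the agent never needs to know how a given accelerator is attached, only which driver owns it.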

Blueprint information

Status: Not started
Approver: None
Priority: Essential
Drafter: Justin Kilpatrick
Direction: Needs approval
Assignee: Justin Kilpatrick
Definition: New
Series goal: None
Implementation: Unknown
Milestone target: None
