Cyborg Agent
The Cyborg agent will reside on compute hosts and potentially other hosts that may make use of accelerators.
Agent responsibilities:
- Inspect hardware to locate accelerators
- Manage the installation of drivers and dependencies, along with other setup and teardown
- Manage connecting the instance to the accelerator once it has spawned
- Report data about available accelerators, status, and utilization to the Cyborg server
Hardware Discovery:
The instance is scanned every few seconds for accelerators and for the usage levels of existing accelerators. This information is reported in a heartbeat message to the Cyborg server to help manage scheduling and availability.
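The scan-and-report cycle described above could be sketched roughly as follows. This is only an illustration: the message layout, the field names (host, timestamp, accelerators), the AcceleratorReport type, and the five-second default interval are all assumptions, since the blueprint does not define a wire format or transport. The scan and send callables stand in for real hardware inspection and for whatever messaging layer Cyborg ends up using.

```python
import json
import time
from dataclasses import asdict, dataclass
from typing import Callable, List, Optional


@dataclass
class AcceleratorReport:
    """One accelerator as seen by the agent (hypothetical fields)."""
    name: str           # illustrative identifier, e.g. "fpga0"
    status: str         # e.g. "available" or "in-use"
    utilization: float  # fraction between 0.0 and 1.0


def build_heartbeat(host: str, accelerators: List[AcceleratorReport],
                    timestamp: float) -> str:
    """Serialize one heartbeat message as a JSON string."""
    return json.dumps({
        "host": host,
        "timestamp": timestamp,
        "accelerators": [asdict(a) for a in accelerators],
    }, sort_keys=True)


def heartbeat_loop(host: str,
                   scan: Callable[[], List[AcceleratorReport]],
                   send: Callable[[str], None],
                   interval_seconds: float = 5.0,
                   iterations: Optional[int] = None,
                   sleep: Callable[[float], None] = time.sleep) -> None:
    """Scan and report repeatedly; iterations=None means run forever."""
    count = 0
    while iterations is None or count < iterations:
        send(build_heartbeat(host, scan(), time.time()))
        count += 1
        # Skip the final sleep when a fixed number of iterations was requested.
        if iterations is None or count < iterations:
            sleep(interval_seconds)
```

Passing scan, send, and sleep as parameters keeps the loop testable without real hardware or a running Cyborg server.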
Hardware Management:
Ansible will be used to manage configuration files and other setup for each accelerator and its driver. Setup and teardown playbooks will be made for each set of supported hardware. A configuration change on Cyborg-managed hardware will boil down to running the uninstall playbook and then the install playbook with different configuration options.
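As a rough sketch of the uninstall-then-install flow, the agent might build and run two ansible-playbook commands, passing the new options to the install playbook through --extra-vars as JSON. The playbook file names, the option keys, and the reconfigure helper itself are hypothetical; the blueprint does not specify how playbooks are invoked.

```python
import json
import subprocess
from typing import Callable, Dict, List, Optional


def playbook_command(playbook: str,
                     options: Optional[Dict[str, str]] = None) -> List[str]:
    """Build an ansible-playbook command line for one playbook."""
    command = ["ansible-playbook", playbook]
    if options:
        # ansible-playbook accepts extra variables as a JSON string.
        command += ["--extra-vars", json.dumps(options, sort_keys=True)]
    return command


def reconfigure(uninstall_playbook: str,
                install_playbook: str,
                new_options: Dict[str, str],
                run: Optional[Callable[[List[str]], object]] = None
                ) -> List[List[str]]:
    """Tear down the current setup, then install with the new options."""
    if run is None:
        # Default runner: execute the command and raise on a non-zero exit.
        def run(command: List[str]) -> object:
            return subprocess.run(command, check=True)

    commands = [
        playbook_command(uninstall_playbook),
        playbook_command(install_playbook, new_options),
    ]
    for command in commands:
        run(command)
    return commands
```

For example, reconfigure("uninstall.yml", "install.yml", {"mode": "b"}) would run the uninstall playbook first and then the install playbook with mode set to "b". The run parameter allows a dry run that only records the commands.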
Instance connection:
Once an instance is spawned that requires connecting to a specific accelerator on the host, the Cyborg server will send a message to the Cyborg agent to inform it of the new instance. Since the connection method may change dramatically between different accelerators, the driver should probably provide a connect function to call out to.
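One way the per-driver connect function could look is sketched below. Everything here is an assumption made for illustration: the AcceleratorDriver base class, the connect signature, the message keys (accelerator_type, instance_id, accelerator_id), and the FakePciDriver are not part of the blueprint, which only suggests that each driver expose some connect function.

```python
from abc import ABC, abstractmethod
from typing import Dict


class AcceleratorDriver(ABC):
    """Hypothetical interface each accelerator driver would implement."""

    @abstractmethod
    def connect(self, instance_id: str, accelerator_id: str) -> str:
        """Attach the accelerator to the instance; return a description."""


class FakePciDriver(AcceleratorDriver):
    """Stand-in driver that only describes what it would do."""

    def connect(self, instance_id: str, accelerator_id: str) -> str:
        return f"attached {accelerator_id} to {instance_id}"


class Agent:
    """Dispatches 'instance spawned' messages to the matching driver."""

    def __init__(self, drivers: Dict[str, AcceleratorDriver]) -> None:
        self._drivers = drivers

    def handle_instance_spawned(self, message: Dict[str, str]) -> str:
        accelerator_type = message["accelerator_type"]
        driver = self._drivers.get(accelerator_type)
        if driver is None:
            raise LookupError(f"no driver for {accelerator_type!r}")
        # The connection details live entirely inside the driver.
        return driver.connect(message["instance_id"],
                              message["accelerator_id"])
```

Keeping the connection logic behind a single method means the agent does not need to know how any particular accelerator is attached; supporting new hardware would only require registering another driver.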
Blueprint information
- Status: Not started
- Approver: None
- Priority: Essential
- Drafter: Justin Kilpatrick
- Direction: Needs approval
- Assignee: Justin Kilpatrick
- Definition: New
- Series goal: None
- Implementation: Unknown
- Milestone target: None
- Started by
- Completed by