Nova Cyborg Interaction

Registered by Sundar Nadathur on 2018-09-20

Describes the Nova-Cyborg interaction needed to create and manage instances with accelerators, and the changes needed in Nova to support it.

Blueprint information

Status: Complete
Approver: Matt Riedemann
Priority: High
Drafter: Sundar Nadathur
Direction: Approved
Assignee: Sundar Nadathur
Definition: Approved
Series goal: Accepted for ussuri
Implementation: Implemented
Milestone target: ussuri-3
Started by: Matt Riedemann on 2019-06-20
Completed by: Balazs Gibizer on 2020-04-15

Whiteboard

Gerrit topic: https://review.openstack.org/#q,topic:bp/nova-cyborg-interaction,n,z

Addressed by: https://review.openstack.org/623026
    Add cyborg client to requirements

Addressed by: https://review.openstack.org/623027
    WIP: Cyborg PCI handling

Addressed by: https://review.openstack.org/631242
    WIP: Add utility function to get Cyborg client.

Addressed by: https://review.openstack.org/631243
    WIP: Add Cyborg device profile groups to spec obj.

Addressed by: https://review.openstack.org/631244
    WIP: Create and bind Cyborg ARQs.

Addressed by: https://review.openstack.org/631245
    WIP: Get resolved Cyborg ARQs and add PCI BDFs to VM's domain XML.

Addressed by: https://review.openstack.org/616239
    Calculate RequestGroup resource provider mapping

Addressed by: https://review.openstack.org/619528
    Fill the RequestGroup mapping during schedule

Gerrit topic: https://review.openstack.org/#/q/topic:bp/nova-cyborg-interaction

Gerrit topic: https://review.opendev.org/#/q/topic:bp/nova-cyborg-interaction

Addressed by: https://review.opendev.org/603955
    Nova Cyborg interaction specification.

Addressed by: https://review.opendev.org/631242
    ksa auth conf and client for cyborg access

Addressed by: https://review.opendev.org/631243
    WIP: Add Cyborg device profile groups to request spec.

Addressed by: https://review.opendev.org/631244
    WIP: Create and bind Cyborg ARQs.

Addressed by: https://review.opendev.org/631245
    WIP: Get resolved Cyborg ARQs and add PCI BDFs to VM's domain XML.

The blueprint is approved for the Train release. Some implementation details that came up in the spec review are still open, but they can be settled in code review or on the mailing list, e.g. http://lists.openstack.org/pipermail/openstack-discuss/2019-June/006979.html. -- mriedem 20190620

Addressed by: https://review.opendev.org/673733
    Define new exceptions related to device profiles and ARQs.

Addressed by: https://review.opendev.org/673734
    Refactor some methods for reuse by Cyborg code.

Addressed by: https://review.opendev.org/673735
    Delete ARQs for an instance when the instance is deleted.

Addressed by: https://review.opendev.org/674726
    Block unsupported instance operations with accelerators.

[efried 20190905] Deferring to Ussuri. This needs more time to bake against the Cyborg side, which has only recently merged, and it needs more review attention.

Addressed by: https://review.opendev.org/670999
    [WIP] add cyborg tempest job

Addressed by: https://review.opendev.org/682637
    Re-proposed Nova Cyborg interaction specification.

Addressed by: https://review.opendev.org/684151
    Updated Nova-Cyborg interaction spec.

[efried 20190927] Fast approved per http://specs.openstack.org/openstack/nova-specs/readme.html#previously-approved-specifications via https://review.opendev.org/682637
Note that there's a trivial update at https://review.opendev.org/684151 that's still open, but it doesn't represent any change in direction.

Gerrit topic: https://review.opendev.org/#/q/topic:nova-cyborg-interaction

Addressed by: https://review.opendev.org/692001
    Fix a bug in the sequence diagram.

Addressed by: https://review.opendev.org/692707
    Define Cyborg ARQ binding notification event.

Addressed by: https://review.opendev.org/694906
    Refactor to extract Placement helper functions for functional tests.

Addressed by: https://review.opendev.org/697940
    Enable hard reboot with accelerators.

Addressed by: https://review.opendev.org/698581
    Pass accelerator requests to each virt driver from compute manager.

Addressed by: https://review.opendev.org/699553
    Enable start/stop of instances with accelerators.

Addressed by: https://review.opendev.org/699554
    Enable and use COMPUTE_ACCELERATORS trait.

Addressed by: https://review.opendev.org/704227
    Bump compute rpcapi version and reduce Cyborg calls.

Addressed by: https://review.opendev.org/706083
    WIP: refactor: Do network & accel discovery near volumes

[efried 20200220] Agreed in the Nova meeting to Direction:Approve all Definition:Approved blueprints http://eavesdrop.openstack.org/meetings/nova/2020/nova.2020-02-20-14.00.log.html#l-131

Addressed by: https://review.opendev.org/710443
    [DNM] testing removal of cyborg client singleton

Addressed by: https://review.opendev.org/715326
    [WIP] cyborg evacuate support

Addressed by: https://review.opendev.org/716185
    Add release notes for Cyborg-Nova integration.

Addressed by: https://review.opendev.org/716186
    Delete ARQs by UUID if Cyborg ARQ bind fails.

[20200415] We are done with the Ussuri part of the feature. There will be a separate bp for Victoria.

Addressed by: https://review.opendev.org/744280
    [Trivial] Remove wrong format_message() conversion
