Allow drivers to expose API functionality without a Node UUID

Registered by Russell Haering

Ironic allows drivers to expose custom API functionality on a per-node basis with a POST to URLs under /nodes/<node_uuid>/vendor_passthru/. In addition to exposing vendor-specific functionality to consumers, this interface is used to support message passing from deploy agents to drivers running in the conductor. However, there is no mechanism through which a driver may expose top-level functionality which is not specific to a Node. This functionality is needed to enable agent functionality where the agent doesn't know the ID of the node, such as automated hardware registration and lookup of node IDs.

In order to enable these use cases, Ironic should allow drivers to expose top-level functionality by allowing POSTs to /drivers/<driver_name>/vendor_passthru/<method_name>.
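
As a rough illustration of the proposed URL layout, the sketch below builds (but does not send) such a request using Python's standard library. The base URL, the driver name "agent_example", the method name "lookup", and the payload fields are hypothetical placeholders and are not defined by this blueprint.

```python
import json
import urllib.request


def build_driver_passthru_request(api_base, driver_name, method_name, payload):
    """Build a POST to /drivers/<driver_name>/vendor_passthru/<method_name>."""
    url = "%s/drivers/%s/vendor_passthru/%s" % (
        api_base.rstrip("/"), driver_name, method_name)
    # The JSON document is what the API would parse and hand to the driver.
    body = json.dumps(payload).encode("utf-8")
    return urllib.request.Request(
        url, data=body, method="POST",
        headers={"Content-Type": "application/json"})


# Hypothetical values, for illustration only.
request = build_driver_passthru_request(
    "http://ironic.example.com/v1", "agent_example", "lookup",
    {"mac_addresses": ["52:54:00:12:34:56"]})
```

Note that no node UUID appears anywhere in the URL, which is the point of the proposal: the agent only needs to know which driver to talk to.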

The mechanism proposed in this blueprint is explicitly intended only to enable message passing from a deploy agent to a driver. Although this mechanism _could_ be leveraged by drivers to expose top-level consumer-facing functionality, such changes are explicitly not approved by this blueprint; we should require a separate discussion before accepting such a change.

Upon receiving a "driver vendor passthru" request, the API should randomly select a conductor with support for the specified driver, and call a "driver_vendor_passthru" RPC method on it, passing the name of the driver, the "method_name" path segment from the URL, and a parsed JSON document supplied by the user to the call.
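
The API-side routing described above might look roughly like the following sketch. It is not Ironic's implementation: the function names, the "NoValidHost" exception, the dict-based conductor registry, and the argument names inside the RPC message are all assumptions made for illustration.

```python
import random


class NoValidHost(Exception):
    """Hypothetical error: no conductor supports the requested driver."""


def select_conductor(driver_name, conductors_by_driver, rng=random):
    """Pick one conductor at random from those supporting driver_name."""
    hosts = sorted(conductors_by_driver.get(driver_name, ()))
    if not hosts:
        raise NoValidHost(driver_name)
    return rng.choice(hosts)


def build_rpc_message(driver_name, method_name, info):
    """Describe the "driver_vendor_passthru" RPC call as a plain dict.

    The three arguments mirror what the blueprint says is passed: the
    driver name, the method_name URL segment, and the parsed JSON body.
    The key names used here are assumptions.
    """
    return {
        "method": "driver_vendor_passthru",
        "args": {
            "driver_name": driver_name,
            "driver_method": method_name,
            "info": info,
        },
    }
```

Random selection is sufficient here because the call is not tied to any node, so any conductor that has the driver loaded can answer it.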

Upon receipt of a "driver_vendor_passthru" RPC call, the conductor should load the specified driver, and call "driver_vendor_passthru" with the specified method and parsed JSON document on its vendor interface. If the driver does not specify a vendor interface, an UnsupportedDriverExtension exception should be raised.
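
A minimal sketch of the conductor-side dispatch follows, assuming drivers are held in a simple dict and expose their vendor interface as a "vendor" attribute. The class names other than UnsupportedDriverExtension are hypothetical, and the exact call signature on the vendor interface is an assumption.

```python
class UnsupportedDriverExtension(Exception):
    """Stand-in for the exception named in this blueprint."""


class ConductorSketch:
    """Hypothetical conductor holding already-loaded drivers by name."""

    def __init__(self, drivers):
        self._drivers = drivers

    def driver_vendor_passthru(self, driver_name, method_name, info):
        driver = self._drivers[driver_name]
        vendor = getattr(driver, "vendor", None)
        if vendor is None:
            # The driver does not specify a vendor interface.
            raise UnsupportedDriverExtension(
                "driver %s has no vendor interface" % driver_name)
        # The return value travels back to the API caller.
        return vendor.driver_vendor_passthru(method_name, info)


class EchoVendor:
    """Hypothetical vendor interface that echoes what it receives."""

    def driver_vendor_passthru(self, method_name, info):
        return {"method": method_name, "received": info}


class DriverWithVendor:
    vendor = EchoVendor()


class DriverWithoutVendor:
    vendor = None
```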

In order to not break existing drivers, the VendorInterface class should be extended with a default "driver_vendor_passthru" method which raises an UnsupportedDriverExtension. This method should be overridden on the MixinVendorInterface class to select from a second driver vendor passthru mapping which can optionally be passed to the constructor.
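
The class changes described above can be sketched as follows. These are simplified stand-ins rather than Ironic's actual classes: the constructor argument names, the method signature, and the choice to raise UnsupportedDriverExtension for an unmapped method are assumptions.

```python
class UnsupportedDriverExtension(Exception):
    """Stand-in for the exception named in this blueprint."""


class VendorInterface:
    """Simplified base class; existing subclasses inherit the default."""

    def driver_vendor_passthru(self, method, info):
        raise UnsupportedDriverExtension(
            "driver-level vendor passthru is not supported")


class MixinVendorInterface(VendorInterface):
    """Routes each passthru method name to the interface that handles it."""

    def __init__(self, mapping, driver_passthru_mapping=None):
        # Existing per-node mapping, kept as it was.
        self.mapping = mapping
        # Second, optional mapping used only for driver-level calls.
        self.driver_level_mapping = driver_passthru_mapping or {}

    def driver_vendor_passthru(self, method, info):
        try:
            handler = self.driver_level_mapping[method]
        except KeyError:
            # Assumption: an unmapped method is treated as unsupported.
            raise UnsupportedDriverExtension(
                "no driver-level handler for %s" % method)
        return handler.driver_vendor_passthru(method, info)


class LookupVendor(VendorInterface):
    """Hypothetical vendor interface implementing one driver-level method."""

    def driver_vendor_passthru(self, method, info):
        return {"handled_by": "lookup", "info": info}
```

Because the second mapping defaults to empty, a MixinVendorInterface constructed with only the per-node mapping keeps behaving like a driver with no driver-level methods.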

In contrast to the existing "vendor_passthru" interface, "driver_vendor_passthru" frequently needs to return data to the caller (for example, in the case of an agent attempting to look up what node it belongs to). API calls therefore must block until the call returns.

Blueprint information

Status:
Complete
Approver:
aeva black
Priority:
High
Drafter:
Russell Haering
Direction:
Approved
Assignee:
None
Definition:
Review
Series goal:
Accepted for juno
Implementation:
Implemented
Milestone target:
None
Started by
aeva black
Completed by
aeva black

Related branches

Sprints

Whiteboard

Gerrit topic: https://review.openstack.org/#q,topic:bp/nodeless-vendor-passthru,n,z

Addressed by: https://review.openstack.org/81919
    Drivers may expose a top-level passthru API

---------------------

This blueprint seems to assume that all operations will take less time than the HTTP and RPC timeouts, since it does not address any other mechanism for drivers to provide a response. This does seem in line with the statement that this blueprint only addresses agent-callback mechanisms, and is not intended to expose any user-facing functionality.

Can you take a guess at how a user-facing driver API could be exposed, which does not hinge upon this requirement? I would like to be sure your proposed implementation does not prevent a user-facing driver API.

--Devananda

---------------------

Right, driver vendor passthru methods should generally not do anything that could reasonably be expected to take a long time. A few database calls should be OK, but making a ton of IPMI calls likely wouldn't be.

A driver could fork off anything expected to take a long time to run in the background, with a few limitations:

1. They'll need a plan for if the conductor dies after they return to the caller, but before the work completes

2. If they need to return any sort of results to the caller, they'll need to either use a callback mechanism, or have a place to store those results.

Short of building a generic job execution mechanism, these are probably questions that would need to be worked out on a case-by-case basis.

User-facing vendor APIs should just be able to go ahead and use this mechanism, although most of the ideas I've heard for those sorts of APIs seem like they would be better handled by a re-thought concept of a "chassis". For example, if a cabinet needed some sort of active discovery in order to come online, and this discovery was expected to take more than a few seconds, one could:

1. Implement this as a driver vendor passthru method, but fork off work to happen in the background. This isn't great, because unlike vendor passthru on a node, there isn't a reasonable place to store state. It doesn't make sense to store it on the driver, for example.

2. Implement this using a new Chassis-oriented interface. For example, create a Chassis with appropriate discovery parameters, and a conductor will periodically poll the Chassis for changes. This gives you a reasonable location to store state (on the chassis), as well as ensuring continued execution should something fail.

3. Add a new generic job execution mechanism. I don't really think this is a good idea.

In conclusion, I don't think this should ever get in the way of something consumer-facing, even if it required long running operations. It just might not be the best spot for it.

- Russell


Work Items
