Task & flow attributes

Registered by Joshua Harlow

This blueprint has been superseded. See the newer blueprint "Attributes of backend activities" for updated plans.

In order to make it easy to construct flows out of a given set of tasks, it would be nice to have a taskflow-provided, searchable 'registry' of tasks/flows matching certain attributes. This will allow those tasks/flows to be combined into a larger flow by an external entity without the external entity having to know exactly what those tasks/flows are (or how to compose/create them). This is useful when a flow (or set of composed flows) is being created out of arbitrary driver-provided tasks by an entity (a manager) that is unaware of the internals of that driver.
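A minimal sketch of what such a searchable registry could look like. This is not an existing taskflow API; the class name AttributeRegistry, its register/find methods, and the attribute names are all hypothetical and chosen only for illustration.

```python
class AttributeRegistry:
    """Hypothetical registry mapping sets of attributes to task/flow factories."""

    def __init__(self):
        # Each entry is (frozenset of attributes, factory callable).
        self._entries = []

    def register(self, factory, attributes):
        """Record that `factory` builds a task/flow providing `attributes`."""
        self._entries.append((frozenset(attributes), factory))

    def find(self, wanted):
        """Return factories whose attributes include every wanted attribute."""
        wanted = frozenset(wanted)
        return [factory for attrs, factory in self._entries if wanted <= attrs]


registry = AttributeRegistry()
# Two drivers register factories; the strings stand in for real task objects.
registry.register(lambda: "driver-a snapshot task", {"snapshot", "stateless"})
registry.register(lambda: "driver-b snapshot task", {"snapshot"})

# The caller searches by attributes only and never names a driver.
matches = registry.find({"snapshot", "stateless"})
built = [factory() for factory in matches]
```

Storing factories rather than task instances is one possible design choice: it lets the caller build a fresh task each time a flow is composed.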

For example:

The compute manager would ask this registry for tasks/flows which implement the 'XYZ' attribute (or set of attributes), and the registry can either throw 'NotFound' or give back the tasks/flows that have been previously registered to provide such an attribute. The compute manager can then integrate those tasks/flows into its own flow (which may involve other tasks/flows with other attributes) to form an 'action' (such as create_instance or create_snapshot). This decoupling allows the compute manager to be 'unaware' of how those tasks/flows are created and to care only about the protocol (the set of attributes the manager requests to be implemented).
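The manager side of that exchange might be sketched as follows. Everything here is an assumption made for illustration: the REGISTRY dict, the lookup helper, the attribute names 'allocate-network' and 'spawn', and the manager's own steps. Tasks are modelled as plain callables that append to a log, not as real taskflow tasks.

```python
class NotFound(Exception):
    """Raised when no registered task provides a requested attribute."""


# Hypothetical registry contents: attribute name -> tasks registered by a driver.
REGISTRY = {
    "allocate-network": [lambda log: log.append("driver: allocate network")],
    "spawn": [lambda log: log.append("driver: spawn instance")],
}


def lookup(attribute):
    """Return the tasks registered for an attribute, or raise NotFound."""
    try:
        return REGISTRY[attribute]
    except KeyError:
        raise NotFound(attribute) from None


def build_create_instance_flow():
    """Compose the action; the manager names attributes, never driver tasks."""
    flow = [lambda log: log.append("manager: reserve quota")]
    for attribute in ("allocate-network", "spawn"):
        flow.extend(lookup(attribute))
    flow.append(lambda log: log.append("manager: mark instance active"))
    return flow


def run(flow):
    """Run the tasks in order and return the log they produced."""
    log = []
    for task in flow:
        task(log)
    return log


result = run(build_create_instance_flow())
```

Asking for an attribute that nobody registered, for example lookup("create_snapshot") with the contents above, raises NotFound, which the manager can treat as "this driver does not support the action".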

Blueprint information

Status: Complete
Approver: Caitlin Bestler
Priority: Undefined
Drafter: Joshua Harlow
Direction: Needs approval
Assignee: None
Definition: Superseded
Series goal: None
Implementation: Unknown
Milestone target: None
Completed by: Joshua Harlow

Whiteboard

Caitlin:

This would not be the optimum interface from the backend perspective, but if it makes more sense from the front end it is workable.

Let me take the primary example, replicating content of a Cinder volume or snapshot to another backend or to an Object Server backup.

Backends can differ here on the following fronts:
* Some backends can do these operations in a Stateless fashion. That is, the operation does not
   require the state of the Volume itself to be changed. This means that they can make a snapshot,
   or some equivalent, and perform the operation on that copy before deleting it.
* Some backends are at wire length from the cinder volume driver, others have the cinder volume
   driver running on the same machine.

The current Cinder backup logic normally presumes that a backup (or similar) operation is only safe when the Volume is quiescent. There is an override option, which will allow the operation even while the Volume is active. Exercising that option makes sense when the backend is Stateless (as described above). But right now we are relying on the user to determine when it is safe to rely on stateless operation. There is at most a mild correlation between the backend's backup capabilities and how the user will set this flag. There are bound to be many false negatives and false positives.

The current Cinder implementation also pulls the Volume content to the machine where the Cinder Volume Driver is running. It is then compressed and pushed to the Object Server. That is the correct strategy only when the Cinder volume is co-located with the Cinder Volume Driver. When the Cinder volume is separate, you want to tell the Cinder target to put the volume to the Object Server (optionally compressing it if it can).
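To make the two distinctions concrete, here is a rough sketch in which the backend declares its traits and the decision is derived from them rather than from a user-set flag. The BackendTraits fields, the function names, and the step descriptions are hypothetical; none of this is existing Cinder code.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BackendTraits:
    """Hypothetical description of what a backend declares about itself."""
    stateless: bool             # can work from a temporary copy, volume state untouched
    colocated: bool             # volume driver runs on the same machine as the volume
    can_compress: bool = False  # remote target is able to compress before sending


def backup_allowed(traits, volume_in_use):
    """An active volume is only safe to back up on a stateless backend."""
    return traits.stateless or not volume_in_use


def plan_backup(traits):
    """Choose the data path based on where the volume driver runs."""
    if traits.colocated:
        # Driver is next to the data: pull, compress, and push locally.
        return ["read volume locally", "compress", "push to object server"]
    # Driver is at wire length: have the target send the data itself.
    step = "ask target to put volume to object server"
    if traits.can_compress:
        step += " (compressed)"
    return [step]


local = BackendTraits(stateless=True, colocated=True)
remote = BackendTraits(stateless=False, colocated=False, can_compress=True)
```

With these assumed traits, plan_backup(local) yields the pull, compress, and push path, while plan_backup(remote) yields a single step delegated to the target.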

Each of these options can be represented by the API that Joshua proposes. But it feels unnatural because it loses the very relevant information that these are *options* and that each backend is supposed to offer at least one method of performing this task.

The API as proposed also encourages each backend vendor to fully implement each service. I would prefer each backend to only implement very specific pieces and rely on common code to express common algorithms. Even if the common logic is only common to one third of the backends, that is still a lot of code savings. End users will benefit as enhancements and bug fixes only have to be done once, not once per vendor.
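One way to picture "common code plus very specific backend pieces" is a shared algorithm with small overridable hooks. The sketch below is only an illustration of that idea; the class names, hook names, and the snapshot naming are hypothetical.

```python
class BackupBackend:
    """Hypothetical base class: the algorithm is common, the hooks are per backend."""

    def backup(self, volume):
        # Common algorithm, written (and fixed) once for every backend sharing it.
        source = self.make_consistent_source(volume)
        try:
            return self.transfer(source)
        finally:
            self.release_source(source)

    # Backend-specific pieces. The defaults operate on the volume directly.
    def make_consistent_source(self, volume):
        return volume

    def release_source(self, source):
        pass

    def transfer(self, source):
        raise NotImplementedError


class SnapshotBackend(BackupBackend):
    """A stateless backend: it only supplies the pieces that differ."""

    def __init__(self):
        self.events = []

    def make_consistent_source(self, volume):
        self.events.append("snapshot created")
        return volume + "@snap"

    def release_source(self, source):
        self.events.append("snapshot deleted")

    def transfer(self, source):
        self.events.append("sent " + source)
        return "backup-of-" + source


backend = SnapshotBackend()
result = backend.backup("vol1")
```

A fix to the ordering or error handling in backup() would then reach every backend that inherits it, which is the code-saving argument made above.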

JH:

So what do you think would be the right way to do this in taskflow without making taskflow cinder-specific? I was thinking that a generic 'capabilities' approach would allow cinder drivers to declare what they are capable of; the cinder 'manager' would then be the one responsible for querying those capabilities and forming the overall 'flow' that would do the 'backup'. To me it would seem like taskflow would offer a type of registry/capability lookup 'thing' and it would be up to cinder to use that 'thing' to determine what to do.

For example:

1. Manager queries taskflow registry for capabilities like ['fast-backup', 'slow-backup', 'live-backup']
2. Manager then forms the corresponding flow for the capabilities of the driver that is active/responded/found by inserting that driver's registered 'tasks' into the overall 'backup' flow. This allows the manager to implement the 'larger' backup operation while letting the cinder driver influence how the cinder 'manager' does the larger operation.
3. Manager runs larger operation (incorporating tasks from driver).
4. Profit.

(?)
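The steps above could be sketched roughly like this. The capability names come from the list in step 1; treating that list as a preference order is an assumption, as are the function names and the manager's own steps. Tasks are represented as plain strings to keep the sketch short.

```python
# Capabilities the manager asks about, in an assumed order of preference.
PREFERENCE = ["fast-backup", "slow-backup", "live-backup"]


def pick_capability(driver_tasks, preference=PREFERENCE):
    """Step 1: find the first capability the active driver has registered."""
    for capability in preference:
        if capability in driver_tasks:
            return capability
    raise LookupError("driver offers none of: " + ", ".join(preference))


def build_backup_flow(driver_tasks):
    """Step 2: insert the driver's registered tasks into the manager's flow."""
    capability = pick_capability(driver_tasks)
    return (
        ["manager: prepare backup"]
        + list(driver_tasks[capability])
        + ["manager: record backup"]
    )


# Hypothetical registrations from one driver: capability -> its tasks.
driver_tasks = {
    "slow-backup": ["driver: copy blocks"],
    "live-backup": ["driver: snapshot", "driver: copy snapshot"],
}

# Step 3 would run this flow; here it is only built.
flow = build_backup_flow(driver_tasks)
```

The manager still owns the shape of the larger operation; the driver only influences the part in the middle.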
