Asynchronous Container Operations
Currently, all of our container operations are synchronous and block both the API and the conductor until the Docker client returns. This is not ideal, especially for potentially long-running calls such as container create, where Docker/Swarm blocks while the image is pulled onto the node.
Along with certain calls being changed to casts, there are a few other potential changes:
* Container status: the status of a container should be tracked through operations so that users can see what is going on and so that the API can reject certain calls until the container is ready.
* Async faults: With asynchronous calls, exceptions that get raised will not bubble up through the API as calls won't be blocking. We will need a way to communicate failures to users that will work in an async model.
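The two points above can be illustrated with a minimal sketch. It is not Magnum code: the `Container` class, the status values `creating`, `ready` and `error`, the `fault` field, and both helper functions are hypothetical names chosen for illustration. The idea is that a background worker records the outcome of an operation on the container itself, and the API layer consults that record instead of waiting for an exception to bubble up.

```python
from dataclasses import dataclass
from typing import Callable, Optional


class ContainerNotReady(Exception):
    """Raised by the API layer when a call arrives before the container is ready."""


@dataclass
class Container:
    uuid: str
    status: str = "creating"      # hypothetical status names: creating, ready, error
    fault: Optional[str] = None   # description of the last asynchronous failure


def run_async_operation(container: Container,
                        operation: Callable[[], None],
                        success_status: str) -> None:
    """Run an operation on behalf of a cast and record its outcome.

    The caller is no longer waiting, so a failure is stored on the
    container instead of being raised back through the API.
    """
    try:
        operation()
    except Exception as exc:
        container.status = "error"
        container.fault = f"{type(exc).__name__}: {exc}"
    else:
        container.status = success_status
        container.fault = None


def require_ready(container: Container) -> None:
    """API-side guard: reject calls until the container is ready."""
    if container.status != "ready":
        raise ContainerNotReady(
            f"container {container.uuid} is in status {container.status!r}")
```

A user polling the container would then see either `ready` or `error` together with the stored fault text, which is one way to surface failures in an async model.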
Blueprint information
- Status: Complete
- Approver: Adrian Otto
- Priority: High
- Drafter: Andrew Melton
- Direction: Approved
- Assignee: Surojit Pathak
- Definition: Obsolete
- Series goal: Accepted for newton
- Implementation: Needs Code Review
- Milestone target: None
- Started by: Surojit Pathak
- Completed by: Adrian Otto
Whiteboard
Implement non-blocking async casts at API level, asynchronous behavior by default in python-
<-------@suro-patz, 12-28-2015
Here is the summary of the proposed design, after discussion in mailing list [http://
1. Magnum-conductor would have a pool of green threads for executing the container operations, viz. executor_
2. Every time, Magnum-
How often this scenario is hit may indicate to the operator that more Mcon workers should be created.
3. Blocking class of operations: some operations cannot be made async because they must return their result or content inline, e.g. 'container-logs'. [Phase0]
4. Out-of-order considerations for the NonBlocking class of operations: a race condition is possible when a create is followed by a start/delete of the same container, because the operations would run in parallel. To solve this, we will maintain a map from each container to the thread currently executing an operation on it. If we find a request for an operation for a container-
The approach above requires that operations for a given container on a given Bay go to the same Magnum-conductor instance. To achieve this, we will use modulo-hashing based on <bay-id, container-id>, so that operations for a given container land on the same conductor-
This mechanism can be further refined to achieve more asynchronous behavior. [Phase2]
5. The hand-off between Mcon and a thread from executor_threadpool can be reflected through new states on the 'container' object, viz. create-in-progress, delete-
These states can help with recovery and auditing after an Mcon restart, or even in sync_bay_status. [Phase1]
@suro-patz, 12-28-2015 -------->
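Points 1 and 4 of the note above can be sketched as follows. This is an illustration under stated assumptions, not the proposed implementation. `pick_conductor` and `ContainerOperationQueue` are hypothetical names, a standard-library thread pool stands in for the green-thread pool, and the ordering policy (a later operation waits for the earlier one on the same container) is only one possible reading, since the note's own description is cut off. The hash uses `hashlib` because Python's built-in `hash()` for strings is randomized per process and would not give the same answer on every API node.

```python
import hashlib
import threading
from concurrent.futures import ThreadPoolExecutor


def pick_conductor(bay_id: str, container_id: str, num_conductors: int) -> int:
    """Modulo-hash <bay-id, container-id> to a conductor index."""
    key = f"{bay_id}:{container_id}".encode("utf-8")
    digest = hashlib.sha256(key).digest()
    return int.from_bytes(digest[:8], "big") % num_conductors


class ContainerOperationQueue:
    """Runs operations in a pool while keeping per-container ordering."""

    def __init__(self, max_workers: int = 4):
        self._pool = ThreadPoolExecutor(max_workers=max_workers)
        self._lock = threading.Lock()
        # container_id -> future of the most recently submitted operation.
        # Entries are never pruned in this sketch.
        self._in_flight = {}

    def submit(self, container_id, operation, *args):
        # Submitting under the lock keeps pool order equal to map order.
        with self._lock:
            previous = self._in_flight.get(container_id)
            future = self._pool.submit(
                self._run_after, previous, operation, *args)
            self._in_flight[container_id] = future
        return future

    @staticmethod
    def _run_after(previous, operation, *args):
        if previous is not None:
            # Block until the earlier operation on this container is done.
            # Its failure, if any, is ignored here (an assumed policy).
            previous.exception()
        return operation(*args)
```

Because the pool starts tasks in submission order, a task's predecessor has always started before the task itself begins waiting, so the chain cannot deadlock on a busy pool. Operations on different containers still run in parallel.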
<-------@suro-patz, 1-21-2016
A few more updates from development:
- It is desirable to make the mode of operation (sync or async) controllable by a config knob. This will help land the code incrementally and will let others try it out and provide feedback. To keep both code paths available without duplication, we will use the futurist interface.
- For creating two classes of actions (sync/async), and to achieve https:/
- The futurist interface accepts task submissions even when no threads are available and processes them as threads become available. We will use this facility to absorb bursts of requests.
@suro-patz, 1-21-2016 -------->
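The config-knob idea in the note above can be sketched with a single executor interface serving both modes. To stay runnable without extra dependencies, this sketch uses the standard library's `concurrent.futures` as a stand-in. The futurist library provides executors with a compatible `submit()` interface, including a synchronous one, which is what makes a single code path possible. `InlineExecutor`, `make_executor`, `container_create` and the `async_enabled` flag are hypothetical names for illustration only.

```python
from concurrent.futures import Executor, Future, ThreadPoolExecutor


class InlineExecutor(Executor):
    """Runs each task in the caller's thread and returns a finished future.

    Stand-in for a synchronous executor; not part of the standard library.
    """

    def submit(self, fn, *args, **kwargs):
        future = Future()
        try:
            future.set_result(fn(*args, **kwargs))
        except Exception as exc:
            future.set_exception(exc)
        return future


def make_executor(async_enabled: bool, max_workers: int = 4) -> Executor:
    """Pick the executor from a config knob (hypothetical option)."""
    if async_enabled:
        return ThreadPoolExecutor(max_workers=max_workers)
    return InlineExecutor()


def container_create(executor: Executor, name: str) -> Future:
    """Same calling code for both modes: submit and get a future back."""
    return executor.submit(lambda: f"created {name}")
```

With the knob off, the returned future is already complete when `submit()` returns, so callers can behave exactly as before. With it on, submissions beyond `max_workers` are queued by the pool and run as workers free up, which is the burst-absorbing behavior described in the last bullet.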
Gerrit topic: https:/
Addressed by: https:/
[WIP]Magnum asynchronous container operation
Gerrit topic: https:/
Addressed by: https:/
Spec for asynchronous container operations