Add horizontal scalability to Magnum conductors

Registered by Steven Dake

Copy Ironic's hashing architecture for horizontal scalability. The idea is for the API or conductor services to be horizontally scaled. The initial implementation only supports one conductor.

Blueprint information

Status:
Complete
Approver:
Adrian Otto
Priority:
High
Drafter:
Steven Dake
Direction:
Approved
Assignee:
Hua Wang
Definition:
Approved
Series goal:
Accepted for kilo
Implementation:
Implemented
Milestone target:
k3
Started by
hongbin
Completed by
Hua Wang

Related branches

Sprints

Whiteboard

Please add a T-Shirt size estimate for implementation of this feature (S, M, L, XL)
Large+

I think implementing this BP consists of two steps:
1. Lock bay (potentially other resources) to prevent race conditions due to multiple conductors.
2. Map bay to conductor.
For #1, it looks like we can port the related commits from Heat (https://blueprints.launchpad.net/heat/+spec/multiple-engines).
For #2, I don't think we have to use Ironic's hashing architecture. After studying their design notes (https://etherpad.openstack.org/p/IronicConsistentHashingForInstances), BP and commits, it looks like Ironic uses consistent hashing to satisfy Ironic-specific requirements. In Magnum, I don't think we have to consistently hash a bay to a conductor. For example, bay A can be created by conductor A and updated by conductor B (in Ironic, this is not desirable due to driver restrictions and performance considerations). Thoughts?
-- Hongbin
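For readers unfamiliar with the technique being weighed here, the following is a minimal sketch of a consistent hash ring that pins each bay to one conductor. It is an illustration only, not Ironic's or Magnum's code: the class name HashRing, the replicas parameter, and the conductor and bay names are all hypothetical.

```python
import bisect
import hashlib


class HashRing:
    """Toy consistent hash ring (illustrative; not Ironic's implementation)."""

    def __init__(self, conductors, replicas=64):
        # Place several virtual points per conductor on the ring so that
        # bays spread more evenly across conductors.
        self._ring = {}
        for host in conductors:
            for i in range(replicas):
                self._ring[self._hash("%s-%d" % (host, i))] = host
        self._keys = sorted(self._ring)

    @staticmethod
    def _hash(value):
        return int(hashlib.sha256(value.encode("utf-8")).hexdigest(), 16)

    def get_conductor(self, bay_uuid):
        # The owner is the first ring point at or after the bay's hash,
        # wrapping around to the start of the ring.
        idx = bisect.bisect(self._keys, self._hash(bay_uuid)) % len(self._keys)
        return self._ring[self._keys[idx]]
```

The property that makes the scheme attractive is that adding a conductor only moves bays onto the new conductor; no bay is shuffled between the existing ones. The cost is that every conductor has to keep an up-to-date view of ring membership, which is the maintenance burden discussed in this thread.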

I dislike Heat's locking mechanism but it seems reliable and works. I studied it in great detail when I did the oslo.messaging port, and found it suffers from split brain scenarios (which a consistent hash does not). That said, pragmatically a consistent hash is more difficult to maintain and our conductor doesn't maintain state (the database maintains all the state), so the approach seems workable and preferable to me. --sdake
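As a rough illustration of the lock-per-resource idea discussed above, here is a minimal sketch in which a lock row records which conductor holds a bay and a lock can be stolen when its holder is judged dead. It is loosely modelled on the approach described in this thread, not copied from Heat or Magnum: BayLockTable, acquire, and the is_alive callback are hypothetical names, and an in-memory dict guarded by a mutex stands in for the database.

```python
import threading


class BayLockTable:
    """In-memory stand-in for a database lock table (bay_uuid -> conductor_id)."""

    def __init__(self):
        self._rows = {}
        self._mutex = threading.Lock()  # emulates the atomicity of a DB transaction

    def create(self, bay_uuid, conductor_id):
        """Try to insert a lock row; return None on success, else the holder."""
        with self._mutex:
            holder = self._rows.get(bay_uuid)
            if holder is None:
                self._rows[bay_uuid] = conductor_id
            return holder

    def steal(self, bay_uuid, old_id, new_id):
        """Swap the holder only if it is still old_id; return True if swapped."""
        with self._mutex:
            if self._rows.get(bay_uuid) != old_id:
                return False
            self._rows[bay_uuid] = new_id
            return True

    def release(self, bay_uuid, conductor_id):
        """Delete the lock row only if conductor_id currently holds it."""
        with self._mutex:
            if self._rows.get(bay_uuid) == conductor_id:
                del self._rows[bay_uuid]
                return True
            return False


def acquire(table, bay_uuid, conductor_id, is_alive):
    """Take the bay lock, stealing it if the current holder looks dead."""
    holder = table.create(bay_uuid, conductor_id)
    if holder is None or holder == conductor_id:
        return True
    if is_alive(holder):
        return False
    return table.steal(bay_uuid, holder, conductor_id)
```

In this sketch the liveness check is the weak point: if is_alive wrongly reports a live conductor as dead, two conductors can end up acting on the same bay, which is one way a split-brain scenario could arise.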

Gerrit topic: https://review.openstack.org/#q,topic:bp/horizontal-scale,n,z

Addressed by: https://review.openstack.org/171921
    [WIP] Support multiple conductors for scaling

Addressed by: https://review.openstack.org/172603
    Objects changes for horizontal-scale support

Addressed by: https://review.openstack.org/172772
    Implement listener API for conductor horizontal-scale

Addressed by: https://review.openstack.org/172773
    Implement baylock in conductor for horizontal-scale

Addressed by: https://review.openstack.org/172774
    [WIP] Utilize baylock for conductor horizontal-scale

Thanks for the good work getting things started. I've allocated the week of the 20th-24th to finish this work, but I think the scope is larger than I can complete by the 25th. I am going to go for simple but working and copy Heat's horizontal scale model, which I can see has been mostly done in this review patch set. I may be looking at finishing the job on this blueprint during the RCs if that is acceptable to the team. Hongbin, can you suggest what further work is needed beyond the utilize baylock patch (the last one in the stream)? --sdake

Etherpad: https://etherpad.openstack.org/p/liberty-work-magnum-horizontal-scale

As discussed at the Vancouver Summit, we are going to drop the bay lock implementation. Instead, each conductor will call Heat concurrently and rely on Heat for concurrency control. However, I think we need an approach for state convergence from Heat to Magnum. Either a periodic task [1] or Heat notifications [2] looks like a candidate.

[1] https://blueprints.launchpad.net/magnum/+spec/add-periodic-task
[2] http://lists.openstack.org/pipermail/openstack-dev/2015-March/058898.html
--hongbin

When I thought about this previously, I viewed polling and message-driven status updates as two mutually exclusive choices. Upon further consideration of the edge cases, I think both approaches should work together. So rather than choosing number 1 or number 2 above, we can do both. First do number 1, and follow that up with number 2 as an optimization so that polling can be done less frequently and state transitions happen more quickly, while we can still converge on the correct state after failure scenarios. --adrian_otto
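A minimal sketch of how the two paths could share one idempotent update, so that a missed notification is repaired by the next poll. This is an assumption-laden illustration, not Magnum's implementation: the function names, the dict-shaped bay records, and the notification payload keys (stack_id, status) are all hypothetical.

```python
def apply_stack_status(bay, stack_status):
    """Idempotently copy a Heat stack status onto a bay record (a dict here)."""
    if bay["status"] == stack_status:
        return False
    bay["status"] = stack_status
    return True


def periodic_sync(bays, get_stack_status):
    """Polling path: reconcile every bay against the status Heat reports."""
    changed = []
    for bay in bays:
        if apply_stack_status(bay, get_stack_status(bay["stack_id"])):
            changed.append(bay["uuid"])
    return changed


def on_stack_notification(bays_by_stack_id, payload):
    """Push path: apply a single status event as soon as it arrives."""
    bay = bays_by_stack_id.get(payload["stack_id"])
    if bay is None:
        return False
    return apply_stack_status(bay, payload["status"])
```

Because both paths go through the same idempotent function, it does not matter which one observes a state change first, and running the periodic task on several conductors at once yields the same final bay status.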

Addressed by: https://review.openstack.org/210957
    Make simultaneous bay deletion workable

Addressed by: https://review.openstack.org/211004
    Make periodic tasks can run simultaneously

Addressed by: https://review.openstack.org/211024
    Make periodic tasks can run simultaneously

Addressed by: https://review.openstack.org/212922
    Fix race condition in bay_update
