Convergence - robustness and scale by default

Registered by Robert Collins

Clouds are noisy - servers fail to come up, or die when the underlying hypervisor crashes or suffers a power failure.

Large stacks exceed what a single heat-engine process can update and manage efficiently.

Both of these issues can be addressed by three interlocking changes:

 - move from in-process polling of resource state to an observe-and-notify approach
 - move from a call-stack implementation to a continual-convergence implementation, triggered by change notifications
 - run each individual convergence step with taskflow, across a distributed set of workers
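
The three changes above can be illustrated with a minimal sketch. It is not Heat's implementation and does not use taskflow: a standard-library queue stands in for the notification bus, threads stand in for distributed workers, and the names `desired`, `observed`, `converge_step`, `worker` and `run` are all hypothetical. The point is only that each step compares desired state with observed state for one resource and runs independently of any call stack.

```python
import queue
import threading

# Hypothetical in-memory stand-ins for persisted stack state.
desired = {"server-1": "ACTIVE", "server-2": "ACTIVE"}
observed = {"server-1": "ACTIVE", "server-2": "ERROR"}
lock = threading.Lock()

# Stand-in for the notification bus: each item names a resource that changed.
notifications = queue.Queue()


def converge_step(name):
    """Run one independent convergence step for a single resource.

    Returns True if an action was taken, False if already converged.
    """
    with lock:
        if observed.get(name) == desired.get(name):
            return False
        # Placeholder for a real create/update/replace call.
        observed[name] = desired[name]
        return True


def worker():
    """Consume change notifications until a None sentinel arrives."""
    while True:
        name = notifications.get()
        try:
            if name is None:
                return
            converge_step(name)
        finally:
            notifications.task_done()


def run(changed, num_workers=2):
    """Notify workers about changed resources and wait for them to finish."""
    threads = [threading.Thread(target=worker) for _ in range(num_workers)]
    for t in threads:
        t.start()
    for name in changed:
        notifications.put(name)
    for _ in threads:
        notifications.put(None)  # one shutdown sentinel per worker
    for t in threads:
        t.join()
```

Calling `run(["server-1", "server-2"])` leaves `observed` equal to `desired`. Because a step is a no-op when the resource is already converged, a duplicated or late notification is harmless, which is what makes notification-triggered convergence tolerant of a noisy cloud.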

Blueprint information

Status: Complete
Approver: None
Priority: High
Drafter: Clint Byrum
Direction: Approved
Assignee: None
Definition: Superseded
Series goal: None
Implementation: Unknown
Milestone target: None
Completed by: Angus Salkeld

Whiteboard

(therve) It doesn't seem that it's a blueprint. Maybe the introduction to 3 other blueprints? It may be better as mail in the list or something.

Gerrit topic: https://review.openstack.org/#q,topic:spec/convergence,n,z

Addressed by: https://review.openstack.org/95907
    Convergence Specification

Gerrit topic: https://review.openstack.org/#q,topic:bp/convergence,n,z

Addressed by: https://review.openstack.org/106054
    Added UUID to stack table and int id as primary

Addressed by: https://review.openstack.org/109012
    Database model and apis for convergence

Gerrit topic: https://review.openstack.org/#q,topic:convergence-poc,n,z

Gerrit topic: https://review.openstack.org/#q,topic:convergence,n,z

Addressed by: https://review.openstack.org/152301
    Add a config option to enable Convergence

Addressed by: https://review.openstack.org/152302
    Convergence Database schema changes

Addressed by: https://review.openstack.org/152303
    Push instead of pull resource input data

Addressed by: https://review.openstack.org/152305
    Convergence base workflow

Addressed by: https://review.openstack.org/152304
    Convergence message bus

Addressed by: https://review.openstack.org/152307
    Convergence simulator test scenarios

Addressed by: https://review.openstack.org/152306
    Convergence special cases

Addressed by: https://review.openstack.org/156693
    Add extra columns for resource table

Gerrit topic: https://review.openstack.org/#q,topic:tripleo/heat/convergence,n,z

Addressed by: https://review.openstack.org/418583
    DNM: TripleO/Heat convergence testing
