Enable health management customization for Senlin

Registered by Cindia-blue

The blueprint "support-health-management" implemented the check and recover actions. This blueprint focuses on how to expose those functions to Senlin users. The options identified so far are the REST/RPC API, Health_policy, and health_manager. We will map these options to different use cases and enable them accordingly.

Blueprint information

Status:
Complete
Approver:
Qiming Teng
Priority:
Medium
Drafter:
Cindia-blue
Direction:
Approved
Assignee:
Cindia-blue
Definition:
Approved
Series goal:
Proposed for mitaka
Implementation:
Implemented
Milestone target:
newton-3
Started by
Cindia-blue
Completed by
Qiming Teng

Whiteboard

Two types of actions form the foundation of the health management functions: check and recover.

There are three ways to trigger these actions:

(1) manual: expose the two actions through the REST API so that users can request
    them manually.
(2) semi-auto: add a listener to nova in health_manager. Once some nodes are dead,
    check actions are triggered to detect any inconsistency with the status stored
    in the Senlin DB -- the check actions update the DB automatically.
(3) automatic: in addition to auto-triggering the check actions, trigger the
    recover actions periodically to recover nodes in ERROR status.
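The semi-auto and automatic paths above can be illustrated with a minimal sketch. It is not Senlin's implementation: the Node class, the status constants, and the functions on_node_event and periodic_recover are hypothetical names chosen for illustration, and the "DB" is just an in-memory list.

```python
from dataclasses import dataclass

ACTIVE = "ACTIVE"
ERROR = "ERROR"


@dataclass
class Node:
    """Hypothetical stand-in for a node record stored in the Senlin DB."""
    name: str
    status: str = ACTIVE


def check(node, observed_status):
    """Reconcile the stored status with the observed one.

    Returns True if the stored status had to be updated.
    """
    if node.status != observed_status:
        node.status = observed_status
        return True
    return False


def recover(node):
    """Bring a node in ERROR status back to ACTIVE.

    Returns True if the node was recovered. Real recovery would rebuild
    or recreate the underlying resource; this sketch only flips the status.
    """
    if node.status == ERROR:
        node.status = ACTIVE
        return True
    return False


def on_node_event(nodes, observed):
    """Semi-auto path: an event from the compute service triggers check.

    `observed` maps node names to the status reported by the event; nodes
    that are not mentioned are assumed to be unchanged.
    """
    return [n.name for n in nodes if check(n, observed.get(n.name, n.status))]


def periodic_recover(nodes):
    """Automatic path: a timer triggers recover on nodes in ERROR status."""
    return [n.name for n in nodes if recover(n)]
```

In this sketch, an event reporting one node as ERROR makes on_node_event update only that node's stored status, and a later periodic_recover call returns it to ACTIVE. In the manual case the same check and recover functions would be called in response to a user request instead.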

If automation as in (2) or (3) above is expected, an admin should define a health
policy for the clusters that need special health management (auto-check and
auto-recover). If no policy is attached, manual triggering of actions is the only
choice.
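The rule in the paragraph above amounts to a small decision function, sketched below. The Trigger enum, the enabled_triggers function, and the "auto_check" / "auto_recover" keys are assumptions made for illustration; they are not the actual schema of the Senlin health policy.

```python
from enum import Enum


class Trigger(Enum):
    MANUAL = "manual"
    SEMI_AUTO = "semi-auto"
    AUTOMATIC = "automatic"


def enabled_triggers(health_policy=None):
    """Return the set of triggers available to a cluster.

    `health_policy` is a hypothetical dict with optional boolean keys
    "auto_check" and "auto_recover". None means no policy is attached.
    """
    # Manual triggering is always available.
    triggers = {Trigger.MANUAL}
    if health_policy is None:
        return triggers
    if health_policy.get("auto_check"):
        triggers.add(Trigger.SEMI_AUTO)
        # The automatic case builds on auto-check, so it needs both flags.
        if health_policy.get("auto_recover"):
            triggers.add(Trigger.AUTOMATIC)
    return triggers
```

Here auto-recover is only honoured together with auto-check, following the description of case (3) as an addition to the auto-triggered check actions.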

Based on discussion with all core members, we will target support for the manual and semi-auto cases in Mitaka-3. We will implement the automatic scenario once discussions with the product line and customers have settled.

Gerrit topic: https://review.openstack.org/#q,topic:bp/support-health-management-customization,n,z

Addressed by: https://review.openstack.org/273866
    Add Node Check and Recover into API

Addressed by: https://review.openstack.org/#/c/273848/
    Adding check/recover actions to cluster nodes

Addressed by: https://review.openstack.org/#/c/273524/
    Add node check and recover into CLI

Addressed by: https://review.openstack.org/280677
    Add cluster check and recover into API

Addressed by: https://review.openstack.org/281619
    Revise node check and recover parameters

Addressed by: https://review.openstack.org/280735
    Adding Check/Recover Actions to Clusters in SDK

Addressed by: https://review.openstack.org/282299
    Enable Check and Conditional Recover in Health Manager

Addressed by: https://review.openstack.org/284707
    Add Registry Table for Health Management

Addressed by: https://review.openstack.org/285593
    Revise Health Policy for Health Management

Addressed by: https://review.openstack.org/292189
    Add cluster check and recover into CLI
