Healthchecks for OpenStack services

Registered by Michal Dulko

Kubernetes (k8s) provides a pod healthchecking mechanism called livenessProbe.
We can use it to let k8s restart a service's pods when we detect that a pod may
have problems connecting to supporting services (MySQL, RabbitMQ), appears as
down in the service list, or is reported unhealthy by a service-specific
healthchecking mechanism.

An initial idea is to create a generic Python script that checks the connection
to the database and RabbitMQ. In addition, each service may provide its own
healthchecking script that runs a service-specific command (e.g. `nova
service-list` for Nova, or `keystone-manage doctor` to validate the Keystone
configuration).

As healthchecks can potentially be dangerous for some kinds of deployments,
they should be disabled by default, with the option to enable them on a
per-service basis.
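
One way the per-service switch could work, sketched under assumptions: the
helper `liveness_probe` and the `enabled_services` set are hypothetical, and
the timing values are placeholders. The keys of the returned dictionary are
standard Kubernetes livenessProbe fields.

```python
def liveness_probe(service, enabled_services=frozenset(), command=(),
                   period_seconds=30):
    """Return a livenessProbe fragment for the service, or None when disabled.

    The default for enabled_services is empty, so no service gets a probe
    unless the deployer opts in explicitly.
    """
    if service not in enabled_services:
        return None
    return {
        "exec": {"command": list(command)},
        # Placeholder timings; suitable values depend on the service.
        "initialDelaySeconds": period_seconds,
        "periodSeconds": period_seconds,
        "failureThreshold": 3,
    }
```

A caller would add the fragment to the container spec only when the result is
not `None`, for example `liveness_probe("nova", {"nova"}, ["nova",
"service-list"])`.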

This blueprint will remain pending until an appropriate specification is
drafted and agreed upon.

Blueprint information

Status: Not started
Approver: None
Priority: Medium
Drafter: Michal Dulko
Direction: Needs approval
Assignee: None
Definition: Pending Approval
Series goal: None
Implementation: Unknown
Milestone target: None

Whiteboard

Gerrit topic: https://review.openstack.org/#q,topic:bp/healthchecks,n,z

Addressed by: https://review.openstack.org/459591
    Initial generic healthcheck script

Addressed by: https://review.openstack.org/459592
    Liveness probe for Nova

Addressed by: https://review.openstack.org/468894
    Liveness probe for Glance

Addressed by: https://review.openstack.org/469123
    Liveness probe for Keystone

Addressed by: https://review.openstack.org/473886
    Liveness probe for Heat

Addressed by: https://review.openstack.org/474136
    Liveness probe for Cinder

Addressed by: https://review.openstack.org/475369
    Liveness probe for Neutron
