Healthchecks for OpenStack services
Kubernetes (k8s) provides a pod healthchecking mechanism called livenessProbe.
We can use it to have k8s restart a service's pods when we detect that a pod
may have problems connecting to its supporting services (MySQL, RabbitMQ),
appears as down in the service list, or is reported unhealthy by a
service-specific healthchecking mechanism.
An initial idea is to create a generic Python script that checks the connection
to the database and RabbitMQ. In addition, every service may provide its own
healthchecking script that runs a service-specific command (e.g.
`nova service-list` for the Nova service, or `keystone-manage doctor` to
validate the Keystone config).
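The idea above could be sketched roughly as follows. This is only an illustration, not the script proposed in the related reviews: it assumes that a plain TCP connection is an acceptable stand-in for a real database or AMQP handshake, and the names `tcp_reachable`, `command_succeeds`, `run_checks` and the `NAME:HOST:PORT` argument format are hypothetical.

```python
import socket
import subprocess
import sys


def tcp_reachable(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and name resolution errors.
        return False


def command_succeeds(argv, timeout=10.0):
    """Return True if a service-specific command exits with status 0."""
    try:
        result = subprocess.run(
            argv,
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
            timeout=timeout,
        )
    except (OSError, subprocess.TimeoutExpired):
        # Missing binary or a hung command both count as unhealthy.
        return False
    return result.returncode == 0


def run_checks(endpoints, commands=()):
    """Return a list of failure messages; an empty list means healthy."""
    failures = []
    for name, host, port in endpoints:
        if not tcp_reachable(host, port):
            failures.append(f"{name} unreachable at {host}:{port}")
    for argv in commands:
        if not command_succeeds(argv):
            failures.append("command failed: " + " ".join(argv))
    return failures


def main(args):
    # Hypothetical CLI: one NAME:HOST:PORT argument per supporting service.
    endpoints = []
    for arg in args:
        name, host, port = arg.split(":")
        endpoints.append((name, host, int(port)))
    failures = run_checks(endpoints)
    for failure in failures:
        print(failure, file=sys.stderr)
    # An exec-style liveness probe treats a non-zero exit status as failure.
    return 1 if failures else 0
```

A probe command would end by calling `sys.exit(main(sys.argv[1:]))`. A service-specific check such as `keystone-manage doctor` could be passed to `run_checks` through the `commands` argument.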
As healthchecks can potentially be dangerous for some kinds of deployments,
they should be disabled by default, with the option to enable them on a
per-service basis.
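One way to picture the "disabled by default, enabled per service" behaviour is a small helper that only emits a livenessProbe fragment when a service has opted in. This is a sketch under assumptions: the settings layout, the key names (`enabled`, `command`, `period` and so on), the default values and the function names are all hypothetical. Only the fields of the returned probe (`exec.command`, `initialDelaySeconds`, `periodSeconds`, `timeoutSeconds`, `failureThreshold`) are standard Kubernetes probe fields.

```python
def liveness_probe(service, settings):
    """Return a livenessProbe fragment for `service`, or None if disabled.

    `settings` is a hypothetical mapping of service name to options.
    A service with no entry, or without "enabled": True, gets no probe.
    """
    opts = settings.get(service, {})
    if not opts.get("enabled", False):
        return None
    return {
        "exec": {"command": list(opts["command"])},
        "initialDelaySeconds": opts.get("initial_delay", 60),
        "periodSeconds": opts.get("period", 30),
        "timeoutSeconds": opts.get("timeout", 10),
        "failureThreshold": opts.get("failure_threshold", 3),
    }


def apply_probe(container, service, settings):
    """Return a copy of a container spec with the probe added if enabled."""
    updated = dict(container)
    probe = liveness_probe(service, settings)
    if probe is not None:
        updated["livenessProbe"] = probe
    return updated
```

With settings such as `{"nova": {"enabled": True, "command": ["python", "/healthcheck.py"]}}` (the script path is a placeholder), only the Nova container spec gains a `livenessProbe` key. Every other service is left unchanged.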
This blueprint will remain pending until an appropriate specification is drafted and agreed upon.
Blueprint information
- Status: Not started
- Approver: None
- Priority: Medium
- Drafter: Michal Dulko
- Direction: Needs approval
- Assignee: None
- Definition: Pending Approval
- Series goal: None
- Implementation: Unknown
- Milestone target: None
- Started by
- Completed by
Whiteboard
Gerrit topic: https:/
Addressed by: https:/
Initial generic healthcheck script
Addressed by: https:/
Liveness probe for Nova
Addressed by: https:/
Liveness probe for Glance
Addressed by: https:/
Liveness probe for Keystone
Addressed by: https:/
Liveness probe for Heat
Addressed by: https:/
Liveness probe for Cinder
Addressed by: https:/
Liveness probe for Neutron