Readiness and liveness probes for the CNI daemon
The CNI daemon is on its way to becoming the default CNI component in kuryr-kubernetes deployments, where it is usually deployed as a DaemonSet. Kubernetes therefore increasingly needs a way to tell whether the CNI daemon is in good shape, so that it can restart the daemon when problems arise. The best way to provide that is through liveness and readiness checks.
For the CNI daemon to be healthy, at least the following must be true:
- The NET_ADMIN capability is present.
- Depending on the VIF binding, the OVS bridge br-int is present.
- IPDB is in working order. Ideally, leaks would also be detected and the daemon marked unhealthy if they get out of hand.
- The connection to the Kubernetes API used for the Watch is working.
- A configurable maximum number of CNI ADD failures should probably mark the CNI daemon as unhealthy so that it is restarted.
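The conditions above could be combined roughly as in the following sketch. It is not the kuryr-kubernetes implementation: the class CNIHealth, its methods, the default threshold of 3, and the choice to return an HTTP-style status code are all assumptions made for illustration. The two helper functions rely only on standard Linux facts (CAP_NET_ADMIN is capability bit 12 in the CapEff mask of /proc/<pid>/status, and network devices appear under /sys/class/net). The IPDB and Kubernetes API checks are left as callables to be registered by the caller.

```python
"""Sketch of health checks for a CNI daemon (all names are illustrative)."""
import os
from typing import Callable, Dict, List, Tuple

CAP_NET_ADMIN = 12  # bit index of CAP_NET_ADMIN in Linux capability masks


def has_net_admin(proc_status: str) -> bool:
    """Check the CapEff mask found in the text of /proc/<pid>/status."""
    for line in proc_status.splitlines():
        if line.startswith("CapEff:"):
            mask = int(line.split()[1], 16)
            return bool(mask & (1 << CAP_NET_ADMIN))
    return False  # no CapEff line found: treat as missing capability


def bridge_present(name: str = "br-int", sysfs: str = "/sys/class/net") -> bool:
    """Return True if a network device with this name exists."""
    return os.path.exists(os.path.join(sysfs, name))


class CNIHealth:
    """Hypothetical aggregator of named checks plus a CNI ADD failure counter."""

    def __init__(self, max_add_failures: int = 3) -> None:
        # Assumption: failures are tolerated up to and including this value.
        self.max_add_failures = max_add_failures
        self.add_failures = 0
        self._checks: Dict[str, Callable[[], bool]] = {}

    def register(self, name: str, check: Callable[[], bool]) -> None:
        self._checks[name] = check

    def record_add_failure(self) -> None:
        self.add_failures += 1

    def _failed_checks(self) -> List[str]:
        failed = []
        for name, check in self._checks.items():
            try:
                ok = check()
            except Exception:
                ok = False  # a check that raises counts as failed
            if not ok:
                failed.append(name)
        return failed

    def readiness(self) -> Tuple[int, List[str]]:
        """Return (200, []) when every registered check passes, else 500."""
        failed = self._failed_checks()
        return (200 if not failed else 500, failed)

    def liveness(self) -> Tuple[int, List[str]]:
        """Like readiness, but also fails once ADD failures exceed the limit."""
        failed = self._failed_checks()
        if self.add_failures > self.max_add_failures:
            failed.append("cni_add_failures")
        return (200 if not failed else 500, failed)
```

A caller might register `lambda: has_net_admin(open("/proc/self/status").read())` and `bridge_present` as checks, then expose `readiness()` and `liveness()` over HTTP endpoints that the Kubernetes probes poll. Reading /proc directly is one way to test the capability without an external tool.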
Blueprint information
- Status: Complete
- Approver: Daniel Mellado
- Priority: Undefined
- Drafter: Antoni Segura Puimedon
- Direction: Needs approval
- Assignee: Maysa de Macedo Souza
- Definition: New
- Series goal: None
- Implementation: Implemented
- Milestone target: None
- Started by: Antoni Segura Puimedon
- Completed by: Antoni Segura Puimedon
Whiteboard
Gerrit topic: https:/
Addressed by: https:/
[WIP] Add readiness and liveness checks to CNI.
Gerrit topic: https:/
Addressed by: https:/
cni health: Avoid capsh dependency