Split CNI into its executable part and a long-running daemon

Registered by Antoni Segura Puimedon

Currently Kuryr's CNI consists of an executable entry point that:

- Is called by Kubelet with CNI env vars set
- Starts a watch on the specific pod that the CNI request asks to add to the network
- Handles the events until it sees the vif annotation
- Plugs the requested vif (including device creation)

As the description above shows, if a Kubernetes user starts 100 pods at once, there will be 100 CNI instances each establishing a new HTTPS connection to the K8s API. This will slow things down both at the master (more connections to handle) and at the workers.

This blueprint calls for splitting the CNI into two components:

- The CNI executable: This component should ideally be self-standing and rely only on Python stdlib dependencies, so that it can simply be dropped onto the host. Its task is to open a socket to the CNI daemon and relay the request; see the midonet example [0]. Its responsibilities are:
    * Translate CNI env vars to json
    * Send the request via unix domain socket to the CNI daemon
    * Get the answer
    * Form valid CNI response and return it to the kubelet
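
The four steps above can be sketched with the Python standard library alone. This is a minimal illustration and not the implemented Kuryr code: the socket path, the `UnixHTTPConnection` helper, the `config` key and the mapping of CNI commands to endpoint paths are all assumptions made for the example. It relies only on the CNI convention that the runtime passes parameters in `CNI_*` environment variables and the network configuration on stdin.

```python
import http.client
import json
import os
import socket
import sys

# Hypothetical socket path; the blueprint does not fix one.
DAEMON_SOCKET = "/run/kuryr/cni.sock"


class UnixHTTPConnection(http.client.HTTPConnection):
    """HTTPConnection that talks over a Unix domain socket instead of TCP."""

    def __init__(self, socket_path, timeout=60):
        super().__init__("localhost", timeout=timeout)
        self.socket_path = socket_path

    def connect(self):
        sock = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
        sock.settimeout(self.timeout)
        sock.connect(self.socket_path)
        self.sock = sock


def build_request(environ, stdin_text):
    """Collect the CNI_* env vars and the network config into one dict."""
    params = {k: v for k, v in environ.items() if k.startswith("CNI_")}
    # "config" is an illustrative key name, not a documented one.
    params["config"] = json.loads(stdin_text) if stdin_text.strip() else {}
    return params


def endpoint_for(command):
    """Map the CNI command to a daemon endpoint (paths are assumptions)."""
    return {"ADD": "/addNetwork", "DEL": "/delNetwork"}[command]


def main():
    request = build_request(os.environ, sys.stdin.read())
    conn = UnixHTTPConnection(DAEMON_SOCKET)
    conn.request("POST", endpoint_for(request["CNI_COMMAND"]),
                 body=json.dumps(request),
                 headers={"Content-Type": "application/json"})
    response = conn.getresponse()
    # The kubelet reads the CNI result (or error) as JSON from stdout.
    sys.stdout.write(response.read().decode())
    return 0 if response.status < 300 else 1
```

An installed plugin would end with `sys.exit(main())`. Because the executable holds no Kubernetes client and opens no HTTPS connection, starting many pods at once costs only one short local request per pod.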

- The CNI daemon: This component will probably run as a DaemonSet on the host and will set up an HTTP server on the Unix domain socket. Its responsibilities are:
    * Create the Unix domain socket
    * Expose REST endpoints for health checks and for receiving addNetwork and delNetwork commands
    * Watch K8s API pod events and store the last seen state of each pod
    * Get requests from CNI executable
    * Bind VIFs immediately if the VIF info is already available, or otherwise when it arrives in `on_present`
    * Write result to the unix socket
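
The "bind immediately or on `on_present`" behaviour can be sketched as a small in-memory registry shared between the single pod watcher and the request handlers. This is an illustrative sketch under stated assumptions rather than the daemon's actual code: `PodRegistry`, `wait_for_vif`, `add_network`, the `vif` annotation key, the pod dictionary layout and the status codes are all hypothetical names and values chosen for the example, and the HTTP server on the Unix socket is left out.

```python
import threading

# Hypothetical annotation key, used for illustration only.
VIF_ANNOTATION = "vif"


class PodRegistry:
    """Keeps the last seen state of each pod, fed by a single API watch."""

    def __init__(self):
        self._pods = {}
        self._cond = threading.Condition()

    def on_present(self, pod):
        """Called by the watcher thread for every pod event."""
        with self._cond:
            self._pods[pod["name"]] = pod
            self._cond.notify_all()  # wake any handler waiting on this pod

    def wait_for_vif(self, pod_name, timeout):
        """Return the pod's VIF data, waiting for the annotation if needed."""
        def vif_ready():
            pod = self._pods.get(pod_name, {})
            return VIF_ANNOTATION in pod.get("annotations", {})

        with self._cond:
            # Returns at once if the annotation was already seen.
            if not self._cond.wait_for(vif_ready, timeout=timeout):
                return None
            return self._pods[pod_name]["annotations"][VIF_ANNOTATION]


def add_network(registry, request, plug, timeout=60):
    """Handle one addNetwork request from the CNI executable."""
    vif = registry.wait_for_vif(request["pod_name"], timeout)
    if vif is None:
        return 504, {"error": "timed out waiting for VIF annotation"}
    plug(vif, request)  # caller-supplied binding function
    return 200, {"vif": vif}
```

With this layout the node keeps one watch open against the K8s API regardless of how many pods are starting, and each CNI request only blocks on local state.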

[0] https://github.com/midonet/midonet/blob/7356f58a1ef6ccf1410c2b3ee171500b19f1e5b0/midolman/src/deb/bin/mm-ctl

Blueprint information

Status: Complete
Approver: Irena Berezovsky
Priority: High
Drafter: Antoni Segura Puimedon
Direction: Approved
Assignee: Michał Dulko
Definition: Approved
Series goal: Accepted for pike
Implementation: Implemented
Milestone target: None
Started by: Antoni Segura Puimedon
Completed by: Antoni Segura Puimedon

Whiteboard

Gerrit topic: https://review.openstack.org/#q,topic:bp/cni-split-exec-daemon,n,z

Addressed by: https://review.openstack.org/480028
    [WIP] CNI Daemon template

Gerrit topic: https://review.openstack.org/#q,topic:cni_tests,n,z

Addressed by: https://review.openstack.org/508106
    CNI daemon unit tests

Addressed by: https://review.openstack.org/509380
    CNI Daemon documentation

Addressed by: https://review.openstack.org/511518
    Add error handling and logging to CNI daemon

Addressed by: https://review.openstack.org/515186
    CNI split - introducing CNI daemon

Addressed by: https://review.openstack.org/518024
    Support kuryr-daemon when running containerized

Addressed by: https://review.openstack.org/517406
    Prevent pyroute2.IPDB threads leaking
