Implement graceful shutdown for containers

Registered by Hiroki Ito

Some OpenStack components (e.g. Nova, Cinder and Neutron) implement graceful shutdown so that their processes stop only after in-flight requests have finished. Without graceful shutdown, requests can be terminated before they complete, which may cause fatal errors such as inconsistent resource statuses or loss of control over resources.

However, at the moment Kolla forcibly stops containerized processes: SIGKILL is sent 10 seconds after SIGTERM. (This is the default behavior of the "stop" method in the docker-py library used in Kolla.) As a result, graceful shutdown is not used effectively in Kolla even though the processes support it.
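
The timeout mentioned above is an argument of docker-py's "stop" method, so a longer grace period can be requested per call. The following is only a minimal sketch: it assumes a docker-py low-level client object (or anything exposing the same stop(container, timeout=...) method), and the helper name stop_with_grace_period and the 60-second default are hypothetical, not part of Kolla.

```python
def stop_with_grace_period(client, container_name, grace_seconds=60):
    """Stop a container, allowing grace_seconds between SIGTERM and SIGKILL.

    `client` is assumed to behave like a docker-py low-level client, whose
    stop() method accepts a `timeout` in seconds (docker-py's default is 10).
    The function name and the 60-second default are illustrative only.
    """
    if grace_seconds < 0:
        raise ValueError("grace_seconds must be non-negative")
    # The daemon sends the stop signal first, then SIGKILL once the timeout
    # expires if the container is still running.
    client.stop(container_name, timeout=grace_seconds)
    return grace_seconds
```

With a real daemon, the client would be constructed by docker-py; whether a longer timeout alone is sufficient depends on how each service handles SIGTERM.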

I therefore propose adding a function that gracefully shuts down containerized processes, so that they benefit from the graceful shutdown they already implement. This function is useful whenever containers must be stopped, for example when updating them.

Blueprint information

Status:
Started
Approver:
None
Priority:
Undefined
Drafter:
Hiroki Ito
Direction:
Needs approval
Assignee:
Hiroki Ito
Definition:
Discussion
Series goal:
None
Implementation:
Slow progress
Milestone target:
None
Started by
Hiroki Ito

Related branches

Sprints

Whiteboard

Hiroki,

Sounds like a good goal and a feature we want to implement. Some of your facts are off, though. docker-engine sends a SIGTERM after 10 seconds, followed by a SIGKILL 5 seconds later. Changing docker-py would have no effect, and it is an upstream dependency that would be difficult to change. I am not sure what the trigger mechanism is for graceful shutdown, but whatever it is, it should be implemented in kolla-docker.py. I see you have that in your work items, so that's good.

I am curious, though: what happens in the case of something like a process stop failure? Do the above scenarios still occur? If so, it seems to me graceful shutdown is a hack around the real problems you presented. I'm all for improving the state of the system, especially around upgrades and reconfigure (which ideally shouldn't suffer from a process stop failure). I believe the proper place to add this is in reload.yml rather than action.yml. --sdake

Hi, Steven.
The trigger for graceful shutdown depends on the process. If oslo.service is used in the process, graceful shutdown is triggered by SIGTERM. I'm planning to send the appropriate signal to each container using the "kill" method in docker-py, which corresponds to the "docker kill" command. Even with graceful shutdown, a process stop failure is still possible, and the problems mentioned above can then occur. However, terminating processes this way is far safer than killing them forcibly. --hrito
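
The signal-based approach described in the reply above could look roughly like the sketch below. It is an illustration under stated assumptions, not the code under review: the client is assumed to expose kill(container, signal=...), wait(container, timeout=...) and stop(container, timeout=...) as docker-py's low-level client does; the name graceful_stop, its parameters and its defaults are hypothetical; and the broad exception handler stands in for whatever timeout error the installed docker-py version raises.

```python
def graceful_stop(client, container_name, sig="SIGTERM", wait_seconds=60):
    """Send `sig` to a container and wait for it to exit on its own.

    Returns True if the container exited within wait_seconds, or False if the
    wait failed and the container was stopped forcibly instead.
    All names and defaults here are hypothetical.
    """
    # Equivalent to `docker kill --signal=<sig> <container>`.
    client.kill(container_name, signal=sig)
    try:
        # Blocks until the container exits or the timeout elapses.
        client.wait(container_name, timeout=wait_seconds)
        return True
    except Exception:
        # Assumption: a timed-out wait surfaces as an exception. Catching
        # Exception is deliberately broad for this sketch; real code should
        # catch the specific timeout error only.
        client.stop(container_name, timeout=0)
        return False
```

The signal is a parameter because, as noted above, the trigger for graceful shutdown depends on the process; SIGTERM is only the default assumed here for services built on oslo.service.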

Gerrit topic: https://review.openstack.org/#q,topic:bp/graceful-shutdown,n,z

Addressed by: https://review.openstack.org/391420
    Implement graceful shutdown function


Work Items

Work items:
Add flag for graceful shutdown in kolla-ansible: TODO
Add parameters in all.yml: TODO
Add graceful shutdown method in kolla-docker.py: TODO
Use graceful shutdown in each task: TODO
