Graceful shutdown of services in Kubernetes

Yu Zhang

Published: March 18, 2022

As Continuous Deployment has become part of the software development process, deployments have shifted from scheduled, start-stop gated releases to frequent releases with no fixed schedule. Delivery teams must therefore prioritize Zero Downtime Deployment to further reduce deployment risk. With Zero Downtime Deployment, your website or application is never down or in an unstable state at any point during the deployment. Kubernetes provides a set of mechanisms that help implement Zero Downtime Deployment. This article focuses on one of them: graceful shutdown.

Kubernetes, often abbreviated as K8s, is an open-source system for automating deployment, scaling, and management of containerized applications. This article assumes some familiarity with the architecture and core components of Kubernetes. Please refer to Kubernetes Components for more information.

Identify the issues

In Kubernetes, every deployment means creating pods of a new version while removing old pods.

Two problems can arise if there is no graceful shutdown during the process:

Issue 1: A pod is removed while it is in the middle of processing a request, which, if the request is not idempotent, leads to an inconsistent state.
Issue 2: Kubernetes routes traffic to pods that have already been deleted, so those requests fail and users have a poor experience.

Analyze the issues

During the deletion of Kubernetes pods, there are two parallel timelines as shown in the following diagram. One is the timeline of changing network rules. The other is the deletion of the pod.

When a programmer or a deployment pipeline executes the kubectl delete pod command, two procedures begin:

Network rules coming into effect:

Kube-apiserver receives the pod deletion request and updates the state of the pod to Terminating in etcd;
The Endpoint Controller removes the IP of the pod from the Endpoints object;
Kube-proxy updates the iptables rules according to the change in the Endpoints object, and no longer routes traffic to the deleted pod.

Deleting a pod:

Kube-apiserver receives the pod deletion request and updates the state of the pod to Terminating in etcd;
Kubelet sends SIGTERM to the container; if the process within the container is not configured to shut down gracefully, the container will exit at once;
If the container has not exited within the default grace period of 30 seconds, Kubelet sends SIGKILL and forces it to exit;
Kubelet cleans up container-related resources on the node, such as storage and network.

Walking through the pod deletion procedure shows that, if the process within the container has no graceful shutdown configured, the container exits at once and any in-flight requests are dropped, which leads to issue 1.

Since updating the network rules and deleting the pod take place simultaneously, the network rules are not guaranteed to be updated before the pod is deleted. This is what can lead to issue 2.
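The race between the two timelines can be made concrete with a toy model. The sketch below is only an illustration: the class name ShutdownRace, the method misroutedWindow, and the example durations are hypothetical and do not come from Kubernetes. It assumes both timelines start at the moment the deletion request is accepted and computes how long traffic could still be routed to a container that has already exited.

```java
import java.time.Duration;

/** Toy model of the two parallel timelines; all durations are hypothetical inputs. */
public final class ShutdownRace {
    private ShutdownRace() {}

    /**
     * Returns how long traffic may still be routed to a pod whose container has
     * already exited. Both arguments are measured from the moment the deletion
     * request is accepted.
     */
    public static Duration misroutedWindow(Duration containerExit, Duration rulesUpdated) {
        Duration gap = rulesUpdated.minus(containerExit);
        // If the rules are updated before the container exits, nothing is misrouted.
        return gap.isNegative() ? Duration.ZERO : gap;
    }

    public static void main(String[] args) {
        // Assumed values: the container exits after 100 ms, the rules need 2 s to propagate.
        System.out.println(misroutedWindow(Duration.ofMillis(100), Duration.ofSeconds(2)));
        // Assumed values: the container exits after 10 s, the rules need 2 s to propagate.
        System.out.println(misroutedWindow(Duration.ofSeconds(10), Duration.ofSeconds(2)));
    }
}
```

In this model, issue 2 exists whenever the window is greater than zero, so closing it means making sure the container outlives the rule update.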

The Solution

The following configurations can solve these problems:

Configure graceful shutdown for the process within the container;
Add a preStop hook;
Modify terminationGracePeriodSeconds.

The following diagram shows the timeline after setup.

For Issue 1: Setting graceful shutdown for the process within the container

Using Spring Boot as an example, enabling graceful shutdown is as easy as adding the correct settings to the Spring Boot config file.

server:
  shutdown: graceful

spring:
  lifecycle:
    timeout-per-shutdown-phase: 30s

With the above configuration, Spring Boot stops accepting new requests upon receiving SIGTERM and tries to finish processing all in-flight requests within the timeout. If it cannot finish in time, it logs the relevant information and is then forced to quit. Choose the timeout based on the maximum time a request is allowed to take. In our experience, except under unusual circumstances, all requests finish processing within 30s. For requests that do not finish within the defined timeout, we capture the timeout in log monitoring and send alerts, then investigate the root cause and act accordingly. This is how issue 1 can be solved. Other languages and frameworks generally offer similar configurations.
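For a JVM process that does not use Spring Boot, the same behavior can be approximated by hand. The following is a minimal sketch, not the way Spring Boot implements it: the class GracefulWorker and its methods are hypothetical names, and the pool size of 4 is an arbitrary assumption. It relies on the fact that the JVM runs registered shutdown hooks when it receives SIGTERM.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical worker that drains in-flight tasks before the JVM exits. */
public final class GracefulWorker {
    private final ExecutorService pool = Executors.newFixedThreadPool(4); // assumed size
    private final AtomicInteger completed = new AtomicInteger();

    /** Accepts a unit of work; throws RejectedExecutionException once shutdown has begun. */
    public void submit(Runnable task) {
        pool.execute(() -> {
            task.run();
            completed.incrementAndGet();
        });
    }

    public int completedCount() {
        return completed.get();
    }

    /**
     * Stops accepting new work and waits up to timeoutSeconds for running tasks.
     * Returns true if everything finished in time, false if the rest was interrupted.
     */
    public boolean shutdownGracefully(long timeoutSeconds) {
        pool.shutdown(); // reject new tasks; queued and running tasks continue
        try {
            if (pool.awaitTermination(timeoutSeconds, TimeUnit.SECONDS)) {
                return true;
            }
            pool.shutdownNow(); // timeout reached: interrupt whatever is left
            return false;
        } catch (InterruptedException e) {
            pool.shutdownNow();
            Thread.currentThread().interrupt();
            return false;
        }
    }

    /** Registers a JVM shutdown hook, which runs when the process receives SIGTERM. */
    public void installShutdownHook(long timeoutSeconds) {
        Runtime.getRuntime().addShutdownHook(new Thread(() -> {
            boolean clean = shutdownGracefully(timeoutSeconds);
            System.err.println(clean ? "drained cleanly" : "forced shutdown after timeout");
        }));
    }
}
```

A service would create one GracefulWorker at startup and call installShutdownHook(30) to mirror the 30s timeout used above. The message logged on a forced shutdown is the kind of event that log monitoring can alert on.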

For Issue 2: Adding a preStop hook

To handle issue 2, the pod must not start shutting down until new traffic is no longer being routed to it. A preStop hook should therefore be added to the Kubernetes yaml file, so that Kubelet “takes a break” upon receiving the pod deletion event and leaves Kube-proxy enough time to update the network rules before the container starts shutting down.

lifecycle:
  preStop:
    exec:
      command: ["sh", "-c", "sleep 10"]  # set prestop hook

The above configuration, taken from the official Spring Boot documentation, makes Kubelet take the break we require.

Modifying terminationGracePeriodSeconds

As described in the analysis of pod deletion above, Kubernetes allows a maximum of 30 seconds by default for a container to shut down. If the sum of the graceful shutdown timeout in Spring Boot and the preStop hook duration in Kubernetes exceeds 30 seconds, Kubernetes may forcibly kill the container before Spring Boot has finished processing requests. In that case, terminationGracePeriodSeconds should be set to a value greater than the Spring Boot graceful shutdown timeout plus the preStop hook duration.

terminationGracePeriodSeconds: 45
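The arithmetic behind this value can be written down as a small helper. This is only a sketch: the class GracePeriod and its methods are hypothetical, and the 5-second safety buffer is an assumption used here to explain how 10 seconds of preStop sleep and a 30-second shutdown timeout lead to 45.

```java
/** Hypothetical helper for sizing terminationGracePeriodSeconds. */
public final class GracePeriod {
    private GracePeriod() {}

    /** Kubernetes default for terminationGracePeriodSeconds. */
    public static final int DEFAULT_SECONDS = 30;

    /** preStop pause plus application shutdown timeout plus an assumed safety buffer. */
    public static int requiredSeconds(int preStopSeconds, int shutdownTimeoutSeconds, int bufferSeconds) {
        if (preStopSeconds < 0 || shutdownTimeoutSeconds < 0 || bufferSeconds < 0) {
            throw new IllegalArgumentException("durations must be non-negative");
        }
        return Math.addExact(Math.addExact(preStopSeconds, shutdownTimeoutSeconds), bufferSeconds);
    }

    /** True when the default grace period is too short and the setting must be raised. */
    public static boolean exceedsDefault(int requiredSeconds) {
        return requiredSeconds > DEFAULT_SECONDS;
    }

    public static void main(String[] args) {
        // 10 s preStop sleep + 30 s Spring Boot timeout + 5 s buffer (assumed)
        int required = requiredSeconds(10, 30, 5);
        if (exceedsDefault(required)) {
            System.out.println("terminationGracePeriodSeconds: " + required);
        }
    }
}
```

With the values used in this article the helper yields 45. Without any buffer the sum is 40, which still exceeds the 30-second default.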

Finally, the fully updated Kubernetes yaml file looks like this:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: graceful-shutdown-test-exit-graceful-30s
spec:
  replicas: 2
  selector:
    matchLabels:
      app: graceful-shutdown-test-exit-graceful-30s
  template:
    metadata:
      labels:
        app: graceful-shutdown-test-exit-graceful-30s
    spec:
      containers:
        - name: graceful-shutdown-test
          image: graceful-shutdown-test-exit-graceful-30s:latest
          ports:
            - containerPort: 8080
          lifecycle:
            preStop:
              exec:
                command: ["sh", "-c", "sleep 10"]  # set prestop hook
      # must exceed preStop sleep (10s) + Spring Boot shutdown timeout (30s)
      terminationGracePeriodSeconds: 45

Setting up graceful shutdown in Spring Boot lets in-flight requests finish before the container is terminated. Setting up the preStop hook enforces the ordering between updating the network rules and shutting down the pod. Finally, terminationGracePeriodSeconds gives the process enough time to handle all requests. Together, these three steps address both issues.

Summary

This article describes a solution that ensures a service correctly handles all requests during shutdown, as required for Zero Downtime Deployment in an environment where deployments happen frequently. Building this capability improves the user experience and reduces the impact of defects introduced into a service.

Disclaimer: The statements and opinions expressed in this article are those of the author(s) and do not necessarily reflect the positions of Thoughtworks.