Sunday, January 12, 2020

Ideas for a graceful shutdown of an application hosted on the Kubernetes orchestration framework (K8s)
Software that serves mission-critical commercial deployments, such as hosting the web pages for a company, is expected to run 24x7 without interruption. The Kubernetes orchestration framework, or K8s for short, enables this mode of operation by provisioning the layers of runtime, operating system, containers, pods and hosts dynamically, so that if one instance fails, another can come online without disrupting traffic from customers. This is what is expected of a fault-tolerant system.
Applications are written primarily to meet business requirements and serve the customer, not to carry the onus of being fault-tolerant. Applications hosted on K8s these days also differ from their predecessors written for traditional systems, which were monolithic, bound to immovable systems, prone to failures and often dependent on external monitoring software. K8s has lightened the load on the application, even doing away with the need for separate monitoring software. Instead, applications can register a handler for the SIGTERM signal so that they can shut down gracefully. Most applications already have logic built in for backups or uninstalls, and the routine to handle this signal is no different from them. The software that performs a backup or uninstall of the application is usually packaged with a tool or an installer that knows how to shut down the application, either forcefully with loss of data or gracefully by performing the steps that let the application save its state before it exits.
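As an illustration of registering for SIGTERM, here is a minimal Python sketch. The names ShutdownFlag, install_sigterm_handler and run are hypothetical and chosen only for this example; the do_work and save_state callables stand in for whatever the application actually does. It assumes a single main loop that can check a flag between units of work.

```python
import signal


class ShutdownFlag:
    """Records that a termination signal has been received (hypothetical helper)."""

    def __init__(self):
        self.requested = False

    def handle(self, signum, frame):
        # Keep the handler tiny: only record the request here and let
        # the main loop do the actual cleanup.
        self.requested = True


def install_sigterm_handler(flag):
    """Register flag.handle for SIGTERM and return the previous handler.

    signal.signal must be called from the main thread of the process.
    """
    return signal.signal(signal.SIGTERM, flag.handle)


def run(flag, do_work, save_state):
    """Do units of work until shutdown is requested, then save state once."""
    while not flag.requested:
        do_work()
    save_state()
```

An application would create one ShutdownFlag at startup, call install_sigterm_handler(flag), and then enter run(flag, ...). Doing the cleanup in the main loop rather than inside the handler is a design choice that keeps the handler safe to run at any point.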
This article describes a few techniques for a graceful shutdown:
1. First, the Kubernetes-compliant way for a graceful shutdown:
The additional piece an application needs in order to handle this signal from the K8s framework is a routine that shuts down its components in the right order. Once this routine is written, it makes no difference whether it is invoked by a command-line tool, an installer or K8s. The application also provides a centralized command mechanism to execute this operation. Most of the processing in the components pertains to computing, storage and networking.
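One way to picture a centralized, ordered shutdown routine is the sketch below. ShutdownCoordinator and its methods are hypothetical names, and stopping components in reverse startup order is an assumption made for this example, not a rule imposed by K8s. The same shutdown() call could be triggered by a command-line tool, an installer or a SIGTERM handler.

```python
class ShutdownCoordinator:
    """Single entry point that stops registered components in order (hypothetical)."""

    def __init__(self):
        # (name, stop_callable) pairs, kept in startup order.
        self._components = []

    def register(self, name, stop):
        """Record a component and the callable that stops it."""
        self._components.append((name, stop))

    def shutdown(self):
        """Stop components in reverse startup order.

        Returns (stopped, errors): the names stopped successfully and a
        dict mapping the name of each failed component to its exception.
        A failure in one component does not prevent the others from stopping.
        """
        stopped, errors = [], {}
        for name, stop in reversed(self._components):
            try:
                stop()
                stopped.append(name)
            except Exception as exc:
                errors[name] = exc
        # Clear the registry so that a repeated call does nothing.
        self._components.clear()
        return stopped, errors
```

For example, if storage, computing and networking are registered in that order at startup, shutdown() stops networking first so no new requests arrive, then computing, then storage so that pending state can still be written.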
K8s takes several steps before sending this signal to the application. This sequence is referred to as the K8s termination lifecycle. First, it sets the pod to a terminating state and stops sending it more traffic from customers. Then it invokes the ‘preStop’ hook, a command or HTTP request run against the containers in the pod, which allows application deployers to take action on termination when there is no application logic to handle SIGTERM. Once the hook completes, K8s sends SIGTERM to the containers. K8s allows a grace period for all of this. It is thirty seconds by default and starts counting when termination begins, so it covers the time spent in the ‘preStop’ hook as well. This grace period can be customized per pod through terminationGracePeriodSeconds. There is no exponential back-off in this case, since the shutdown request is imperative. Finally, when the timer expires, SIGKILL is sent to any container still running and the pod is forcibly removed. After this, all the resources Kubernetes maintained for the pod are cleaned up.
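Because SIGKILL arrives when the timer expires, the application's cleanup has to fit inside the grace period. The sketch below shows one possible way to budget that time. The function name, the five-second safety margin and the injectable clock are all assumptions made for this example; only the thirty-second default comes from the lifecycle described above.

```python
import time


def run_cleanup_within_budget(steps, grace_seconds=30.0, margin_seconds=5.0,
                              clock=time.monotonic):
    """Run (name, callable) cleanup steps in order until the budget is used up.

    The budget is grace_seconds minus margin_seconds, measured from the
    moment this function is called. A step that has already started is
    allowed to finish; steps that would start after the deadline are skipped.
    Returns (completed, skipped) lists of step names.
    """
    deadline = clock() + max(grace_seconds - margin_seconds, 0.0)
    completed, skipped = [], []
    for name, step in steps:
        if clock() >= deadline:
            # Out of time: skip this step rather than risk being killed
            # in the middle of it.
            skipped.append(name)
            continue
        step()
        completed.append(name)
    return completed, skipped
```

Listing the most important steps first, such as saving state, means that whatever gets skipped under time pressure is the least critical work. Note that this sketch does not interrupt a step that overruns; it only declines to start new ones.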
