Tasks

Tasks
Administer a Cluster
Access Clusters Using the Kubernetes API
Access Services Running on Clusters
Advertise Extended Resources for a Node
Autoscale the DNS Service in a Cluster
Change the Reclaim Policy of a PersistentVolume
Change the default StorageClass
Cluster Management
Configure Multiple Schedulers
Configure Out Of Resource Handling
Configure Quotas for API Objects
Control CPU Management Policies on the Node
Customizing DNS Service
Debugging DNS Resolution
Declare Network Policy
Developing Cloud Controller Manager
Encrypting Secret Data at Rest
Guaranteed Scheduling For Critical Add-On Pods
IP Masquerade Agent User Guide
Kubernetes Cloud Controller Manager
Limit Storage Consumption
Namespaces Walkthrough
Operating etcd clusters for Kubernetes
Reconfigure a Node's Kubelet in a Live Cluster
Reserve Compute Resources for System Daemons
Safely Drain a Node while Respecting Application SLOs
Securing a Cluster
Set Kubelet parameters via a config file
Set up High-Availability Kubernetes Masters
Share a Cluster with Namespaces
Static Pods
Storage Object in Use Protection
Using CoreDNS for Service Discovery
Using a KMS provider for data encryption
Using sysctls in a Kubernetes Cluster
Extend kubectl with plugins
Manage HugePages
Schedule GPUs

Edit This Page

Safely Drain a Node while Respecting Application SLOs

This page shows how to safely drain a machine, respecting the application-level disruption SLOs you have specified using PodDisruptionBudget.

Before you begin

This task assumes that you have met the following prerequisites:

Use kubectl drain to remove a node from service

You can use kubectl drain to safely evict all of your pods from a node before you perform maintenance on the node (e.g. kernel upgrade, hardware maintenance, etc.). Safe evictions allow the pod’s containers to gracefully terminate and will respect the PodDisruptionBudgets you have specified.

Note: By default kubectl drain will ignore certain system pods on the node that cannot be killed; see the kubectl drain documentation for more details.

When kubectl drain returns successfully, that indicates that all of the pods (except the ones excluded as described in the previous paragraph) have been safely evicted (respecting the desired graceful termination period, and without violating any application-level disruption SLOs). It is then safe to bring down the node by powering down its physical machine or, if running on a cloud platform, deleting its virtual machine.

First, identify the name of the node you wish to drain. You can list all of the nodes in your cluster with

kubectl get nodes

Next, tell Kubernetes to drain the node:

kubectl drain <node name>

Once it returns (without giving an error), you can power down the node (or equivalently, if on a cloud platform, delete the virtual machine backing the node). If you leave the node in the cluster during the maintenance operation, you need to run

kubectl uncordon <node name>

afterwards to tell Kubernetes that it can resume scheduling new pods onto the node.

Draining multiple nodes in parallel

The kubectl drain command should only be issued to a single node at a time. However, you can run multiple kubectl drain commands for different nodes in parallel, in different terminals or in the background. Multiple drain commands running concurrently will still respect the PodDisruptionBudget you specify.

For example, if you have a StatefulSet with three replicas and have set a PodDisruptionBudget for that set specifying minAvailable: 2. kubectl drain will only evict a pod from the StatefulSet if all three pods are ready, and if you issue multiple drain commands in parallel, Kubernetes will respect the PodDisruptionBudget and ensure that only one pod is unavailable at any given time. Any drains that would cause the number of ready replicas to fall below the specified budget are blocked.

The Eviction API

If you prefer not to use kubectl drain (such as to avoid calling to an external command, or to get finer control over the pod eviction process), you can also programmatically cause evictions using the eviction API.

You should first be familiar with using Kubernetes language clients.

The eviction subresource of a pod can be thought of as a kind of policy-controlled DELETE operation on the pod itself. To attempt an eviction (perhaps more REST-precisely, to attempt to create an eviction), you POST an attempted operation. Here’s an example:

{
  "apiVersion": "policy/v1beta1",
  "kind": "Eviction",
  "metadata": {
    "name": "quux",
    "namespace": "default"
  }
}

You can attempt an eviction using curl:

curl -v -H 'Content-type: application/json' http://127.0.0.1:8080/api/v1/namespaces/default/pods/quux/eviction -d @eviction.json

The API can respond in one of three ways:

For a given eviction request, there are two cases:

In some cases, an application may reach a broken state where it will never return anything other than 429 or 500. This can happen, for example, if the replacement pod created by the application’s controller does not become ready, or if the last pod evicted has a very long termination grace period.

In this case, there are two potential solutions:

Kubernetes does not specify what the behavior should be in this case; it is up to the application owners and cluster owners to establish an agreement on behavior in these cases.

What's next

Feedback