
Disruptions

This guide is for application owners who want to build highly available applications, and thus need to understand what types of disruptions can happen to Pods.

It is also for Cluster Administrators who want to perform automated cluster actions, like upgrading and autoscaling clusters.

Voluntary and Involuntary Disruptions

Pods do not disappear until someone (a person or a controller) destroys them, or there is an unavoidable hardware or system software error.

We call these unavoidable cases involuntary disruptions to an application. Examples are:

  • a hardware failure of the physical machine backing the node
  • cluster administrator deletes VM (instance) by mistake
  • cloud provider or hypervisor failure makes VM disappear
  • a kernel panic
  • the node disappears from the cluster due to cluster network partition
  • eviction of a pod due to the node being out-of-resources.

Except for the out-of-resources condition, all these conditions should be familiar to most users; they are not specific to Kubernetes.

We call other cases voluntary disruptions. These include both actions initiated by the application owner and those initiated by a Cluster Administrator. Typical application owner actions include:

  • deleting the deployment or other controller that manages the pod
  • updating a deployment’s pod template causing a restart
  • directly deleting a pod (e.g. by accident)

Cluster Administrator actions include:

  • Draining a node for repair or upgrade.
  • Draining a node from a cluster to scale the cluster down (learn about Cluster Autoscaling).
  • Removing a pod from a node to permit something else to fit on that node.

These actions might be taken directly by the cluster administrator, or by automation run by the cluster administrator, or by your cluster hosting provider.

Ask your cluster administrator or consult your cloud provider or distribution documentation to determine if any sources of voluntary disruptions are enabled for your cluster. If none are enabled, you can skip creating Pod Disruption Budgets.

Caution: Not all voluntary disruptions are constrained by Pod Disruption Budgets. For example, deleting deployments or pods bypasses Pod Disruption Budgets.

Dealing with Disruptions

Here are some ways to mitigate involuntary disruptions:

  • Ensure your pod requests the resources it needs.
  • Replicate your application if you need higher availability. (Learn about running replicated stateless and stateful applications.)
  • For even higher availability when running replicated applications, spread applications across racks (using anti-affinity) or across zones (if using a multi-zone cluster); a sketch combining resource requests and anti-affinity follows this list.
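Here is a minimal sketch of the first and third points, assuming a hypothetical Deployment labeled app: web; the name, image, resource values, and labels are placeholders to adapt for your application:

```yaml
# Hypothetical Deployment: resource requests plus anti-affinity to spread replicas.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: web                      # hypothetical name
spec:
  replicas: 3
  selector:
    matchLabels:
      app: web
  template:
    metadata:
      labels:
        app: web
    spec:
      containers:
      - name: web
        image: nginx             # placeholder image
        resources:
          requests:              # ensure the pod requests the resources it needs
            cpu: "100m"
            memory: "128Mi"
      affinity:
        podAntiAffinity:
          preferredDuringSchedulingIgnoredDuringExecution:
          - weight: 100
            podAffinityTerm:
              labelSelector:
                matchLabels:
                  app: web
              topologyKey: kubernetes.io/hostname   # spread across nodes; use a zone topology key for multi-zone clusters
```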

The frequency of voluntary disruptions varies. On a basic Kubernetes cluster, there are no voluntary disruptions at all. However, your cluster administrator or hosting provider may run some additional services which cause voluntary disruptions. For example, rolling out node software updates can cause voluntary disruptions. Also, some implementations of cluster (node) autoscaling may cause voluntary disruptions to defragment and compact nodes. Your cluster administrator or hosting provider should have documented what level of voluntary disruptions, if any, to expect.

Kubernetes offers features to help run highly available applications at the same time as frequent voluntary disruptions. We call this set of features Disruption Budgets.

How Disruption Budgets Work

An Application Owner can create a PodDisruptionBudget object (PDB) for each application. A PDB limits the number of pods of a replicated application that are down simultaneously from voluntary disruptions. For example, a quorum-based application would like to ensure that the number of replicas running is never brought below the number needed for a quorum. A web front end might want to ensure that the number of replicas serving load never falls below a certain percentage of the total.
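As a rough sketch of the web front end case (the name and labels are assumptions, and the apiVersion may be policy/v1beta1 on older clusters):

```yaml
apiVersion: policy/v1            # policy/v1beta1 on older clusters
kind: PodDisruptionBudget
metadata:
  name: web-frontend-pdb         # hypothetical name
spec:
  minAvailable: "90%"            # never let voluntary disruptions drop below 90% of replicas
  selector:
    matchLabels:
      app: web-frontend          # must match the labels used by the application's controller
```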

Cluster managers and hosting providers should use tools which respect Pod Disruption Budgets by calling the Eviction API instead of directly deleting pods or deployments. Examples are the kubectl drain command and the Kubernetes-on-GCE cluster upgrade script (cluster/gce/upgrade.sh).

When a cluster administrator wants to drain a node they use the kubectl drain command. That tool tries to evict all the pods on the machine. The eviction request may be temporarily rejected, and the tool periodically retries all failed requests until all pods are terminated, or until a configurable timeout is reached.

A PDB specifies the number of replicas that an application can tolerate having, relative to how many it is intended to have. For example, a Deployment which has a .spec.replicas: 5 is supposed to have 5 pods at any given time. If its PDB allows for there to be 4 at a time, then the Eviction API will allow voluntary disruption of one, but not two pods, at a time.
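A sketch of such a budget, assuming the Deployment's pods carry a hypothetical label app: example:

```yaml
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: example-pdb              # hypothetical name
spec:
  minAvailable: 4                # with .spec.replicas: 5, at most one pod may be disrupted voluntarily
  selector:
    matchLabels:
      app: example               # assumed to match the Deployment's pod labels
```

On clusters that support it, maxUnavailable: 1 expresses the same budget from the other direction.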

The group of pods that comprise the application is specified using a label selector, the same as the one used by the application’s controller (deployment, stateful-set, etc).

The “intended” number of pods is computed from the .spec.replicas of the pods controller. The controller is discovered from the pods using the .metadata.ownerReferences of the object.

PDBs cannot prevent involuntary disruptions from occurring, but they do count against the budget.

Pods which are deleted or unavailable due to a rolling upgrade to an application do count against the disruption budget, but controllers (like deployment and stateful-set) are not limited by PDBs when doing rolling upgrades; the handling of failures during application updates is configured in the controller spec. (Learn about updating a deployment.)

When a pod is evicted using the eviction API, it is gracefully terminated (see terminationGracePeriodSeconds in PodSpec).
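For reference, terminationGracePeriodSeconds is set in the Pod spec; a minimal sketch with placeholder names and values:

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: graceful-example                # hypothetical name
spec:
  terminationGracePeriodSeconds: 60     # seconds the kubelet waits after SIGTERM before force-killing (default is 30)
  containers:
  - name: app
    image: nginx                        # placeholder image
```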

PDB Example

Consider a cluster with 3 nodes, node-1 through node-3. The cluster is running several applications. One of them has 3 replicas initially called pod-a, pod-b, and pod-c. Another, unrelated pod without a PDB, called pod-x, is also shown. Initially, the pods are laid out as follows:

node-1              node-2              node-3
pod-a available     pod-b available     pod-c available
pod-x available

All 3 pods are part of a deployment, and they collectively have a PDB which requires there be at least 2 of the 3 pods to be available at all times.
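Such a PDB might look like the following sketch (the app: example-app label is an assumption about how the deployment labels its pods):

```yaml
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: example-app-pdb          # hypothetical name
spec:
  minAvailable: 2                # at least 2 of the 3 pods (pod-a, pod-b, pod-c) must stay available
  selector:
    matchLabels:
      app: example-app           # assumed label shared by pod-a, pod-b, and pod-c
```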

For example, assume the cluster administrator wants to reboot into a new kernel version to fix a bug in the kernel. The cluster administrator first tries to drain node-1 using the kubectl drain command. That tool tries to evict pod-a and pod-x. This succeeds immediately. Both pods go into the terminating state at the same time. This puts the cluster in this state:

node-1 draining     node-2              node-3
pod-a terminating   pod-b available     pod-c available
pod-x terminating

The deployment notices that one of the pods is terminating, so it creates a replacement called pod-d. Since node-1 is cordoned, it lands on another node. Something has also created pod-y as a replacement for pod-x.

(Note: for a StatefulSet, pod-a, which would be called something like pod-0, would need to terminate completely before its replacement, which is also called pod-0 but has a different UID, could be created. Otherwise, the example applies to a StatefulSet as well.)

Now the cluster is in this state:

node-1 draining     node-2              node-3
pod-a terminating   pod-b available     pod-c available
pod-x terminating   pod-d starting      pod-y

At some point, the pods terminate, and the cluster looks like this:

node-1 drained      node-2              node-3
                    pod-b available     pod-c available
                    pod-d starting      pod-y

At this point, if an impatient cluster administrator tries to drain node-2 or node-3, the drain command will block, because there are only 2 available pods for the deployment, and its PDB requires at least 2. After some time passes, pod-d becomes available.

The cluster state now looks like this:

node-1 drained      node-2              node-3
                    pod-b available     pod-c available
                    pod-d available     pod-y

Now, the cluster administrator tries to drain node-2. The drain command will try to evict the two pods in some order, say pod-b first and then pod-d. It will succeed at evicting pod-b. But, when it tries to evict pod-d, it will be refused because that would leave only one pod available for the deployment.

The deployment creates a replacement for pod-b called pod-e. Because there are not enough resources in the cluster to schedule pod-e, the drain will again block. The cluster may end up in this state:

node-1 drained      node-2              node-3              no node
                    pod-b available     pod-c available     pod-e pending
                    pod-d available     pod-y

At this point, the cluster administrator needs to add a node back to the cluster to proceed with the upgrade.

You can see how Kubernetes varies the rate at which disruptions can happen, according to:

  • how many replicas an application needs
  • how long it takes to gracefully shut down an instance
  • how long it takes a new instance to start up
  • the type of controller
  • the cluster’s resource capacity

Separating Cluster Owner and Application Owner Roles

Often, it is useful to think of the Cluster Manager and Application Owner as separate roles with limited knowledge of each other. This separation of responsibilities may make sense in these scenarios:

  • when there are many application teams sharing a Kubernetes cluster, and there is natural specialization of roles
  • when third-party tools or services are used to automate cluster management

Pod Disruption Budgets support this separation of roles by providing an interface between the roles.

If you do not have such a separation of responsibilities in your organization, you may not need to use Pod Disruption Budgets.

How to perform Disruptive Actions on your Cluster

If you are a Cluster Administrator, and you need to perform a disruptive action on all the nodes in your cluster, such as a node or system software upgrade, here are some options:

  • Accept downtime during the upgrade.
  • Failover to another complete replica cluster.
    • No downtime, but may be costly both for the duplicated nodes and for human effort to orchestrate the switchover.
  • Write disruption-tolerant applications and use PDBs.
    • No downtime.
    • Minimal resource duplication.
    • Allows more automation of cluster administration.
    • Writing disruption-tolerant applications is tricky, but the work to tolerate voluntary disruptions largely overlaps with work to support autoscaling and tolerating involuntary disruptions.
