21-pod-podLifecycle
concepts/workloads/pods/pod-lifecycle/
Pod Lifecycle
This page describes the lifecycle of a Pod:
- Pod phase
- Pod conditions
- Container probes
- Pod and Container status
- Container States
- Pod readiness gate
- Restart policy
- Pod lifetime
- Examples
- What's next
Pod phase
A Pod’s status field is a PodStatus object, which has a phase field.
The phase of a Pod is a simple, high-level summary of where the Pod is in its lifecycle. The phase is not intended to be a comprehensive rollup of observations of Container or Pod state, nor is it intended to be a comprehensive state machine.
The number and meanings of Pod phase values are tightly guarded. Other than what is documented here, nothing should be assumed about Pods that have a given phase value.
Here are the possible values for phase:
Value | Description |
---|---|
Pending | The Pod has been accepted by the Kubernetes system, but one or more of the Container images has not been created. This includes time before being scheduled as well as time spent downloading images over the network, which could take a while. |
Running | The Pod has been bound to a node, and all of the Containers have been created. At least one Container is still running, or is in the process of starting or restarting. |
Succeeded | All Containers in the Pod have terminated in success, and will not be restarted. |
Failed | All Containers in the Pod have terminated, and at least one Container has terminated in failure. That is, the Container either exited with non-zero status or was terminated by the system. |
Unknown | For some reason the state of the Pod could not be obtained, typically due to an error in communicating with the host of the Pod. |
Pod conditions
A Pod has a PodStatus, which has an array of PodConditions through which the Pod has or has not passed. Each element of the PodCondition array has six possible fields:
- The lastProbeTime field provides a timestamp for when the Pod condition was last probed.
- The lastTransitionTime field provides a timestamp for when the Pod last transitioned from one status to another.
- The message field is a human-readable message indicating details about the transition.
- The reason field is a unique, one-word, CamelCase reason for the condition’s last transition.
- The status field is a string, with possible values “True”, “False”, and “Unknown”.
- The type field is a string with the following possible values:
  - PodScheduled: the Pod has been scheduled to a node;
  - Ready: the Pod is able to serve requests and should be added to the load balancing pools of all matching Services;
  - Initialized: all init containers have completed successfully;
  - Unschedulable: the scheduler cannot schedule the Pod right now, for example due to lack of resources or other constraints;
  - ContainersReady: all containers in the Pod are ready.
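As a rough illustration of how a client might read these fields, the sketch below looks up a condition by its type in a PodStatus represented as a plain Python dict. The helper names (find_condition, condition_is_true) and the sample data are hypothetical and are not part of any Kubernetes client library.

```python
from typing import Optional


def find_condition(pod_status: dict, cond_type: str) -> Optional[dict]:
    """Return the first condition whose type matches, or None if it is absent."""
    for cond in pod_status.get("conditions", []):
        if cond.get("type") == cond_type:
            return cond
    return None


def condition_is_true(pod_status: dict, cond_type: str) -> bool:
    """True only when the condition exists and its status is the string "True"."""
    cond = find_condition(pod_status, cond_type)
    return cond is not None and cond.get("status") == "True"


# Hypothetical sample data shaped like the status.conditions array.
sample_status = {
    "phase": "Running",
    "conditions": [
        {"type": "PodScheduled", "status": "True"},
        {"type": "Initialized", "status": "True"},
        {"type": "ContainersReady", "status": "False"},
        {"type": "Ready", "status": "False"},
    ],
}

print(condition_is_true(sample_status, "PodScheduled"))  # True
print(condition_is_true(sample_status, "Ready"))         # False
```

Note that status is compared as a string, since the field takes the values “True”, “False”, and “Unknown” rather than a boolean.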
Container probes
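A probe is a diagnostic performed periodically by the kubelet on a Container. To perform a diagnostic, the kubelet calls a Handler implemented by the Container. There are three types of handlers: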
- ExecAction: Executes a specified command inside the Container. The diagnostic is considered successful if the command exits with a status code of 0.
- TCPSocketAction: Performs a TCP check against the Container’s IP address on a specified port. The diagnostic is considered successful if the port is open.
- HTTPGetAction: Performs an HTTP Get request against the Container’s IP address on a specified port and path. The diagnostic is considered successful if the response has a status code greater than or equal to 200 and less than 400.
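The success criteria of the three handlers can be sketched as small Python predicates. This is only an illustration of the rules listed above, not the kubelet's implementation; the function names are hypothetical.

```python
import socket


def exec_probe_ok(exit_code: int) -> bool:
    """ExecAction: success when the command exits with status code 0."""
    return exit_code == 0


def http_probe_ok(status_code: int) -> bool:
    """HTTPGetAction: success when 200 <= status code < 400."""
    return 200 <= status_code < 400


def tcp_probe_ok(host: str, port: int, timeout: float = 1.0) -> bool:
    """TCPSocketAction: success when a TCP connection to the port can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


print(http_probe_ok(200), http_probe_ok(302), http_probe_ok(404))  # True True False
print(exec_probe_ok(0), exec_probe_ok(1))                          # True False
```

A redirect (3xx) counts as success under this rule, while any 4xx or 5xx response counts as failure.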
Each probe has one of three results:
- Success: The Container passed the diagnostic.
- Failure: The Container failed the diagnostic.
- Unknown: The diagnostic failed, so no action should be taken.
The kubelet can optionally perform and react to three kinds of probes on running Containers:
- livenessProbe: Indicates whether the Container is running. If the liveness probe fails, the kubelet kills the Container, and the Container is subjected to its restart policy. If a Container does not provide a liveness probe, the default state is Success.
- readinessProbe: Indicates whether the Container is ready to service requests. If the readiness probe fails, the endpoints controller removes the Pod’s IP address from the endpoints of all Services that match the Pod. The default state of readiness before the initial delay is Failure. If a Container does not provide a readiness probe, the default state is Success.
- startupProbe: Indicates whether the application within the Container is started. All other probes are disabled if a startup probe is provided, until it succeeds. If the startup probe fails, the kubelet kills the Container, and the Container is subjected to its restart policy. If a Container does not provide a startup probe, the default state is Success.
When should you use a liveness probe?
FEATURE STATE: Kubernetes v1.0 (stable)
If the process in your Container is able to crash on its own whenever it encounters an issue or becomes unhealthy, you do not necessarily need a liveness probe; the kubelet will automatically perform the correct action in accordance with the Pod’s restartPolicy.
If you’d like your Container to be killed and restarted if a probe fails, then specify a liveness probe, and specify a restartPolicy of Always or OnFailure.
When should you use a readiness probe?
FEATURE STATE: Kubernetes v1.0 (stable)
If you’d like to start sending traffic to a Pod only when a probe succeeds, specify a readiness probe. In this case, the readiness probe might be the same as the liveness probe, but the existence of the readiness probe in the spec means that the Pod will start without receiving any traffic and only start receiving traffic after the probe starts succeeding. If your Container needs to work on loading large data, configuration files, or migrations during startup, specify a readiness probe.
If you want your Container to be able to take itself down for maintenance, you can specify a readiness probe that checks an endpoint specific to readiness that is different from the liveness probe.
Note that if you just want to be able to drain requests when the Pod is deleted, you do not necessarily need a readiness probe; on deletion, the Pod automatically puts itself into an unready state regardless of whether the readiness probe exists. The Pod remains in the unready state while it waits for the Containers in the Pod to stop.
When should you use a startup probe?
FEATURE STATE: Kubernetes v1.16 (alpha)
If your Container usually takes longer to start than initialDelaySeconds + failureThreshold × periodSeconds (the values configured on the liveness probe), you should specify a startup probe that checks the same endpoint as the liveness probe. The default for periodSeconds is 10s. You should then set the startup probe’s failureThreshold high enough to allow the Container to start, without changing the default values of the liveness probe. This helps to protect against deadlocks.
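The arithmetic behind this rule can be worked through with a short Python sketch. The function names and the numeric values below are hypothetical and chosen only for illustration; they are not recommended settings.

```python
import math


def liveness_start_budget(initial_delay_seconds: int,
                          failure_threshold: int,
                          period_seconds: int) -> int:
    """Seconds a Container has to start before the liveness probe gives up:
    initialDelaySeconds + failureThreshold * periodSeconds."""
    return initial_delay_seconds + failure_threshold * period_seconds


def startup_failure_threshold(worst_case_startup_seconds: int,
                              period_seconds: int) -> int:
    """Smallest failureThreshold for a startup probe such that
    failureThreshold * periodSeconds covers the worst-case startup time."""
    return max(1, math.ceil(worst_case_startup_seconds / period_seconds))


# Hypothetical liveness probe: 15s initial delay, 3 failures allowed, probed every 10s.
budget = liveness_start_budget(15, 3, 10)
print(budget)  # 45

# A Container that may need up to 300s to start exceeds that budget,
# so a startup probe with a larger failureThreshold is a candidate.
print(300 > budget)                        # True
print(startup_failure_threshold(300, 10))  # 30
```

With these example numbers, a startup probe with failureThreshold 30 and periodSeconds 10 would allow up to 300 seconds of startup time while leaving the liveness probe settings untouched.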
For more information about how to set up a liveness, readiness, or startup probe, see Configure Liveness, Readiness and Startup Probes.
Pod and Container status
For detailed information about Pod and Container status, see PodStatus and ContainerStatus. Note that the information reported as Pod status depends on the current ContainerState.
Container States
Once a Pod is assigned to a node by the scheduler, the kubelet starts creating containers using the container runtime. There are three possible container states: Waiting, Running, and Terminated. To check the state of a container, you can use kubectl describe pod [POD_NAME]. The state is displayed for each container within that Pod.
- Waiting: Default state of a container. If a container is not in either the Running or Terminated state, it is in the Waiting state. A container in the Waiting state still runs its required operations, like pulling images, applying Secrets, etc. Along with this state, a message and reason about the state are displayed to provide more information.
  ...
  State: Waiting
  Reason: ErrImagePull
  ...
- Running: Indicates that the container is executing without issues. Once a container enters Running, the postStart hook (if any) is executed. This state also displays the time when the container entered the Running state.
  ...
  State: Running
  Started: Wed, 30 Jan 2019 16:46:38 +0530
  ...
- Terminated: Indicates that the container completed its execution and has stopped running. A container enters this state when it has successfully completed execution or when it has failed for some reason. Regardless, a reason and exit code are displayed, as well as the container’s start and finish time. Before a container enters Terminated, the preStop hook (if any) is executed.
  ...
  State: Terminated
  Reason: Completed
  Exit Code: 0
  Started: Wed, 30 Jan 2019 11:45:26 +0530
  Finished: Wed, 30 Jan 2019 11:45:26 +0530
  ...
Pod readiness gate
FEATURE STATE: Kubernetes v1.14 (stable)
In order to add extensibility to Pod readiness by enabling the injection of extra feedback or signals into PodStatus, Kubernetes 1.11 introduced a feature named Pod ready++. You can use the readinessGates field in the PodSpec to specify additional conditions to be evaluated for Pod readiness. If Kubernetes cannot find such a condition in the status.conditions field of a Pod, the status of the condition defaults to “False”. Below is an example:
kind: Pod
...
spec:
  readinessGates:
    - conditionType: "www.example.com/feature-1"
status:
  conditions:
    - type: Ready  # this is a builtin PodCondition
      status: "False"
      lastProbeTime: null
      lastTransitionTime: 2018-01-01T00:00:00Z
    - type: "www.example.com/feature-1"  # an extra PodCondition
      status: "False"
      lastProbeTime: null
      lastTransitionTime: 2018-01-01T00:00:00Z
  containerStatuses:
    - containerID: docker://abcd...
      ready: true
...
The new Pod conditions must comply with the Kubernetes label key format. Since the kubectl patch command still doesn’t support patching object status, the new Pod conditions have to be injected through the PATCH action using one of the KubeClient libraries.
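The request body of such a PATCH might look like the dict built below. This is a sketch under assumptions: the helper name readiness_gate_patch is hypothetical, and the body assumes a strategic merge patch against the Pod's status, in which entries of the conditions list are merged by their type. With a plain JSON merge patch, the whole conditions array would be replaced instead. The call that actually sends the body depends on the client library and is not shown.

```python
import json


def readiness_gate_patch(condition_type: str, is_true: bool, transition_time: str) -> dict:
    """Build a patch body that sets one custom Pod condition.

    transition_time is an RFC 3339 timestamp string supplied by the caller.
    """
    return {
        "status": {
            "conditions": [
                {
                    "type": condition_type,
                    # Condition status is a string, not a boolean.
                    "status": "True" if is_true else "False",
                    "lastTransitionTime": transition_time,
                }
            ]
        }
    }


body = readiness_gate_patch("www.example.com/feature-1", True, "2018-01-01T00:00:00Z")
print(json.dumps(body, indent=2))
```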
With the introduction of new Pod conditions, a Pod is evaluated to be ready only when both of the following statements are true:
- All containers in the Pod are ready.
- All conditions specified in readinessGates are “True”.
To facilitate this change to Pod readiness evaluation, a new Pod condition, ContainersReady, is introduced to capture the old Pod Ready condition.
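The readiness rule can be sketched as a small Python function. It illustrates the two statements above and is not the kubelet's actual code; the function name pod_is_ready and the plain-dict representation of conditions are assumptions made for the example.

```python
def pod_is_ready(container_ready_flags, readiness_gates, conditions) -> bool:
    """A Pod is ready only if all containers are ready AND every condition
    named in readinessGates has status "True".

    A gate whose condition is missing from conditions is treated as "False".
    """
    status_by_type = {c["type"]: c["status"] for c in conditions}
    containers_ready = all(container_ready_flags)
    gates_true = all(status_by_type.get(gate, "False") == "True"
                     for gate in readiness_gates)
    return containers_ready and gates_true


gates = ["www.example.com/feature-1"]

# Container is ready, but the custom condition is still "False": not ready.
print(pod_is_ready([True], gates,
                   [{"type": "www.example.com/feature-1", "status": "False"}]))  # False

# Container is ready and the custom condition is "True": ready.
print(pod_is_ready([True], gates,
                   [{"type": "www.example.com/feature-1", "status": "True"}]))   # True

# Custom condition missing entirely: defaults to "False", so not ready.
print(pod_is_ready([True], gates, []))                                           # False
```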
In K8s 1.11, as an alpha feature, the “Pod Ready++” feature has to be explicitly enabled by setting the PodReadinessGates feature gate to true. In K8s 1.12, the feature is enabled by default.
Restart policy
A PodSpec has a restartPolicy field with possible values Always, OnFailure, and Never. The default value is Always. restartPolicy applies to all Containers in the Pod. restartPolicy only refers to restarts of the Containers by the kubelet on the same node. Exited Containers that are restarted by the kubelet are restarted with an exponential back-off delay (10s, 20s, 40s …) capped at five minutes, and the delay is reset after ten minutes of successful execution. As discussed in the Pods document, once bound to a node, a Pod will never be rebound to another node.
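The back-off sequence described above can be sketched in a few lines of Python. The function name is hypothetical, and the sketch models only the doubling and the five-minute cap, not the reset after ten minutes of successful execution.

```python
def restart_backoff_delays(restarts: int, base_seconds: int = 10, cap_seconds: int = 300) -> list:
    """Delay in seconds before each of the first `restarts` restarts:
    doubles from base_seconds and is capped at cap_seconds (five minutes)."""
    return [min(base_seconds * 2 ** n, cap_seconds) for n in range(restarts)]


print(restart_backoff_delays(7))  # [10, 20, 40, 80, 160, 300, 300]
```

The sixth delay would be 320 seconds without the cap, so from that point on every restart waits the full five minutes.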
Pod lifetime
In general, Pods do not disappear until someone destroys them. This might be a human or a controller. The only exception to this rule is that Pods with a phase of Succeeded or Failed for more than some duration (determined by terminated-pod-gc-threshold in the master) will expire and be automatically destroyed.
Three types of controllers are available:
- Use a Job for Pods that are expected to terminate, for example, batch computations. Jobs are appropriate only for Pods with restartPolicy equal to OnFailure or Never.
- Use a ReplicationController, ReplicaSet, or Deployment for Pods that are not expected to terminate, for example, web servers. ReplicationControllers are appropriate only for Pods with a restartPolicy of Always.
- Use a DaemonSet for Pods that need to run one per machine, because they provide a machine-specific system service.
All three types of controllers contain a PodTemplate. It is recommended to create the appropriate controller and let it create Pods, rather than directly create Pods yourself. That is because Pods alone are not resilient to machine failures, but controllers are.
If a node dies or is disconnected from the rest of the cluster, Kubernetes applies a policy for setting the phase of all Pods on the lost node to Failed.
Examples
Advanced liveness probe example
Liveness probes are executed by the kubelet, so all requests are made in the kubelet network namespace.
apiVersion: v1
kind: Pod
metadata:
  labels:
    test: liveness
  name: liveness-http
spec:
  containers:
  - args:
    - /server
    image: k8s.gcr.io/liveness
    livenessProbe:
      httpGet:
        # when "host" is not defined, "PodIP" will be used
        # host: my-host
        # when "scheme" is not defined, "HTTP" scheme will be used. Only "HTTP" and "HTTPS" are allowed
        # scheme: HTTPS
        path: /healthz
        port: 8080
        httpHeaders:
        - name: X-Custom-Header
          value: Awesome
      initialDelaySeconds: 15
      timeoutSeconds: 1
    name: liveness
Example states
- Pod is running and has one Container. Container exits with success.
  - Log completion event.
  - If restartPolicy is:
    - Always: Restart Container; Pod phase stays Running.
    - OnFailure: Pod phase becomes Succeeded.
    - Never: Pod phase becomes Succeeded.
- Pod is running and has one Container. Container exits with failure.
  - Log failure event.
  - If restartPolicy is:
    - Always: Restart Container; Pod phase stays Running.
    - OnFailure: Restart Container; Pod phase stays Running.
    - Never: Pod phase becomes Failed.
- Pod is running and has two Containers. Container 1 exits with failure.
  - Log failure event.
  - If restartPolicy is:
    - Always: Restart Container; Pod phase stays Running.
    - OnFailure: Restart Container; Pod phase stays Running.
    - Never: Do not restart Container; Pod phase stays Running.
  - If Container 1 is not running, and Container 2 exits:
    - Log failure event.
    - If restartPolicy is:
      - Always: Restart Container; Pod phase stays Running.
      - OnFailure: Restart Container; Pod phase stays Running.
      - Never: Pod phase becomes Failed.
- Pod is running and has one Container. Container runs out of memory.
  - Container terminates in failure.
  - Log OOM event.
  - If restartPolicy is:
    - Always: Restart Container; Pod phase stays Running.
    - OnFailure: Restart Container; Pod phase stays Running.
    - Never: Log failure event; Pod phase becomes Failed.
- Pod is running, and a disk dies.
  - Kill all Containers.
  - Log appropriate event.
  - Pod phase becomes Failed.
  - If running under a controller, Pod is recreated elsewhere.
- Pod is running, and its node is segmented out.
  - Node controller waits for timeout.
  - Node controller sets Pod phase to Failed.
  - If running under a controller, Pod is recreated elsewhere.
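The container-exit cases above follow a common pattern, which the Python sketch below tries to capture. It is a simplified model of the listed examples only, not the kubelet's logic: the function name and its parameters are hypothetical, and the disk and node failure cases are not covered.

```python
RESTART_POLICIES = ("Always", "OnFailure", "Never")


def after_container_exit(restart_policy: str, exited_ok: bool,
                         others_running: bool = False,
                         earlier_failure: bool = False):
    """Return (restart_container, pod_phase) after one Container exits.

    others_running: another Container in the Pod is still running.
    earlier_failure: another Container already terminated in failure.
    """
    if restart_policy not in RESTART_POLICIES:
        raise ValueError(f"unknown restartPolicy: {restart_policy!r}")
    restart = restart_policy == "Always" or (
        restart_policy == "OnFailure" and not exited_ok
    )
    # The Pod stays Running while any Container runs or is being restarted.
    if restart or others_running:
        return restart, "Running"
    # All Containers have terminated and none will be restarted.
    if exited_ok and not earlier_failure:
        return False, "Succeeded"
    return False, "Failed"


print(after_container_exit("Always", exited_ok=True))      # (True, 'Running')
print(after_container_exit("OnFailure", exited_ok=True))   # (False, 'Succeeded')
print(after_container_exit("Never", exited_ok=False))      # (False, 'Failed')
print(after_container_exit("Never", exited_ok=False, others_running=True))  # (False, 'Running')
```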
What's next
- Get hands-on experience attaching handlers to Container lifecycle events.
- Get hands-on experience configuring liveness, readiness and startup probes.
- Learn more about Container lifecycle hooks.
This work is licensed under a CC license; reprints must credit the author and link to this article.