Jobs - Run to Completion

A Job creates one or more Pods and ensures that a specified number of them successfully terminate. As Pods successfully complete, the Job tracks the successful completions. When a specified number of successful completions is reached, the task (i.e., the Job) is complete. Deleting a Job will clean up the Pods it created.

A simple case is to create one Job object in order to reliably run one Pod to completion. The Job object will start a new Pod if the first Pod fails or is deleted (for example due to a node hardware failure or a node reboot).

You can also use a Job to run multiple Pods in parallel.

Running an example Job

Here is an example Job config. It computes π to 2000 places and prints it out. It takes around 10s to complete.

controllers/job.yaml

apiVersion: batch/v1
kind: Job
metadata:
  name: pi
spec:
  template:
    spec:
      containers:
      - name: pi
        image: perl
        command: ["perl",  "-Mbignum=bpi", "-wle", "print bpi(2000)"]
      restartPolicy: Never
  backoffLimit: 4

You can run the example with this command:

kubectl apply -f https://k8s.io/examples/controllers/job.yaml
job "pi" created

Check on the status of the Job with kubectl:

kubectl describe jobs/pi
Name:             pi
Namespace:        default
Selector:         controller-uid=b1db589a-2c8d-11e6-b324-0209dc45a495
Labels:           controller-uid=b1db589a-2c8d-11e6-b324-0209dc45a495
                  job-name=pi
Annotations:      <none>
Parallelism:      1
Completions:      1
Start Time:       Tue, 07 Jun 2016 10:56:16 +0200
Pods Statuses:    0 Running / 1 Succeeded / 0 Failed
Pod Template:
  Labels:       controller-uid=b1db589a-2c8d-11e6-b324-0209dc45a495
                job-name=pi
  Containers:
   pi:
    Image:      perl
    Port:
    Command:
      perl
      -Mbignum=bpi
      -wle
      print bpi(2000)
    Environment:        <none>
    Mounts:             <none>
  Volumes:              <none>
Events:
  FirstSeen    LastSeen    Count    From            SubobjectPath    Type        Reason            Message
  ---------    --------    -----    ----            -------------    --------    ------            -------
  1m           1m          1        {job-controller }                Normal      SuccessfulCreate  Created pod: pi-dtn4q

To view completed Pods of a Job, use kubectl get pods.

To list all the Pods that belong to a Job in a machine readable form, you can use a command like this:

pods=$(kubectl get pods --selector=job-name=pi --output=jsonpath='{.items[*].metadata.name}')
echo $pods
pi-aiw0a

Here, the selector is the same as the selector for the Job. The --output=jsonpath option specifies an expression that just gets the name from each Pod in the returned list.
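
The same jsonpath mechanism can extract other fields as well. As a sketch, the following prints each Pod's name together with its phase:

kubectl get pods --selector=job-name=pi \
  --output=jsonpath='{range .items[*]}{.metadata.name} {.status.phase}{"\n"}{end}'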

View the standard output of one of the pods:

kubectl logs $pods

The output is similar to this:

3.1415926535897932....

Writing a Job Spec

As with all other Kubernetes config, a Job needs apiVersion, kind, and metadata fields.

A Job also needs a .spec section.

Pod Template

The .spec.template is the only required field of the .spec.

The .spec.template is a pod template. It has exactly the same schema as a Pod, except it is nested and does not have an apiVersion or kind.

In addition to required fields for a Pod, a pod template in a Job must specify appropriate labels (see pod selector) and an appropriate restart policy.

Only a RestartPolicy equal to Never or OnFailure is allowed.

Pod Selector

The .spec.selector field is optional. In almost all cases you should not specify it. See the section on specifying your own pod selector.

Parallel Jobs

There are three main types of task suitable to run as a Job:

  1. Non-parallel Jobs

    • normally, only one Pod is started, unless the Pod fails.
    • the Job is complete as soon as its Pod terminates successfully.
  2. Parallel Jobs with a fixed completion count:

    • specify a non-zero positive value for .spec.completions.
    • the Job represents the overall task, and is complete when there is one successful Pod for each value in the range 1 to .spec.completions.
    • not implemented yet: Each Pod is passed a different index in the range 1 to .spec.completions.
  3. Parallel Jobs with a work queue:

    • do not specify .spec.completions; it defaults to .spec.parallelism.
    • the Pods must coordinate amongst themselves or with an external service to determine what each should work on. For example, a Pod might fetch a batch of up to N items from the work queue.
    • each Pod is independently capable of determining whether or not all its peers are done, and thus that the entire Job is done.
    • when any Pod from the Job terminates with success, no new Pods are created.
    • once at least one Pod has terminated with success and all Pods are terminated, then the Job is completed with success.
    • once any Pod has exited with success, no other Pod should still be doing any work for this task or writing any output. They should all be in the process of exiting.

For a non-parallel Job, you can leave both .spec.completions and .spec.parallelism unset. When both are unset, both are defaulted to 1.

For a fixed completion count Job, you should set .spec.completions to the number of completions needed. You can set .spec.parallelism, or leave it unset and it will default to 1.

For a work queue Job, you must leave .spec.completions unset, and set .spec.parallelism to a non-negative integer.
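
Putting these settings together, here is a minimal sketch of a fixed completion count Job (the name word-count, the image, and the command are hypothetical placeholders, not from this page):

apiVersion: batch/v1
kind: Job
metadata:
  name: word-count          # hypothetical name
spec:
  completions: 8            # the Job is done once 8 Pods have succeeded
  parallelism: 2            # at most 2 Pods run at any instant
  template:
    spec:
      containers:
      - name: worker
        image: busybox      # stand-in image; substitute your own worker
        command: ["sh", "-c", "echo processing one work item"]
      restartPolicy: Never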

For more information about how to make use of the different types of job, see the job patterns section.

Controlling Parallelism

The requested parallelism (.spec.parallelism) can be set to any non-negative value. If it is unspecified, it defaults to 1. If it is specified as 0, then the Job is effectively paused until it is increased.
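
Since .spec.parallelism can be changed on a live Job, pausing and resuming can be done with a patch; a sketch, assuming the pi Job from the example above:

kubectl patch job/pi -p '{"spec":{"parallelism":0}}'   # pause: no new Pods are started
kubectl patch job/pi -p '{"spec":{"parallelism":1}}'   # resume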

Actual parallelism (the number of pods running at any instant) may be more or less than requested parallelism, for a variety of reasons:

  • For fixed completion count Jobs, the actual number of pods running in parallel will not exceed the number of remaining completions. Higher values of .spec.parallelism are effectively ignored.
  • For work queue Jobs, no new Pods are started after any Pod has succeeded; remaining Pods are allowed to complete, however.
  • If the Job controller has not had time to react.
  • If the Job controller failed to create Pods for any reason (lack of ResourceQuota, lack of permission, etc.), then there may be fewer pods than requested.
  • The Job controller may throttle new Pod creation due to excessive previous pod failures in the same Job.
  • When a Pod is gracefully shut down, it takes time to stop.

Handling Pod and Container Failures

A container in a Pod may fail for a number of reasons, such as because the process in it exited with a non-zero exit code, or the container was killed for exceeding a memory limit, etc. If this happens, and .spec.template.spec.restartPolicy = "OnFailure", then the Pod stays on the node, but the container is re-run. Therefore, your program needs to handle the case when it is restarted locally, or else specify .spec.template.spec.restartPolicy = "Never". See pod lifecycle for more information on restartPolicy.

An entire Pod can also fail, for a number of reasons, such as when the pod is kicked off the node (node is upgraded, rebooted, deleted, etc.), or if a container of the Pod fails and the .spec.template.spec.restartPolicy = "Never". When a Pod fails, then the Job controller starts a new Pod. This means that your application needs to handle the case when it is restarted in a new pod. In particular, it needs to handle temporary files, locks, incomplete output and the like caused by previous runs.

Note that even if you specify .spec.parallelism = 1 and .spec.completions = 1 and .spec.template.spec.restartPolicy = "Never", the same program may sometimes be started twice.

If you do specify .spec.parallelism and .spec.completions both greater than 1, then there may be multiple pods running at once. Therefore, your pods must also be tolerant of concurrency.

Pod backoff failure policy

There are situations where you want to fail a Job after some amount of retries due to a logical error in configuration etc. To do so, set .spec.backoffLimit to specify the number of retries before considering a Job as failed. The back-off limit is set by default to 6. Failed Pods associated with the Job are recreated by the Job controller with an exponential back-off delay (10s, 20s, 40s …) capped at six minutes. The back-off count is reset if no new failed Pods appear before the Job’s next status check.

Note: Issue #54870 still exists for versions of Kubernetes prior to version 1.12

Note: If your job has restartPolicy = "OnFailure", keep in mind that your container running the Job will be terminated once the job backoff limit has been reached. This can make debugging the Job’s executable more difficult. We suggest setting restartPolicy = "Never" when debugging the Job or using a logging system to ensure output from failed Jobs is not lost inadvertently.

Job Termination and Cleanup

When a Job completes, no more Pods are created, but the Pods are not deleted either. Keeping them around allows you to still view the logs of completed pods to check for errors, warnings, or other diagnostic output. The job object also remains after it is completed so that you can view its status. It is up to the user to delete old jobs after noting their status. Delete the job with kubectl (e.g. kubectl delete jobs/pi or kubectl delete -f ./job.yaml). When you delete the job using kubectl, all the pods it created are deleted too.

By default, a Job will run uninterrupted unless a Pod fails (restartPolicy=Never) or a Container exits in error (restartPolicy=OnFailure), at which point the Job defers to the .spec.backoffLimit described above. Once .spec.backoffLimit has been reached the Job will be marked as failed and any running Pods will be terminated.

Another way to terminate a Job is by setting an active deadline. Do this by setting the .spec.activeDeadlineSeconds field of the Job to a number of seconds. The activeDeadlineSeconds applies to the duration of the job, no matter how many Pods are created. Once a Job reaches activeDeadlineSeconds, all of its running Pods are terminated and the Job status will become type: Failed with reason: DeadlineExceeded.

Note that a Job’s .spec.activeDeadlineSeconds takes precedence over its .spec.backoffLimit. Therefore, a Job that is retrying one or more failed Pods will not deploy additional Pods once it reaches the time limit specified by activeDeadlineSeconds, even if the backoffLimit is not yet reached.

Example:

apiVersion: batch/v1
kind: Job
metadata:
  name: pi-with-timeout
spec:
  backoffLimit: 5
  activeDeadlineSeconds: 100
  template:
    spec:
      containers:
      - name: pi
        image: perl
        command: ["perl",  "-Mbignum=bpi", "-wle", "print bpi(2000)"]
      restartPolicy: Never

Note that both the Job spec and the Pod template spec within the Job have an activeDeadlineSeconds field. Ensure that you set this field at the proper level.

Clean Up Finished Jobs Automatically

Finished Jobs are usually no longer needed in the system. Keeping them around in the system will put pressure on the API server. If the Jobs are managed directly by a higher level controller, such as CronJobs, the Jobs can be cleaned up by CronJobs based on the specified capacity-based cleanup policy.
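
For example, a CronJob's history-limit fields implement this capacity-based cleanup; a sketch of the relevant spec fragment (the values are placeholders):

spec:
  successfulJobsHistoryLimit: 3   # keep at most 3 completed Jobs
  failedJobsHistoryLimit: 1       # keep at most 1 failed Job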

TTL Mechanism for Finished Jobs

FEATURE STATE: Kubernetes v1.12 alpha

Another way to clean up finished Jobs (either Complete or Failed) automatically is to use a TTL mechanism provided by a TTL controller for finished resources, by specifying the .spec.ttlSecondsAfterFinished field of the Job.

When the TTL controller cleans up the Job, it will delete the Job cascadingly, i.e. delete its dependent objects, such as Pods, together with the Job. Note that when the Job is deleted, its lifecycle guarantees, such as finalizers, will be honored.

For example:

apiVersion: batch/v1
kind: Job
metadata:
  name: pi-with-ttl
spec:
  ttlSecondsAfterFinished: 100
  template:
    spec:
      containers:
      - name: pi
        image: perl
        command: ["perl",  "-Mbignum=bpi", "-wle", "print bpi(2000)"]
      restartPolicy: Never

The Job pi-with-ttl will be eligible to be automatically deleted, 100 seconds after it finishes.

If the field is set to 0, the Job will be eligible to be automatically deleted immediately after it finishes. If the field is unset, this Job won’t be cleaned up by the TTL controller after it finishes.

Note that this TTL mechanism is alpha, with feature gate TTLAfterFinished. For more information, see the documentation for the TTL controller for finished resources.

Job Patterns

The Job object can be used to support reliable parallel execution of Pods. The Job object is not designed to support closely-communicating parallel processes, as commonly found in scientific computing. It does support parallel processing of a set of independent but related work items. These might be emails to be sent, frames to be rendered, files to be transcoded, ranges of keys in a NoSQL database to scan, and so on.

In a complex system, there may be multiple different sets of work items. Here we are just considering one set of work items that the user wants to manage together: a batch job.

There are several different patterns for parallel computation, each with strengths and weaknesses. The tradeoffs are:

  • One Job object for each work item, vs. a single Job object for all work items. The latter is better for large numbers of work items. The former creates some overhead for the user and for the system to manage large numbers of Job objects.
  • Number of pods created equals number of work items, vs. each Pod can process multiple work items. The former typically requires less modification to existing code and containers. The latter is better for large numbers of work items, for similar reasons to the previous bullet.
  • Several approaches use a work queue. This requires running a queue service, and modifications to the existing program or container to make it use the work queue. Other approaches are easier to adapt to an existing containerised application.

The tradeoffs are summarized here, with columns 2 to 4 corresponding to the above tradeoffs. The pattern names are also links to examples and more detailed description.

Pattern                                 | Single Job object | Fewer pods than work items? | Use app unmodified? | Works in Kube 1.1?
Job Template Expansion                  |                   |                             | ✓                   | ✓
Queue with Pod Per Work Item            | ✓                 |                             | sometimes           | ✓
Queue with Variable Pod Count           | ✓                 | ✓                           |                     | ✓
Single Job with Static Work Assignment  | ✓                 |                             | ✓                   |

When you specify completions with .spec.completions, each Pod created by the Job controller has an identical spec. This means that all pods for a task will have the same command line and the same image, the same volumes, and (almost) the same environment variables. These patterns are different ways to arrange for pods to work on different things.

This table shows the required settings for .spec.parallelism and .spec.completions for each of the patterns. Here, W is the number of work items.

Pattern                                 | .spec.completions | .spec.parallelism
Job Template Expansion                  | 1                 | should be 1
Queue with Pod Per Work Item            | W                 | any
Queue with Variable Pod Count           | 1                 | any
Single Job with Static Work Assignment  | W                 | any
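
To illustrate the Job Template Expansion pattern, a sketch: keep a manifest template containing a $ITEM placeholder, expand it once per work item, and create one Job per item (the file names, items, and placeholder are illustrative):

# job-tmpl.yaml is a Job manifest whose metadata.name and command
# contain the literal string $ITEM, e.g. name: process-item-$ITEM
mkdir -p ./jobs
for item in apple banana cherry; do
  sed "s/\$ITEM/${item}/g" job-tmpl.yaml > ./jobs/job-${item}.yaml
done
kubectl create -f ./jobs    # creates one Job per expanded manifest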

Advanced Usage

Specifying your own pod selector

Normally, when you create a Job object, you do not specify .spec.selector. The system defaulting logic adds this field when the Job is created. It picks a selector value that will not overlap with any other jobs.

However, in some cases, you might need to override this automatically set selector. To do this, you can specify the .spec.selector of the Job.

Be very careful when doing this. If you specify a label selector which is not unique to the pods of that Job, and which matches unrelated Pods, then pods of the unrelated job may be deleted, or this Job may count other Pods as completing it, or one or both Jobs may refuse to create Pods or run to completion. If a non-unique selector is chosen, then other controllers (e.g. ReplicationController) and their Pods may behave in unpredictable ways too. Kubernetes will not stop you from making a mistake when specifying .spec.selector.

Here is an example of a case when you might want to use this feature.

Say Job old is already running. You want existing Pods to keep running, but you want the rest of the Pods it creates to use a different pod template and for the Job to have a new name. You cannot update the Job because these fields are not updatable. Therefore, you delete Job old but leave its pods running, using kubectl delete jobs/old --cascade=false. Before deleting it, you make a note of what selector it uses:

kubectl get job old -o yaml
kind: Job
metadata:
  name: old
  ...
spec:
  selector:
    matchLabels:
      controller-uid: a8f3d00d-c6d2-11e5-9f87-42010af00002
  ...

Then you create a new Job with name new and you explicitly specify the same selector. Since the existing Pods have label controller-uid=a8f3d00d-c6d2-11e5-9f87-42010af00002, they are controlled by Job new as well.

You need to specify manualSelector: true in the new Job since you are not using the selector that the system normally generates for you automatically.

kind: Job
metadata:
  name: new
  ...
spec:
  manualSelector: true
  selector:
    matchLabels:
      controller-uid: a8f3d00d-c6d2-11e5-9f87-42010af00002
  ...

The new Job itself will have a different uid from a8f3d00d-c6d2-11e5-9f87-42010af00002. Setting manualSelector: true tells the system that you know what you are doing and to allow this mismatch.

Alternatives

Bare Pods

When the node that a Pod is running on reboots or fails, the pod is terminated and will not be restarted. However, a Job will create new Pods to replace terminated ones. For this reason, we recommend that you use a Job rather than a bare Pod, even if your application requires only a single Pod.

Replication Controller

Jobs are complementary to Replication Controllers. A Replication Controller manages Pods which are not expected to terminate (e.g. web servers), and a Job manages Pods that are expected to terminate (e.g. batch tasks).

As discussed in Pod Lifecycle, Job is only appropriate for pods with RestartPolicy equal to OnFailure or Never. (Note: If RestartPolicy is not set, the default value is Always.)

Single Job starts Controller Pod

Another pattern is for a single Job to create a Pod which then creates other Pods, acting as a sort of custom controller for those Pods. This allows the most flexibility, but may be somewhat complicated to get started with and offers less integration with Kubernetes.

One example of this pattern would be a Job which starts a Pod which runs a script that in turn starts a Spark master controller (see spark example), runs a spark driver, and then cleans up.

An advantage of this approach is that the overall process gets the completion guarantee of a Job object, while retaining complete control over what Pods are created and how work is assigned to them.

Cron Jobs

You can use a CronJob to create a Job that will run at specified times/dates, similar to the Unix tool cron.
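
A minimal sketch, assuming the batch/v1beta1 API available in this Kubernetes version (the name and schedule are placeholders):

apiVersion: batch/v1beta1
kind: CronJob
metadata:
  name: pi-nightly                # hypothetical name
spec:
  schedule: "0 2 * * *"           # standard cron syntax: daily at 02:00
  jobTemplate:
    spec:
      template:
        spec:
          containers:
          - name: pi
            image: perl
            command: ["perl", "-Mbignum=bpi", "-wle", "print bpi(2000)"]
          restartPolicy: Never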
