Commit 43a8d5d

danielvegamyhre, alculquicondor, and tengqm authored
Update Job docs to include info about enabling pod-to-pod communication within a job using pod hostnames (#37771)
* Update Job docs to include info about using a headless service to enable pod communication via pod hostnames
* Change section title
* fix phrasing
* update yaml example
* update label selector
* more specific phrasing
* address comments and add new example
* add note about pod dns policies
* minor fixes
* add link to job patterns
* Update content/en/docs/tasks/job/intra-job-pod-networking-using-pod-hostnames.md Co-authored-by: Aldo Culquicondor <[email protected]>
* Update content/en/docs/tasks/job/intra-job-pod-networking-using-pod-hostnames.md Co-authored-by: Aldo Culquicondor <[email protected]>
* Update content/en/docs/tasks/job/intra-job-pod-networking-using-pod-hostnames.md Co-authored-by: Aldo Culquicondor <[email protected]>
* Update content/en/docs/tasks/job/intra-job-pod-networking-using-pod-hostnames.md Co-authored-by: Aldo Culquicondor <[email protected]>
* Update content/en/docs/concepts/workloads/controllers/job.md Co-authored-by: Aldo Culquicondor <[email protected]>
* address comments
* clarify sentence
* move minikube note to prereqs
* address comments
* captitalize all instances of Job
* move minikube notes to bottom of prereqs
* address comments
* update example
* fix typo
* update phrasing
* link to this from the completion modes section of the job docs
* address phrasing comments
* add newlines to break up block of text
* update phrasing
* update phrasing
* Update content/en/docs/concepts/workloads/controllers/job.md Co-authored-by: Qiming Teng <[email protected]>
* Update content/en/docs/tasks/job/job-with-pod-to-pod-communication.md Co-authored-by: Qiming Teng <[email protected]>
* Update content/en/docs/tasks/job/job-with-pod-to-pod-communication.md Co-authored-by: Qiming Teng <[email protected]>
* Update content/en/docs/tasks/job/job-with-pod-to-pod-communication.md Co-authored-by: Qiming Teng <[email protected]>
* Update content/en/docs/tasks/job/job-with-pod-to-pod-communication.md Co-authored-by: Qiming Teng <[email protected]>
* Update content/en/docs/tasks/job/job-with-pod-to-pod-communication.md Co-authored-by: Qiming Teng <[email protected]>
* Update content/en/docs/tasks/job/job-with-pod-to-pod-communication.md Co-authored-by: Qiming Teng <[email protected]>
* Update content/en/docs/tasks/job/job-with-pod-to-pod-communication.md Co-authored-by: Qiming Teng <[email protected]>
* Update content/en/docs/tasks/job/job-with-pod-to-pod-communication.md Co-authored-by: Qiming Teng <[email protected]>

Co-authored-by: Aldo Culquicondor <[email protected]>
Co-authored-by: Qiming Teng <[email protected]>
1 parent 8997daa commit 43a8d5d

2 files changed: +144 -13 lines changed

content/en/docs/concepts/workloads/controllers/job.md

Lines changed: 17 additions & 13 deletions
@@ -263,7 +263,8 @@ Jobs with _fixed completion count_ - that is, jobs that have non null
 - As part of the Pod hostname, following the pattern `$(job-name)-$(index)`.
 When you use an Indexed Job in combination with a
 {{< glossary_tooltip term_id="Service" >}}, Pods within the Job can use
-the deterministic hostnames to address each other via DNS.
+the deterministic hostnames to address each other via DNS. For more information about
+how to configure this, see [Job with Pod-to-Pod Communication](/docs/tasks/job/job-with-pod-to-pod-communication/).
 - From the containerized task, in the environment variable `JOB_COMPLETION_INDEX`.
 
 The Job is considered complete when there is one successfully completed Pod
@@ -461,12 +462,13 @@ The tradeoffs are:
 The tradeoffs are summarized here, with columns 2 to 4 corresponding to the above tradeoffs.
 The pattern names are also links to examples and more detailed description.
 
-| Pattern                                   | Single Job object | Fewer pods than work items? | Use app unmodified? |
-| ----------------------------------------- |:-----------------:|:---------------------------:|:-------------------:|
-| [Queue with Pod Per Work Item]            |         ✓         |                             |      sometimes      |
-| [Queue with Variable Pod Count]           |         ✓         |              ✓              |                     |
-| [Indexed Job with Static Work Assignment] |         ✓         |                             |          ✓          |
-| [Job Template Expansion]                  |                   |                             |          ✓          |
+| Pattern                                         | Single Job object | Fewer pods than work items? | Use app unmodified? |
+| ----------------------------------------------- |:-----------------:|:---------------------------:|:-------------------:|
+| [Queue with Pod Per Work Item]                  |         ✓         |                             |      sometimes      |
+| [Queue with Variable Pod Count]                 |         ✓         |              ✓              |                     |
+| [Indexed Job with Static Work Assignment]       |         ✓         |                             |          ✓          |
+| [Job Template Expansion]                        |                   |                             |          ✓          |
+| [Job with Pod-to-Pod Communication]             |         ✓         |          sometimes          |      sometimes      |
 
 When you specify completions with `.spec.completions`, each Pod created by the Job controller
 has an identical [`spec`](https://git.k8s.io/community/contributors/devel/sig-architecture/api-conventions.md#spec-and-status). This means that
@@ -477,17 +479,19 @@ are different ways to arrange for pods to work on different things.
 This table shows the required settings for `.spec.parallelism` and `.spec.completions` for each of the patterns.
 Here, `W` is the number of work items.
 
-| Pattern                                   | `.spec.completions` |  `.spec.parallelism` |
-| ----------------------------------------- |:-------------------:|:--------------------:|
-| [Queue with Pod Per Work Item]            |          W          |         any          |
-| [Queue with Variable Pod Count]           |        null         |         any          |
-| [Indexed Job with Static Work Assignment] |          W          |         any          |
-| [Job Template Expansion]                  |          1          |     should be 1      |
+| Pattern                                         | `.spec.completions` |  `.spec.parallelism` |
+| ----------------------------------------------- |:-------------------:|:--------------------:|
+| [Queue with Pod Per Work Item]                  |          W          |         any          |
+| [Queue with Variable Pod Count]                 |        null         |         any          |
+| [Indexed Job with Static Work Assignment]       |          W          |         any          |
+| [Job Template Expansion]                        |          1          |     should be 1      |
+| [Job with Pod-to-Pod Communication]             |          W          |          W           |
 
 [Queue with Pod Per Work Item]: /docs/tasks/job/coarse-parallel-processing-work-queue/
 [Queue with Variable Pod Count]: /docs/tasks/job/fine-parallel-processing-work-queue/
 [Indexed Job with Static Work Assignment]: /docs/tasks/job/indexed-parallel-processing-static/
 [Job Template Expansion]: /docs/tasks/job/parallel-processing-expansion/
+[Job with Pod-to-Pod Communication]: /docs/tasks/job/job-with-pod-to-pod-communication/
 
 ## Advanced usage
 
content/en/docs/tasks/job/job-with-pod-to-pod-communication.md

Lines changed: 127 additions & 0 deletions
@@ -0,0 +1,127 @@
---
title: Job with Pod-to-Pod Communication
content_type: task
min-kubernetes-server-version: v1.21
weight: 30
---

<!-- overview -->

In this example, you will run a Job in [Indexed completion mode](/blog/2021/04/19/introducing-indexed-jobs/) configured such that
the pods created by the Job can communicate with each other using pod hostnames rather than pod IP addresses.

Pods within a Job might need to communicate among themselves. The user workload running in each pod could query the Kubernetes API server
to learn the IPs of the other Pods, but it's much simpler to rely on Kubernetes' built-in DNS resolution.

Jobs in Indexed completion mode automatically set the pods' hostname to be in the format of
`${jobName}-${completionIndex}`. You can use this format to deterministically build
pod hostnames and enable pod communication *without* needing to create a client connection to
the Kubernetes control plane to obtain pod hostnames/IPs via API requests.

This configuration is useful for use cases where pod networking is required but you don't want
to depend on a network connection with the Kubernetes API server.
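
As a minimal sketch of the `${jobName}-${completionIndex}` format described above, the shell function below prints the hostname of every pod in an Indexed Job. The function name `job_pod_hostnames` and its arguments are hypothetical and exist only for illustration; they are not part of Kubernetes or `kubectl`.

```shell
# Hypothetical helper (not a Kubernetes tool): print one hostname per pod of an
# Indexed Job, following the <job-name>-<completion-index> pattern.
# Arguments: <job-name> <completions>
# The function body runs in a subshell so its variables do not leak.
job_pod_hostnames() (
  job_name="$1"     # name of the Job, for example "example-job"
  completions="$2"  # value of .spec.completions
  index=0
  while [ "$index" -lt "$completions" ]; do
    printf '%s-%s\n' "$job_name" "$index"
    index=$((index + 1))
  done
)

# Example: a Job named "example-job" with 3 completions.
job_pod_hostnames example-job 3
```

Because the names depend only on the Job name and the completion count, a workload can compute them locally without calling the API server.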

## {{% heading "prerequisites" %}}

You should already be familiar with the basic use of [Job](/docs/concepts/workloads/controllers/job/).

{{< include "task-tutorial-prereqs.md" >}} {{< version-check >}}

{{<note>}}
If you are using minikube or a similar tool, you may need to take
[extra steps](https://minikube.sigs.k8s.io/docs/handbook/addons/ingress-dns/)
to ensure that DNS is available in your cluster.
{{</note>}}

<!-- steps -->

## Starting a Job with Pod-to-Pod Communication

To enable pod-to-pod communication using pod hostnames in a Job, you must do the following:

1. Set up a [headless service](/docs/concepts/services-networking/service/#headless-services)
   with a valid label selector for the pods created by your Job. The headless service must be in the same namespace as
   the Job. One easy way to do this is to use the `job-name: <your-job-name>` selector, since Kubernetes automatically
   adds the `job-name` label to those pods. This configuration causes the DNS system to create records for the hostnames of
   the pods running your Job.

2. Configure the headless service as the subdomain service for the Job pods by including the following value in your Job template spec:

   ```yaml
   subdomain: <headless-svc-name>
   ```

### Example

Below is a working example of a Job with pod-to-pod communication via pod hostnames enabled.
The Job is completed only after all pods successfully ping each other using hostnames.

{{<note>}}
In the Bash script executed on each pod in the example below, the pod hostnames can be suffixed
with the namespace as well if the pod needs to be reached from outside the namespace.
{{</note>}}

```yaml
apiVersion: v1
kind: Service
metadata:
  name: headless-svc
spec:
  clusterIP: None # clusterIP must be None to create a headless service
  selector:
    job-name: example-job # must match Job name
---
apiVersion: batch/v1
kind: Job
metadata:
  name: example-job
spec:
  completions: 3
  parallelism: 3
  completionMode: Indexed
  template:
    spec:
      subdomain: headless-svc # has to match Service name
      restartPolicy: Never
      containers:
      - name: example-workload
        image: bash:latest
        command:
        - bash
        - -c
        - |
          for i in 0 1 2
          do
            gotStatus="-1"
            wantStatus="0"
            while [ $gotStatus -ne $wantStatus ]
            do
              ping -c 1 example-job-${i}.headless-svc > /dev/null 2>&1
              gotStatus=$?
              if [ $gotStatus -ne $wantStatus ]; then
                echo "Failed to ping pod example-job-${i}.headless-svc, retrying in 1 second..."
                sleep 1
              fi
            done
            echo "Successfully pinged pod: example-job-${i}.headless-svc"
          done
```

After applying the example above, the pods can reach each other over the network
using the name format `<pod-hostname>.<headless-service-name>`. If you fetch the logs of one of the pods,
you should see output similar to the following:

```shell
kubectl logs example-job-0-qws42
```

```
Failed to ping pod example-job-0.headless-svc, retrying in 1 second...
Successfully pinged pod: example-job-0.headless-svc
Successfully pinged pod: example-job-1.headless-svc
Successfully pinged pod: example-job-2.headless-svc
```

{{<note>}}
Keep in mind that the `<pod-hostname>.<headless-service-name>` name format used
in this example would not work with DNS policy set to `None` or `Default`.
You can learn more about pod DNS policies [here](/docs/concepts/services-networking/dns-pod-service/#pod-s-dns-policy).
{{</note>}}
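
The script in the example hard-codes the indexes `0 1 2`. As a rough sketch only, the same retry loop can be written for any completion count. The function name `wait_for_peers` and the `PROBE_CMD` and `RETRY_DELAY` variables are hypothetical names introduced here for illustration; they are not Kubernetes settings. The probe defaults to `ping -c 1`, as in the example, and is assumed to exit with status 0 when the peer is reachable.

```shell
# Hypothetical helper: block until every pod of an Indexed Job answers a probe.
# Arguments: <job-name> <headless-service-name> <completions>
# PROBE_CMD and RETRY_DELAY are illustrative knobs, not Kubernetes settings.
# The function body runs in a subshell so its variables do not leak.
wait_for_peers() (
  job_name="$1"
  service="$2"
  completions="$3"
  probe="${PROBE_CMD:-ping -c 1}"  # command that exits 0 when a peer is reachable
  delay="${RETRY_DELAY:-1}"        # seconds to wait between attempts
  index=0
  while [ "$index" -lt "$completions" ]; do
    host="${job_name}-${index}.${service}"
    # $probe is intentionally unquoted so that "ping -c 1" splits into words.
    until $probe "$host" > /dev/null 2>&1; do
      echo "Failed to reach pod ${host}, retrying in ${delay} second(s)..."
      sleep "$delay"
    done
    echo "Successfully reached pod: ${host}"
    index=$((index + 1))
  done
)

# Inside the Job's container this could be called as, for example:
#   wait_for_peers example-job headless-svc 3
```

Keeping the probe command configurable makes it possible to exercise the loop outside a cluster, for instance by setting `PROBE_CMD=true`.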
