Commit 44c0c10

Add how-to guides for basic DRA task
- Enable and set up DRA (for cluster admins)
- Allocate devices with DRA (for workload operators)
1 parent c05d82f commit 44c0c10

4 files changed

+386
-4
lines changed

content/en/docs/concepts/scheduling-eviction/dynamic-resource-allocation.md

Lines changed: 5 additions & 4 deletions
Original file line numberDiff line numberDiff line change
@@ -143,7 +143,7 @@ specific attributes. A ResourceClaim that references the DeviceClass can then
143143
request specific configurations within the DeviceClass.
144144

145145
To create a DeviceClass, see
146-
[Dynamically Allocate Devices to Workloads with DRA](/docs/tasks/configure-pod-container/assign-resources/allocate-devices-dra/).
146+
[Set Up DRA in a Cluster](/docs/tasks/configure-pod-container/assign-resources/set-up-dra-cluster).
147147

148148
### ResourceClaims and ResourceClaimTemplates {#resourceclaims-templates}
149149

@@ -183,7 +183,7 @@ recommended because auto-generated ResourceClaims are bound to the lifetime of
183183
the Pod that triggered the generation.
184184

185185
To learn how to claim resources using one of these methods, see
186-
[Dynamically Allocate Devices to Workloads with DRA](/docs/tasks/configure-pod-container/assign-resources/allocate-devices-dra/#claim-resources).
186+
[Allocate Devices to Workloads with DRA](/docs/tasks/configure-pod-container/assign-resources/allocate-devices-dra/).
187187

188188
### ResourceSlice {#resourceslice}
189189

@@ -387,7 +387,7 @@ To use any of these features, you must also set up DRA in your clusters by
387387
enabling the DynamicResourceAllocation feature gate and the DRA
388388
{{< glossary_tooltip text="API groups" term_id="api-group" >}}. For more
389389
information, see
390-
[Set up DRA in the cluster](/docs/tasks/configure-pod-container/assign-resources/allocate-devices-dra/#set-up-dra-cluster)
390+
[Set up DRA in the cluster](/docs/tasks/configure-pod-container/assign-resources/set-up-dra-cluster/).
391391

392392
### Admin access {#admin-access}
393393

@@ -628,7 +628,8 @@ spec:
628628

629629
## {{% heading "whatsnext" %}}
630630

631-
- [Dynamically allocate devices to workloads using DRA](/docs/tasks/configure-pod-container/assign-resources/allocate-devices-dra/)
631+
- [Set Up DRA in a Cluster](/docs/tasks/configure-pod-container/assign-resources/set-up-dra-cluster/)
632+
- [Allocate devices to workloads using DRA](/docs/tasks/configure-pod-container/assign-resources/allocate-devices-dra/)
632633
- For more information on the design, see the
633634
[Dynamic Resource Allocation with Structured Parameters](https://github.com/kubernetes/enhancements/tree/master/keps/sig-node/4381-dra-structured-parameters)
634635
KEP.
Lines changed: 6 additions & 0 deletions
@@ -0,0 +1,6 @@
1+
---
2+
title: "Assign Devices to Pods and Containers"
3+
description: Assign infrastructure resources to your Kubernetes workloads.
4+
weight: 30
5+
---
6+
Lines changed: 186 additions & 0 deletions
@@ -0,0 +1,186 @@
1+
---
2+
title: Allocate Devices to Workloads with DRA
3+
content_type: task
4+
min-kubernetes-server-version: v1.32
5+
weight: 20
6+
---
7+
{{< feature-state feature_gate_name="DynamicResourceAllocation" >}}
8+
9+
<!-- overview -->
10+
11+
This page shows you how to allocate devices to your Pods by using
12+
_dynamic resource allocation (DRA)_. These instructions are for workload
13+
operators. Before reading this page, familiarize yourself with how DRA works and
14+
with DRA terminology like
15+
{{< glossary_tooltip text="ResourceClaims" term_id="resourceclaim" >}} and
16+
{{< glossary_tooltip text="ResourceClaimTemplates" term_id="resourceclaimtemplate" >}}.
17+
For more information, see
18+
[Dynamic Resource Allocation (DRA)](/docs/concepts/scheduling-eviction/dynamic-resource-allocation/).
19+
20+
<!-- body -->
21+
22+
## About device allocation with DRA {#about-device-allocation-dra}
23+
24+
As a workload operator, you can _claim_ devices for your workloads by creating
25+
ResourceClaims or ResourceClaimTemplates. When you deploy your workload,
26+
Kubernetes and the device drivers find available devices, allocate them to your
27+
Pods, and place the Pods on nodes that can access those devices.
28+
29+
<!-- prerequisites -->
30+
31+
## {{% heading "prerequisites" %}}
32+
33+
{{< include "task-tutorial-prereqs.md" >}} {{< version-check >}}
34+
35+
* Ensure that your cluster admin has set up DRA, attached devices, and installed
36+
drivers. For more information, see
37+
[Set Up DRA in a Cluster](/docs/tasks/configure-pod-container/assign-resources/set-up-dra-cluster).
38+
39+
<!-- steps -->
40+
41+
## Identify devices to claim {#identify-devices}
42+
43+
Your cluster administrator or the device drivers create
44+
_{{< glossary_tooltip term_id="deviceclass" text="DeviceClasses" >}}_ that
45+
define categories of devices. You can claim devices by using
46+
{{< glossary_tooltip term_id="cel" >}} to filter for specific device properties.
47+
48+
Get a list of DeviceClasses in the cluster:
49+
50+
```shell
51+
kubectl get deviceclasses
52+
```
53+
The output is similar to the following:
54+
55+
```
56+
NAME AGE
57+
driver.example.com 16m
58+
```
59+
If you get a permission error, you might not have access to get DeviceClasses.
60+
Check with your cluster administrator or with the driver provider for available
61+
device properties.
62+
63+
## Claim resources {#claim-resources}
64+
65+
You can request resources from a DeviceClass by using
66+
{{< glossary_tooltip text="ResourceClaims" term_id="resourceclaim" >}}. To
67+
create a ResourceClaim, do one of the following:
68+
69+
* Manually create a ResourceClaim if you want multiple Pods to share access to
70+
the same devices, or if you want a claim to exist beyond the lifetime of a
71+
Pod.
72+
* Use a
73+
{{< glossary_tooltip text="ResourceClaimTemplate" term_id="resourceclaimtemplate" >}}
74+
to let Kubernetes generate and manage per-Pod ResourceClaims. Create a
75+
ResourceClaimTemplate if you want every Pod to have access to separate devices
76+
that have similar configurations. For example, you might want simultaneous
77+
access to devices for Pods in a Job that uses
78+
[parallel execution](/docs/concepts/workloads/controllers/job/#parallel-jobs).
79+
80+
If you directly reference a specific ResourceClaim in a Pod, that ResourceClaim
81+
must already exist in the cluster. If a referenced ResourceClaim doesn't exist,
82+
the Pod remains in a pending state until the ResourceClaim is created. You can
83+
reference an auto-generated ResourceClaim in a Pod, but this isn't recommended
84+
because auto-generated ResourceClaims are bound to the lifetime of the Pod that
85+
triggered the generation.
86+
87+
To create a workload that claims resources, select one of the following options:
88+
89+
{{< tabs name="claim-resources" >}}
90+
{{% tab name="ResourceClaimTemplate" %}}
91+
92+
Review the following example manifest:
93+
94+
{{% code_sample file="dra/resourceclaimtemplate.yaml" %}}
95+
96+
This manifest creates a ResourceClaimTemplate that requests devices in the
97+
`example-device-class` DeviceClass that match both of the following parameters:
98+
99+
* Devices that have a `driver.example.com/type` attribute with a value of
100+
`gpu`.
101+
* Devices that have `64Gi` of capacity.
102+
103+
To create the ResourceClaimTemplate, run the following command:
104+
105+
```shell
106+
kubectl apply -f https://k8s.io/examples/dra/resourceclaimtemplate.yaml
107+
```
108+
109+
{{% /tab %}}
110+
{{% tab name="ResourceClaim" %}}
111+
112+
Review the following example manifest:
113+
114+
{{% code_sample file="dra/resourceclaim.yaml" %}}
115+
116+
This manifest creates a ResourceClaim that requests devices in the
117+
`example-device-class` DeviceClass that match both of the following parameters:
118+
119+
* Devices that have a `driver.example.com/type` attribute with a value of
120+
`gpu`.
121+
* Devices that have `64Gi` of capacity.
122+
123+
To create the ResourceClaim, run the following command:
124+
125+
```shell
126+
kubectl apply -f https://k8s.io/examples/dra/resourceclaim.yaml
127+
```
128+
129+
{{% /tab %}}
130+
{{< /tabs >}}
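
The `dra/resourceclaimtemplate.yaml` sample that this page includes is not shown in this diff. As a minimal sketch, a manifest matching the description above might look like the following. It assumes the `resource.k8s.io/v1beta1` API version (the version served by Kubernetes v1.32). The object name, the request name, and the `memory` capacity name are hypothetical, so the actual sample file may differ.

```yaml
# Sketch only: names below are illustrative assumptions, not the shipped sample.
apiVersion: resource.k8s.io/v1beta1   # assumed API version for v1.32 clusters
kind: ResourceClaimTemplate
metadata:
  name: example-resource-claim-template   # hypothetical name
spec:
  # The inner spec is the ResourceClaim spec that Kubernetes stamps out per Pod.
  spec:
    devices:
      requests:
      - name: gpu-claim                    # hypothetical request name
        deviceClassName: example-device-class
        selectors:
        - cel:
            # Both conditions must hold: type attribute is "gpu" and
            # the (assumed) "memory" capacity equals 64Gi.
            expression: |-
              device.attributes["driver.example.com"].type == "gpu" &&
              device.capacity["driver.example.com"].memory == quantity("64Gi")
```

A standalone ResourceClaim would likely use `kind: ResourceClaim` with the same `devices` block placed directly under a single `spec`, because only the template wraps the claim spec in an outer `spec`.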
131+
132+
## Request devices in workloads using DRA {#request-devices-workloads}
133+
134+
To request device allocation, specify a ResourceClaim or a ResourceClaimTemplate
135+
in the `resourceClaims` field of the Pod specification. Then, request a specific
136+
claim by name in the `resources.claims` field of a container in that Pod.
137+
You can specify multiple entries in the `resourceClaims` field and use specific
138+
claims in different containers.
139+
140+
1. Review the following example Job:
141+
142+
{{% code_sample file="dra/dra-example-job.yaml" %}}
143+
144+
Each Pod in this Job has the following properties:
145+
146+
* Makes a ResourceClaimTemplate named `separate-gpu-claim` and a
147+
ResourceClaim named `shared-gpu-claim` available to containers.
148+
* Runs the following containers:
149+
* `container0` requests the devices from the `separate-gpu-claim`
150+
ResourceClaimTemplate.
151+
* `container1` and `container2` share access to the devices from the
152+
`shared-gpu-claim` ResourceClaim.
153+
154+
1. Create the Job:
155+
156+
```shell
157+
kubectl apply -f https://k8s.io/examples/dra/dra-example-job.yaml
158+
```
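
The `dra/dra-example-job.yaml` sample is also not shown in this diff. The following sketch illustrates how the Pod-level `resourceClaims` field and the container-level `resources.claims` field could fit together for the Job described above. The claim names `separate-gpu-claim` and `shared-gpu-claim` and the container names come from the text. The Job name, image, command, and the names of the referenced ResourceClaimTemplate and ResourceClaim are hypothetical.

```yaml
# Sketch only: Job name, image, command, and referenced object names are assumptions.
apiVersion: batch/v1
kind: Job
metadata:
  name: example-dra-job                    # hypothetical name
spec:
  completions: 2
  parallelism: 2
  template:
    spec:
      restartPolicy: Never
      containers:
      - name: container0
        image: ubuntu:24.04                # hypothetical image
        command: ["sleep", "9999"]
        resources:
          claims:
          - name: separate-gpu-claim       # per-Pod claim generated from the template
      - name: container1
        image: ubuntu:24.04
        command: ["sleep", "9999"]
        resources:
          claims:
          - name: shared-gpu-claim         # shared with container2
      - name: container2
        image: ubuntu:24.04
        command: ["sleep", "9999"]
        resources:
          claims:
          - name: shared-gpu-claim
      # Pod-level entries that make the claims available to containers by name.
      resourceClaims:
      - name: separate-gpu-claim
        resourceClaimTemplateName: example-resource-claim-template   # hypothetical
      - name: shared-gpu-claim
        resourceClaimName: example-resource-claim                    # hypothetical
```

Each entry in `resourceClaims` sets exactly one of `resourceClaimTemplateName` or `resourceClaimName`, and a container opts in to a claim by repeating the entry's `name` under `resources.claims`.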
159+
160+
## Clean up {#clean-up}
161+
162+
To delete the Kubernetes objects that you created in this task, follow these
163+
steps:
164+
165+
1. Delete the example Job:
166+
167+
```shell
168+
kubectl delete -f https://k8s.io/examples/dra/dra-example-job.yaml
169+
```
170+
171+
1. To delete your resource claims, run one of the following commands:
172+
173+
* Delete the ResourceClaimTemplate:
174+
175+
```shell
176+
kubectl delete -f https://k8s.io/examples/dra/resourceclaimtemplate.yaml
177+
```
178+
* Delete the ResourceClaim:
179+
180+
```shell
181+
kubectl delete -f https://k8s.io/examples/dra/resourceclaim.yaml
182+
```
183+
184+
## {{% heading "whatsnext" %}}
185+
186+
* [Learn more about DRA](/docs/concepts/scheduling-eviction/dynamic-resource-allocation)
