Define a flavor for your node type and reference the topology.

Save the following as `resourceflavor.yaml`:

```yaml
apiVersion: kueue.x-k8s.io/v1beta1
kind: ResourceFlavor
metadata:
  name: "tas-flavor"
spec:
  nodeLabels:
    node.kubernetes.io/instance-type: "BM.GPU.H100.8"
  topologyName: "oci-topology"
```

Apply it:

```bash
kubectl apply -f resourceflavor.yaml
```
---

#### 3. Create a ClusterQueue

Define a shared queue of resources available to all namespaces.

Save the following as `clusterqueue.yaml`:

```yaml
apiVersion: kueue.x-k8s.io/v1beta1
kind: ClusterQueue
metadata:
  name: "tas-cluster-queue"
spec:
  namespaceSelector: {}
  resourceGroups:
  - coveredResources: ["cpu", "memory"]
    flavors:
    - name: "tas-flavor"
      resources:
      - name: "cpu"
        nominalQuota: 100
      - name: "memory"
        nominalQuota: 100Gi
```

Apply it:

```bash
kubectl apply -f clusterqueue.yaml
```
---

#### 4. Create a LocalQueue

Create a namespace-specific queue linked to the cluster queue.

Save the following as `localqueue.yaml`:

```yaml
apiVersion: kueue.x-k8s.io/v1beta1
kind: LocalQueue
metadata:
  name: tas-user-queue
spec:
  clusterQueue: tas-cluster-queue
```

Apply it:

```bash
kubectl apply -f localqueue.yaml
```
---

#### 5. Run an Example Job

The annotation `kueue.x-k8s.io/podset-preferred-topology` tells Kueue to **prefer placing all pods within the same topology domain**. If that is not possible, Kueue will progressively move up the hierarchy until it finds a level where the job fits. If no level can contain all pods, they are distributed across multiple topology domains.
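A minimal Job that uses this annotation might look like the following sketch. The job name, image, and resource requests are placeholders, and the annotation value must be one of the topology levels defined in your `oci-topology` Topology; the `kueue.x-k8s.io/queue-name` label submits the job through the `tas-user-queue` created above.

```yaml
apiVersion: batch/v1
kind: Job
metadata:
  name: tas-example-job
  labels:
    kueue.x-k8s.io/queue-name: tas-user-queue
spec:
  parallelism: 2
  completions: 2
  suspend: true  # created suspended; Kueue admits it when quota and topology fit
  template:
    metadata:
      annotations:
        # Placeholder: replace with one of the levels of your "oci-topology" Topology
        kueue.x-k8s.io/podset-preferred-topology: "<topology-level-label>"
    spec:
      containers:
      - name: worker
        image: busybox:1.36
        command: ["sh", "-c", "echo hello; sleep 60"]
        resources:
          requests:
            cpu: "1"
            memory: "200Mi"
      restartPolicy: Never
```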
You can use the labels explained above to create affinity rules for your workloads. Visit [this link](https://kubernetes.io/docs/concepts/scheduling-eviction/assign-pod-node/) if you want to learn more about using affinity rules on Kubernetes.
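As a rough sketch, a required node-affinity rule pinning pods to one topology domain could look like this; the label key and value are placeholders for the topology labels described above:

```yaml
# Pod spec excerpt (placeholders: substitute a real topology label and value)
affinity:
  nodeAffinity:
    requiredDuringSchedulingIgnoredDuringExecution:
      nodeSelectorTerms:
      - matchExpressions:
        - key: <topology-label-key>   # e.g. one of the node labels described above
          operator: In
          values:
          - <topology-label-value>
```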
### Using Kueue

You will need to [enable the feature gate](https://kueue.sigs.k8s.io/docs/installation/#change-the-feature-gates-configuration) for [Topology Aware Scheduling (TAS)](https://kueue.sigs.k8s.io/docs/concepts/topology_aware_scheduling) in Kueue. Topology Aware Scheduling has been in alpha since Kueue v0.9.

The examples above use `node.kubernetes.io/instance-type: "BM.GPU.H100.8"` to select H100 nodes, but you can use any label that exists on all of the nodes you are targeting with the ResourceFlavor.
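As a sketch of enabling the feature gate (the exact structure depends on how you installed Kueue; see the linked installation docs), you can add the flag to the kueue-controller-manager arguments:

```yaml
# Excerpt from the kueue-controller-manager Deployment
spec:
  template:
    spec:
      containers:
      - name: manager
        args:
        - --feature-gates=TopologyAwareScheduling=true
```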
### Using Node Ordering script as an Init Container

If your workload can use an ordered hostfile or rankfile (e.g. with `mpirun`), you can use the [Node Ordering script](../docker/node-ordering/node_ordering.py) in an Init Container to generate the ordered hostfile/rankfile, and then use the generated file in your job.
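A rough sketch of the pattern follows; the image names, script path, and mount paths are all assumptions to be adapted to your setup:

```yaml
# Pod template excerpt: the init container writes an ordered hostfile to a
# shared emptyDir volume, which the main container then passes to mpirun.
initContainers:
- name: node-ordering
  image: <your-node-ordering-image>   # placeholder: image containing node_ordering.py
  command: ["python3", "/scripts/node_ordering.py"]
  volumeMounts:
  - name: shared
    mountPath: /shared
containers:
- name: launcher
  image: <your-mpi-image>             # placeholder
  command: ["mpirun", "--hostfile", "/shared/hostfile", "<your-binary>"]
  volumeMounts:
  - name: shared
    mountPath: /shared
volumes:
- name: shared
  emptyDir: {}
```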