@@ -16,24 +16,28 @@ Create `default-priority`, `high-priority`, and `low-priority` priority classes:
 kubectl apply -f setup.k8s/mlbatch-priorities.yaml
 ```
 
-## Scheduler Plugins
+## Scheduler Configuration
+
+MLBatch configures Kubernetes scheduling to accomplish two objectives:
++ Obtaining gang (all or nothing) scheduling for multi-Pod workloads.
++ Packing Pods whose GPU request is less than the number of GPUs on a Node to
+  maximize the number of Nodes available for Pods that request all the GPUs on a Node.
+
+The currently recommended way to do this is by installing the Coscheduling out-of-tree scheduler
+plugin and configuring the default NodeResourcesFit scheduler plugin to pack in the GPU dimension.
+Alternatively, you can skip the helm install and patch commands shown below and instead install
+the experimental Sakkara scheduler plugin (described next).
 
-MLBatch utilizes Kubernetes Scheduler Plugins to ensure gang scheduling of
-multi-Pod workloads and to pack `Pods` onto `Nodes` to reduce GPU fragmentation.
-Two options are described below: Coscheduler and Sakkara. You should pick and install one of them
-as a secondary scheduler for your cluster.
-### Coscheduler
 
-Install Coscheduler v0.28.9 as a secondary scheduler and configure packing:
 ```sh
 helm install scheduler-plugins --namespace scheduler-plugins --create-namespace \
   scheduler-plugins/manifests/install/charts/as-a-second-scheduler/ \
   --set-json pluginConfig='[{"args":{"scoringStrategy":{"resources":[{"name":"nvidia.com/gpu","weight":1}],"requestedToCapacityRatio":{"shape":[{"utilization":0,"score":0},{"utilization":100,"score":10}]},"type":"RequestedToCapacityRatio"}},"name":"NodeResourcesFit"},{"args":{"permitWaitingTimeSeconds":300},"name":"Coscheduling"}]'
 ```
-Patch Coscheduler pod priorities:
+Patch scheduler-plugins pod priorities:
 ```sh
-kubectl patch deployment -n scheduler-plugins --type=json --patch-file setup.k8s/coscheduler-priority-patch.yaml scheduler-plugins-controller
-kubectl patch deployment -n scheduler-plugins --type=json --patch-file setup.k8s/coscheduler-priority-patch.yaml scheduler-plugins-scheduler
+kubectl patch deployment -n scheduler-plugins --type=json --patch-file setup.k8s/scheduler-priority-patch.yaml scheduler-plugins-controller
+kubectl patch deployment -n scheduler-plugins --type=json --patch-file setup.k8s/scheduler-priority-patch.yaml scheduler-plugins-scheduler
 ```
 
 ### Sakkara
@@ -56,9 +60,9 @@ kubectl create namespace mlbatch-system
 
 Install the Kubeflow Training Operator
 
-If you are using Coscheduler do:
+If you are using Coscheduling do:
 ```sh
-kubectl apply --server-side -k setup.k8s/training-operator/coscheduler
+kubectl apply --server-side -k setup.k8s/training-operator/coscheduling
 ```
 If you are using Sakkara do:
 ```sh
@@ -76,9 +80,9 @@ kubectl apply --server-side -k setup.k8s/kueue
 ```
 
 Install the AppWrapper Operator
-If you are using Coscheduler do:
+If you are using Coscheduling do:
 ```sh
-kubectl apply --server-side -k setup.k8s/appwrapper/coscheduler
+kubectl apply --server-side -k setup.k8s/appwrapper/coscheduling
 ```
 If you are using Sakkara do:
 ```sh
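Note on the `pluginConfig` value in the first hunk: the single-line JSON is hard to review. The sketch below, which is not part of the commit, builds the same value from a readable heredoc and then compacts it. The variable names `PLUGIN_CONFIG` and `PLUGIN_CONFIG_COMPACT` are illustrative assumptions. As far as the config reads, the `shape` maps 0% GPU utilization to score 0 and 100% to score 10, so Nodes whose GPUs are more fully requested score higher, which is what produces packing. The `Coscheduling` entry sets `permitWaitingTimeSeconds` to 300.

```sh
# Hypothetical variable names; the JSON content is copied from the
# --set-json pluginConfig argument in the diff, keys in the same order.
PLUGIN_CONFIG=$(cat <<'EOF'
[
  {
    "args": {
      "scoringStrategy": {
        "resources": [
          {"name": "nvidia.com/gpu", "weight": 1}
        ],
        "requestedToCapacityRatio": {
          "shape": [
            {"utilization": 0, "score": 0},
            {"utilization": 100, "score": 10}
          ]
        },
        "type": "RequestedToCapacityRatio"
      }
    },
    "name": "NodeResourcesFit"
  },
  {
    "args": {"permitWaitingTimeSeconds": 300},
    "name": "Coscheduling"
  }
]
EOF
)

# Remove spaces and newlines to recover the one-line form.
# This is safe here only because no string value contains a space.
PLUGIN_CONFIG_COMPACT=$(printf '%s' "$PLUGIN_CONFIG" | tr -d ' \n')
printf '%s\n' "$PLUGIN_CONFIG_COMPACT"
```

The compacted value could then be supplied as `--set-json pluginConfig="$PLUGIN_CONFIG_COMPACT"` in the `helm install` command shown in the diff.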