Commit 6516fff

author
Claudia
committed
add autopilot
minor typo fixes
Signed-off-by: Claudia <[email protected]>
last typo
Signed-off-by: Claudia <[email protected]>
1 parent dab1edb commit 6516fff

5 files changed

+140
-0
lines changed

setup.RHOAI-v2.13/CLUSTER-SETUP.md

Lines changed: 25 additions & 0 deletions
Original file line number | Diff line number | Diff line change
@@ -88,6 +88,31 @@ kueue-controller-manager's log:
8888

8989
```
9090

91+
## Autopilot
92+
93+
The Helm chart values and customization instructions are documented in the [Autopilot Helm chart README](helm-charts/autopilot/README.md). With the default values, Autopilot runs on GPU nodes.
94+
95+
- Add the Autopilot Helm repository
96+
97+
```bash
98+
helm repo add autopilot https://ibm.github.io/autopilot/
99+
```
100+
101+
- Install the chart. The command is idempotent. The config file customizes the Helm values and is optional; omit `-f your-config.yml` to use the defaults.
102+
103+
```bash
104+
helm upgrade autopilot autopilot/autopilot --install --namespace=autopilot --create-namespace -f your-config.yml
105+
```
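The two steps above can be bracketed with a couple of standard Helm and kubectl calls. This is a minimal sketch, not part of the original instructions: the function names are hypothetical, and it assumes the repo alias `autopilot`, the release name `autopilot`, and the namespace `autopilot` used in the commands above.

```bash
# Hypothetical helpers; assumes the repo alias, release name, and namespace
# from the install command above.

# Write the chart's default values to a file that can then be edited
# and passed to `helm upgrade ... -f`.
autopilot_dump_values() {
  helm show values autopilot/autopilot > "${1:-your-config.yml}"
}

# Show the release status, then list the pods in the autopilot namespace.
autopilot_check_install() {
  helm status autopilot --namespace autopilot &&
    kubectl get pods --namespace autopilot -o wide
}
```

Running `autopilot_dump_values` before the install gives a starting point for `your-config.yml`; running `autopilot_check_install` afterwards shows whether the release deployed and where its pods landed.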
106+
107+
### Enabling Prometheus metrics
108+
109+
Labeling the `ServiceMonitor` is not required.
110+
Instead, after completing the installation, label the namespace manually so that Prometheus can scrape the metrics:
111+
112+
```bash
113+
oc label ns autopilot openshift.io/cluster-monitoring=true
114+
```
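To confirm that the label was applied, the namespace's labels can be printed. This is a small sketch under the assumption that the namespace is named `autopilot` as above; the function name is hypothetical.

```bash
# Hypothetical helper; assumes the namespace is called "autopilot".
# Prints the namespace with all of its labels so the
# openshift.io/cluster-monitoring=true label can be checked by eye.
autopilot_show_ns_labels() {
  oc get namespace autopilot --show-labels
}
```

The `LABELS` column of the output should include `openshift.io/cluster-monitoring=true`.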
115+
91116
## Kueue Configuration
92117

93118
Create Kueue's default flavor:

setup.RHOAI-v2.16/CLUSTER-SETUP.md

Lines changed: 25 additions & 0 deletions
@@ -76,6 +76,31 @@ AI configuration as follows:
7676

7777

7878

79+
## Autopilot
80+
81+
The Helm chart values and customization instructions are documented in the [Autopilot Helm chart README](helm-charts/autopilot/README.md). With the default values, Autopilot runs on GPU nodes.
82+
83+
- Add the Autopilot Helm repository
84+
85+
```bash
86+
helm repo add autopilot https://ibm.github.io/autopilot/
87+
```
88+
89+
- Install the chart. The command is idempotent. The config file customizes the Helm values and is optional; omit `-f your-config.yml` to use the defaults.
90+
91+
```bash
92+
helm upgrade autopilot autopilot/autopilot --install --namespace=autopilot --create-namespace -f your-config.yml
93+
```
94+
95+
### Enabling Prometheus metrics
96+
97+
Labeling the `ServiceMonitor` is not required.
98+
Instead, after completing the installation, label the namespace manually so that Prometheus can scrape the metrics:
99+
100+
```bash
101+
oc label ns autopilot openshift.io/cluster-monitoring=true
102+
```
103+
79104
## Kueue Configuration
80105

81106
Create Kueue's default flavor:

setup.RHOAI-v2.17/CLUSTER-SETUP.md

Lines changed: 25 additions & 0 deletions
@@ -76,6 +76,31 @@ AI configuration as follows:
7676

7777

7878

79+
## Autopilot
80+
81+
The Helm chart values and customization instructions are documented in the [Autopilot Helm chart README](helm-charts/autopilot/README.md). With the default values, Autopilot runs on GPU nodes.
82+
83+
- Add the Autopilot Helm repository
84+
85+
```bash
86+
helm repo add autopilot https://ibm.github.io/autopilot/
87+
```
88+
89+
- Install the chart. The command is idempotent. The config file customizes the Helm values and is optional; omit `-f your-config.yml` to use the defaults.
90+
91+
```bash
92+
helm upgrade autopilot autopilot/autopilot --install --namespace=autopilot --create-namespace -f your-config.yml
93+
```
94+
95+
### Enabling Prometheus metrics
96+
97+
Labeling the `ServiceMonitor` is not required.
98+
Instead, after completing the installation, label the namespace manually so that Prometheus can scrape the metrics:
99+
100+
```bash
101+
oc label ns autopilot openshift.io/cluster-monitoring=true
102+
```
103+
79104
## Kueue Configuration
80105

81106
Create Kueue's default flavor:

setup.k8s/CLUSTER-SETUP.md

Lines changed: 28 additions & 0 deletions
@@ -7,6 +7,7 @@ The cluster setup installs and configures the following components:
77
+ Kueue
88
+ AppWrappers
99
+ Cluster roles and priority classes
10+
+ Autopilot
1011

1112
## Priorities
1213

@@ -73,6 +74,33 @@ operators as follows:
7374
- `queueName` is set to `default-queue`,
7475
- pod priorities, resource requests and limits have been adjusted.
7576

77+
## Autopilot
78+
79+
The Helm chart values and customization instructions are documented in the [Autopilot Helm chart README](helm-charts/autopilot/README.md). With the default values, Autopilot runs on GPU nodes.
80+
81+
- Add the Autopilot Helm repository
82+
83+
```bash
84+
helm repo add autopilot https://ibm.github.io/autopilot/
85+
```
86+
87+
- Install the chart. The command is idempotent. The config file customizes the Helm values and is optional; omit `-f your-config.yml` to use the defaults.
88+
89+
```bash
90+
helm upgrade autopilot autopilot/autopilot --install --namespace=autopilot --create-namespace -f your-config.yml
91+
```
92+
93+
### Enabling Prometheus metrics
94+
95+
The `ServiceMonitor` object enables Prometheus to scrape the metrics produced by Autopilot.
96+
For Prometheus to find the right objects, the `ServiceMonitor` must carry a `release` label set to the Prometheus release name. This is usually `prometheus`, which is the default that the Autopilot release adds.
97+
If that is not the case in your cluster, you can find the correct release label by checking the `ServiceMonitor` of Prometheus itself, or the name of the Prometheus Helm chart.
98+
Then label Autopilot's `ServiceMonitor` with the following command:
99+
100+
```bash
101+
kubectl label servicemonitors.monitoring.coreos.com -n autopilot autopilot-metrics-monitor release=<prometheus-release-name> --overwrite
102+
```
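Finding the right value for the `release` label can be sketched with a few read-only queries. The function names below are hypothetical, and the sketch assumes Prometheus was installed through the Prometheus Operator, so that the `monitoring.coreos.com` resources exist in the cluster.

```bash
# Hypothetical helpers; assumes the Prometheus Operator CRDs are installed.

# List every ServiceMonitor with its "release" label as an extra column.
prometheus_release_labels() {
  kubectl get servicemonitors.monitoring.coreos.com --all-namespaces -L release
}

# Print, per Prometheus instance, the selector it uses to pick ServiceMonitors.
prometheus_servicemonitor_selectors() {
  kubectl get prometheuses.monitoring.coreos.com --all-namespaces \
    -o jsonpath='{range .items[*]}{.metadata.name}{"\t"}{.spec.serviceMonitorSelector}{"\n"}{end}'
}

# List Helm releases in all namespaces to spot the Prometheus release name.
prometheus_helm_releases() {
  helm list --all-namespaces
}
```

If the selector printed by `prometheus_servicemonitor_selectors` matches on a `release` label, its value is the one to use for Autopilot's `ServiceMonitor`.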
103+
76104
## Kueue Configuration
77105

78106
Create Kueue's default flavor:

setup.tmpl/CLUSTER-SETUP.md.tmpl

Lines changed: 37 additions & 0 deletions
@@ -12,6 +12,7 @@ The cluster setup installs and configures the following components:
1212
+ Kueue
1313
+ AppWrappers
1414
+ Cluster roles and priority classes
15+
+ Autopilot
1516

1617
{{- end }}
1718

@@ -154,6 +155,42 @@ operators as follows:
154155

155156
{{- end }}
156157

158+
## Autopilot
159+
160+
The Helm chart values and customization instructions are documented in the [Autopilot Helm chart README](helm-charts/autopilot/README.md). With the default values, Autopilot runs on GPU nodes.
161+
162+
- Add the Autopilot Helm repository
163+
164+
```bash
165+
helm repo add autopilot https://ibm.github.io/autopilot/
166+
```
167+
168+
- Install the chart. The command is idempotent. The config file customizes the Helm values and is optional; omit `-f your-config.yml` to use the defaults.
169+
170+
```bash
171+
helm upgrade autopilot autopilot/autopilot --install --namespace=autopilot --create-namespace -f your-config.yml
172+
```
173+
174+
### Enabling Prometheus metrics
175+
176+
{{ if .OPENSHIFT -}}
177+
Labeling the `ServiceMonitor` is not required.
178+
Instead, after completing the installation, label the namespace manually so that Prometheus can scrape the metrics:
179+
180+
```bash
181+
{{ .KUBECTL }} label ns autopilot openshift.io/cluster-monitoring=true
182+
```
183+
{{- else -}}
184+
The `ServiceMonitor` object enables Prometheus to scrape the metrics produced by Autopilot.
185+
For Prometheus to find the right objects, the `ServiceMonitor` must carry a `release` label set to the Prometheus release name. This is usually `prometheus`, which is the default that the Autopilot release adds.
186+
If that is not the case in your cluster, you can find the correct release label by checking the `ServiceMonitor` of Prometheus itself, or the name of the Prometheus Helm chart.
187+
Then label Autopilot's `ServiceMonitor` with the following command:
188+
189+
```bash
190+
{{ .KUBECTL }} label servicemonitors.monitoring.coreos.com -n autopilot autopilot-metrics-monitor release=<prometheus-release-name> --overwrite
191+
```
192+
{{- end }}
193+
157194
## Kueue Configuration
158195

159196
Create Kueue's default flavor:
