Commit 3b3a4d6

First commit
1 parent e316d3a commit 3b3a4d6

File tree

9 files changed, +350 -0 lines changed
Lines changed: 209 additions & 0 deletions
@@ -0,0 +1,209 @@
# Deploying Ollama and Open WebUI on OKE

In this tutorial, we explain how to use the well-known Mistral models in a browser through Open WebUI. The LLM is served by Ollama, and the overall infrastructure relies on an Oracle Kubernetes Engine (OKE) cluster with an NVIDIA A10 GPU node pool.
## Prerequisites

To run this tutorial, you will need:
* An OCI tenancy with service limits set for A10-based instances
## Deploying the infrastructure

### Creating the OKE cluster with a CPU node pool

The first step consists of creating a Kubernetes cluster. Initially, the cluster is configured with a CPU node pool only; a GPU node pool will be added afterwards.

![OKE Quick Create illustration](assets/images/oke-quick-create.png "OKE Quick Create")

The easiest way is to use the Quick Create cluster assistant with the following options:
* Public Endpoint,
* Self-Managed nodes,
* Private workers,
* VM.Standard.E5.Flex compute shapes,
* Oracle Linux OKE-specific image.
### Accessing the cluster

Click Access Cluster, choose Cloud Shell Access or Local Access, and follow the instructions. If you select Local Access, you must first install and configure the OCI CLI package.
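With Local Access, the kubeconfig can be generated with the OCI CLI; a minimal sketch, where the cluster OCID and region are placeholders to fill in:
```
oci ce cluster create-kubeconfig --cluster-id <cluster-ocid> --file $HOME/.kube/config --region <region> --token-version 2.0.0
```
Check that the nodes are there: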
```
kubectl get nodes
```
### Adding a GPU node pool

Once the cluster is available, we can add a GPU node pool. Go to `Node pools` in the left panel, click `Add node pool`, and use the following options:
* Public endpoint,
* Self-Managed nodes,
* VM.GPU.A10.1 nodes,
* Oracle Linux GPU OKE image,
* A custom boot volume size of 250 GB, together with an initialization script (under Advanced options) so that the boot volume change is applied:
```
#!/bin/bash
curl --fail -H "Authorization: Bearer Oracle" -L0 http://169.254.169.254/opc/v2/instance/metadata/oke_init_script | base64 --decode >/var/run/oke-init.sh
bash /usr/libexec/oci-growfs -y
bash /var/run/oke-init.sh
```
Click Create to add the GPU instances and wait for the node pool to be `Active` and the nodes to be in the `Ready` state. Check the available nodes again:
```
kubectl get nodes
```
Check device visibility on the GPU node whose name is `xxx.xxx.xxx.xxx`:
```
kubectl describe nodes xxx.xxx.xxx.xxx | grep gpu
```
You should get output similar to the following:
```
nvidia.com/gpu=true
Taints: nvidia.com/gpu=present:NoSchedule
nvidia.com/gpu: 1
nvidia.com/gpu: 1
kube-system nvidia-gpu-device-plugin-8ktcj 50m (0%) 50m (0%) 200Mi (0%) 200Mi (0%) 4m48s
nvidia.com/gpu 0 0
```
(The following manifests are not tied to any GPU type.)
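Note the `nvidia.com/gpu=present:NoSchedule` taint: only pods carrying a matching toleration can be scheduled on the GPU node (the Ollama deployment manifest below includes one). The taints can be listed across all nodes, for example:
```
kubectl get nodes -o custom-columns=NAME:.metadata.name,TAINTS:.spec.taints
```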
### Installing the NVIDIA GPU Operator

You can access the cluster either using Cloud Shell or a standalone instance. The NVIDIA GPU Operator improves the visibility of GPU features in Kubernetes. The easiest way to install it is with `helm`:
```
helm repo add nvidia https://helm.ngc.nvidia.com/nvidia
helm repo update
helm install gpu-operator nvidia/gpu-operator --namespace gpu-operator --create-namespace
```
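Before proceeding, you can check that the operator pods in the newly created namespace are all up:
```
kubectl get pods -n gpu-operator
```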
Check the device visibility on the GPU node again:
```
kubectl describe nodes xxx.xxx.xxx.xxx | grep gpu
```
You should get output similar to the following:
```
nvidia.com/gpu=true
nvidia.com/gpu-driver-upgrade-state=upgrade-done
nvidia.com/gpu.compute.major=8
nvidia.com/gpu.compute.minor=6
nvidia.com/gpu.count=1
nvidia.com/gpu.deploy.container-toolkit=true
nvidia.com/gpu.deploy.dcgm=true
nvidia.com/gpu.deploy.dcgm-exporter=true
nvidia.com/gpu.deploy.device-plugin=true
nvidia.com/gpu.deploy.driver=pre-installed
nvidia.com/gpu.deploy.gpu-feature-discovery=true
nvidia.com/gpu.deploy.node-status-exporter=true
nvidia.com/gpu.deploy.operator-validator=true
nvidia.com/gpu.family=ampere
nvidia.com/gpu.machine=Standard-PC-i440FX-PIIX-1996
nvidia.com/gpu.memory=23028
nvidia.com/gpu.mode=compute
nvidia.com/gpu.present=true
nvidia.com/gpu.product=NVIDIA-A10
nvidia.com/gpu.replicas=1
nvidia.com/gpu.sharing-strategy=none
nvidia.com/vgpu.present=false
nvidia.com/gpu-driver-upgrade-enabled: true
Taints: nvidia.com/gpu=present:NoSchedule
nvidia.com/gpu: 1
nvidia.com/gpu: 1
gpu-operator gpu-feature-discovery-9jmph 0 (0%) 0 (0%) 0 (0%) 0 (0%) 3m1s
gpu-operator gpu-operator-node-feature-discovery-worker-t6b75 5m (0%) 0 (0%) 64Mi (0%) 512Mi (0%) 3m16s
gpu-operator nvidia-container-toolkit-daemonset-t5tpc 0 (0%) 0 (0%) 0 (0%) 0 (0%) 3m3s
gpu-operator nvidia-dcgm-exporter-2jvhz 0 (0%) 0 (0%) 0 (0%) 0 (0%) 3m2s
gpu-operator nvidia-device-plugin-daemonset-zbk2b 0 (0%) 0 (0%) 0 (0%) 0 (0%) 3m2s
gpu-operator nvidia-operator-validator-wpkxt 0 (0%) 0 (0%) 0 (0%) 0 (0%) 3m3s
kube-system nvidia-gpu-device-plugin-8ktcj 50m (0%) 50m (0%) 200Mi (0%) 200Mi (0%) 12m
nvidia.com/gpu 0 0
Normal GPUDriverUpgrade 2m52s nvidia-gpu-operator Successfully updated node state label to upgrade-done
```
## Deploying Ollama

### Creating Ollama deployment

To deploy Ollama, simply apply the `ollama-deployment.yaml` manifest:
```
kubectl apply -f ollama-deployment.yaml
```
Check that the deployment is ready:
```
kubectl get all
```
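To confirm that the rollout completed and that Ollama detected the GPU, one option is the sketch below; it assumes Ollama's usual startup logging, which reports the detected compute devices:
```
kubectl rollout status deployment/ollama-deployment
kubectl logs deployment/ollama-deployment | grep -i gpu
```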
### Pulling the model from the pod

Enter the container:
```
kubectl exec -ti ollama-deployment-pod -- /bin/bash
```
where `ollama-deployment-pod` is the name of the pod displayed by the `kubectl get pods` command.
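The pod name can also be resolved automatically through the `app: ollama` label set by the deployment manifest, for example:
```
kubectl exec -ti "$(kubectl get pods -l app=ollama -o name)" -- /bin/bash
```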
Check the Ollama installation and pull the desired model(s), here Mistral 7B version 0.3 from Mistral AI, referred to simply as `mistral`:
```
ollama --version   # optional
ollama pull mistral
```
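The pull can also be run without entering an interactive shell, for example:
```
kubectl exec ollama-deployment-pod -- ollama pull mistral
```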
The full list of supported models can be found [here](https://ollama.com/search).

Optionally, the model can be tested from the container:
```
ollama run mistral
>>> Tell me about Mistral AI.
Mistral AI is a cutting-edge company based in Paris, France, developing large language models. Founded by CTO Edouard Dumoulin and CEO Thibault Favodi in 2021, Mistral AI aims to create advanced artificial intelligence technologies that can understand, learn, and generate human-like text with a focus on French and European languages.

Mistral AI is backed by prominent European investors, including Daphni, Founders Future, and Iris Capital, among others, and has received significant financial support from the French government to further its research and development in large language models. The company's ultimate goal is to contribute to France's technological sovereignty and help shape the future of artificial
intelligence on the European continent.

One of Mistral AI's most notable projects is "La Mesure," a large-scale French language model that has achieved impressive results in various natural language processing tasks, such as text generation and understanding. The company is dedicated to pushing the boundaries of what AI can do and applying its technology to real-world applications like education, entertainment, and more.

>>> /bye
```
(Note that, as this sample answer illustrates, model output is not necessarily factually accurate.)

Exit the container by simply typing `exit`.
### Creating Ollama service

The Ollama service can be created using the `ollama-service.yaml` manifest:
```
kubectl apply -f ollama-service.yaml
```
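Since the service is of type `LoadBalancer`, with port 80 mapped to Ollama's default port 11434, the Ollama API becomes reachable once an external IP is assigned. A quick sanity check, using Ollama's `/api/tags` endpoint, which lists the local models:
```
OLLAMA_IP=$(kubectl get service ollama-service -o jsonpath='{.status.loadBalancer.ingress[0].ip}')
curl http://$OLLAMA_IP/api/tags
```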
## Deploying Open WebUI

### Creating Open WebUI deployment

Open WebUI can be deployed using the `openwebui-deployment.yaml` manifest:
```
kubectl apply -f openwebui-deployment.yaml
```
### Creating Open WebUI service

The Open WebUI service can be created using the `openwebui-service.yaml` manifest:
```
kubectl apply -f openwebui-service.yaml
```
## Testing the platform

Check that everything is running:
```
kubectl get all
```
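The external IP address of the Open WebUI load balancer appears in the `EXTERNAL-IP` column of the service listing, or can be fetched directly:
```
kubectl get service openwebui-service -o jsonpath='{.status.loadBalancer.ingress[0].ip}'
```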
Go to `http://xxx.xxx.xxx.xxx:81`, where `xxx.xxx.xxx.xxx` is the external IP address of the Open WebUI load balancer, click `Get started`, and create a (local) admin account.

If no model can be found, go to `Profile > Settings > Admin Settings > Connections > Manage Ollama API Connections`. Verify that the Ollama address matches the external IP address of the Ollama service load balancer, then check the connection by clicking the `Configure` icon and `Verify Connection`.
You can now start chatting with the model.

![Open WebUI workspace illustration](assets/images/open-webui-workspace.png "Open WebUI workspace")
## External links

* [Mistral AI official website](https://mistral.ai/)
* [Ollama official website](https://ollama.com/)
* [Open WebUI official website](https://openwebui.com/)

## License

Copyright (c) 2025 Oracle and/or its affiliates.

Licensed under the Universal Permissive License (UPL), Version 1.0.

See [LICENSE](https://github.com/oracle-devrel/technology-engineering/blob/main/LICENSE) for more details.
Lines changed: 35 additions & 0 deletions
@@ -0,0 +1,35 @@
Copyright (c) 2024 Oracle and/or its affiliates.

The Universal Permissive License (UPL), Version 1.0

Subject to the condition set forth below, permission is hereby granted to any
person obtaining a copy of this software, associated documentation and/or data
(collectively the "Software"), free of charge and under any and all copyright
rights in the Software, and any and all patent rights owned or freely
licensable by each licensor hereunder covering either (i) the unmodified
Software as contributed to or provided by such licensor, or (ii) the Larger
Works (as defined below), to deal in both

(a) the Software, and
(b) any piece of software and/or hardware listed in the lrgrwrks.txt file if
one is included with the Software (each a "Larger Work" to which the Software
is contributed by such licensors),

without restriction, including without limitation the rights to copy, create
derivative works of, display, perform, and distribute the Software and make,
use, sell, offer for sale, import, export, have made, and have sold the
Software and the Larger Work(s), and to sublicense the foregoing rights on
either these or other terms.

This license is subject to the following condition:
The above copyright notice and either this complete permission notice or at
a minimum a reference to the UPL must be included in all copies or
substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
SOFTWARE.
(Two binary image files, 255 KB and 457 KB: assets/images/oke-quick-create.png and assets/images/open-webui-workspace.png)
Lines changed: 35 additions & 0 deletions
@@ -0,0 +1,35 @@
(Second copy of the same UPL 1.0 license text as above.)
Lines changed: 25 additions & 0 deletions
@@ -0,0 +1,25 @@
`ollama-deployment.yaml`:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: ollama-deployment
spec:
  replicas: 1
  selector:
    matchLabels:
      app: ollama
  template:
    metadata:
      labels:
        app: ollama
    spec:
      nodeSelector:
        nvidia.com/gpu.present: "true"
      containers:
      - name: ollama
        image: ollama/ollama:latest
        ports:
        - containerPort: 11434
      tolerations:
      - key: "nvidia.com/gpu"
        operator: "Exists"
        effect: "NoSchedule"
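This manifest lands on the GPU node through the `nodeSelector` and toleration but does not declare an explicit GPU resource limit. If you want the device plugin to account for the GPU exclusively, an `nvidia.com/gpu: 1` limit can be added to the container; a hedged sketch using `kubectl`, assuming `kubectl set resources` accepts the extended resource name:
```
kubectl set resources deployment/ollama-deployment --limits=nvidia.com/gpu=1
```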
Lines changed: 12 additions & 0 deletions
@@ -0,0 +1,12 @@
`ollama-service.yaml`:

apiVersion: v1
kind: Service
metadata:
  name: ollama-service
spec:
  type: LoadBalancer
  selector:
    app: ollama
  ports:
  - protocol: TCP
    port: 80
    targetPort: 11434
Lines changed: 22 additions & 0 deletions
@@ -0,0 +1,22 @@
`openwebui-deployment.yaml`:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: openwebui-deployment
spec:
  replicas: 1
  selector:
    matchLabels:
      app: openwebui
  template:
    metadata:
      labels:
        app: openwebui
    spec:
      containers:
      - name: openwebui
        image: ghcr.io/open-webui/open-webui:main
        ports:
        - containerPort: 8080
        env:
        - name: OLLAMA_BASE_URL
          value: "http://129.159.241.81:80"
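The `OLLAMA_BASE_URL` value above is a hardcoded external IP from the author's environment and must be replaced with your own Ollama service address. Since both workloads run in the same cluster, the cluster-internal DNS name `http://ollama-service.default.svc.cluster.local:80` (assuming the `default` namespace) would also work, or the variable can be patched after deployment:
```
kubectl set env deployment/openwebui-deployment OLLAMA_BASE_URL=http://<your-ollama-ip>:80
```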
Lines changed: 12 additions & 0 deletions
@@ -0,0 +1,12 @@
`openwebui-service.yaml`:

apiVersion: v1
kind: Service
metadata:
  name: openwebui-service
spec:
  type: LoadBalancer
  selector:
    app: openwebui
  ports:
  - protocol: TCP
    port: 81
    targetPort: 8080
