
Commit e3924cf

Updated main readme and pictures
1 parent 3b3a4d6 commit e3924cf

4 files changed: +23 −17 lines changed

cloud-infrastructure/ai-infra-gpu/ai-infrastructure/ollama-openwebui-mistral/README.md

Lines changed: 22 additions & 16 deletions
@@ -1,6 +1,6 @@
 # Deploying Ollama and Open WebUI on OKE
 
-In this tutorial, we will explain how to use famous Mistral models in a browser using Open WebUI. The LLM will be served using Ollama and the overal infrastructure will rely on an Oracle Kubernetes Engine with a NVIDIA A10 GPU node pool.
+In this tutorial, we will explain how to use a Mistral AI large language model (LLM) in a browser through the Open WebUI graphical interface. The LLM will be served using the Ollama framework, and the overall infrastructure will rely on an Oracle Kubernetes Engine (OKE) cluster with an NVIDIA A10 GPU node pool.
 
 ## Prerequisites
 
@@ -24,7 +24,7 @@ The easiest way is to use the Quick Create cluster assistant with the following
 
 ### Accessing the cluster
 
-Click Access Cluster, choose Cloud Shell Access or Local Access and follow the instructions. If you select Local Access, you must first install and configure the OCI CLI package. Check that the nodes are there:
+Click Access Cluster, choose Cloud Shell Access or Local Access and follow the instructions. If you select Local Access, you must first install and configure the [OCI CLI package](https://docs.oracle.com/en-us/iaas/Content/API/Concepts/cliconcepts.htm). We can now check that the nodes are there:
 ```
 kubectl get nodes
 ```
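Each worker node should show up with a `Ready` status; the output looks roughly like this (node names, ages, and versions are purely illustrative):
```
NAME          STATUS   ROLES   AGE   VERSION
10.0.10.166   Ready    node    12m   v1.29.1
10.0.10.204   Ready    node    12m   v1.29.1
```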
@@ -60,12 +60,10 @@ Taints: nvidia.com/gpu=present:NoSchedule
   kube-system   nvidia-gpu-device-plugin-8ktcj   50m (0%)   50m (0%)   200Mi (0%)   200Mi (0%)   4m48s
   nvidia.com/gpu   0   0
 ```
-(The following manifests are not tied to any GPU type.)
-
 
 ### Installing the NVIDIA GPU Operator
 
-You can access the cluster either using Cloud Shell or using a standalone instance. The NVIDIA GPU Operator enhances the GPU features visibility in Kubernetes. The easiest way to install it is to use `helm`.
+You can access the cluster either using Cloud Shell or using a standalone instance. The NVIDIA GPU Operator improves the visibility of GPU features in Kubernetes. The easiest way to install it is to use Helm ([Installing Helm](https://helm.sh/docs/intro/install/)).
 ```
 helm repo add nvidia https://helm.ngc.nvidia.com/nvidia
 helm repo update
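Once the repository has been added and updated, the operator itself can be installed. A typical invocation (the `gpu-operator` namespace and the generated release name are one common choice, not mandated by this tutorial) is:
```
helm install --wait --generate-name \
  -n gpu-operator --create-namespace \
  nvidia/gpu-operator
```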
@@ -118,7 +116,7 @@ Taints: nvidia.com/gpu=present:NoSchedule
 
 ### Creating Ollama deployment
 
-To deploy Ollama, simply use the `ollama-deployment.yml` manifest.
+[Ollama](https://ollama.com/) is an open source framework for deploying and running language models on a local machine such as a cloud instance. To deploy Ollama, simply use the `ollama-deployment.yaml` manifest:
 ```
 kubectl apply -f ollama-deployment.yaml
 ```
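The exact manifest is provided with this repository. Structurally, such a deployment boils down to something like the sketch below; the image tag, resource requests, and tolerations are illustrative and may differ from the actual file:
```
apiVersion: apps/v1
kind: Deployment
metadata:
  name: ollama-deployment
spec:
  replicas: 1
  selector:
    matchLabels:
      app: ollama
  template:
    metadata:
      labels:
        app: ollama
    spec:
      containers:
      - name: ollama
        image: ollama/ollama:latest
        ports:
        - containerPort: 11434    # Ollama's default API port
        resources:
          limits:
            nvidia.com/gpu: 1     # requests the A10 GPU of the node pool
      tolerations:
      - key: nvidia.com/gpu
        operator: Exists
        effect: NoSchedule        # matches the taint shown on the GPU nodes
```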
@@ -129,14 +127,13 @@ kubectl get all
 
 ### Pulling the model from the pod
 
-Enter the container:
+The `ollama` image does not come with any models. Therefore, it is necessary to download one manually. Enter the container:
 ```
 kubectl exec -ti ollama-deployment-pod -- /bin/bash
 ```
 where `ollama-deployment-pod` is the name of the pod displayed by the `kubectl get pods` command.
-Check Ollama installation and pull desired model(s), here Mistral 7B version 0.3 from Mistral AI, simply referred to as `mistral`:
+Pull the desired model(s), here Mistral 7B version 0.3, simply referred to as `mistral`:
 ```
-ollama --version (optional)
 ollama pull mistral
 ```
 For more model options, the full list of supported models can be found [here](https://ollama.com/search).
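You can optionally confirm that the download completed by listing the models available locally:
```
ollama list
```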
@@ -157,9 +154,9 @@ One of Mistral AI's most notable projects is "La Mesure," a large-scale French l
 Exit the container by simply typing `exit`.
 
 
-### Creating Ollama service
+### Creating an Ollama service
 
-The Ollama service can be created using the `ollama-service.yaml` manifest:
+A Service is necessary to make the model accessible from outside of the node. The Ollama service (a load balancer with a public IP address) can be created using the `ollama-service.yaml` manifest:
 ```
 kubectl apply -f ollama-service.yaml
 ```
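In essence, the manifest declares a `LoadBalancer` Service that forwards an external port to Ollama's default API port 11434. A minimal sketch, in which the service name, labels, and external port 80 are assumptions rather than the exact repository file:
```
apiVersion: v1
kind: Service
metadata:
  name: ollama
spec:
  type: LoadBalancer      # provisions an OCI load balancer with a public IP
  selector:
    app: ollama
  ports:
  - port: 80              # external port later referenced by OLLAMA_BASE_URL
    targetPort: 11434     # Ollama's default API port
```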
@@ -168,32 +165,41 @@ kubectl apply -f ollama-service.yaml
 
 ### Creating Open WebUI deployment
 
-Open WebUI can be deployed using the `openwebui-deployment.yaml` manifest:
+Open WebUI is a user-friendly, self-hosted AI platform that supports multiple LLM runners, including Ollama. It can be deployed using the `openwebui-deployment.yaml` manifest. First, set the `OLLAMA_BASE_URL` value in the manifest and apply it:
 ```
 kubectl apply -f openwebui-deployment.yaml
 ```
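The `OLLAMA_BASE_URL` value must point at the external IP address assigned to the Ollama service created in the previous step. Assuming the service is named `ollama`, as in the sketch above, the address can be retrieved with:
```
kubectl get service ollama \
  -o jsonpath='{.status.loadBalancer.ingress[0].ip}'
```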
 
 ### Creating Open WebUI service
 
-The Open WebUI service can be created using the `openwebui-service.yaml` manifest:
+Like Ollama, Open WebUI requires a Service (a load balancer with a public IP address) to be reachable from outside the cluster. The Open WebUI service can be created using the `openwebui-service.yaml` manifest:
 ```
 kubectl apply -f openwebui-service.yaml
 ```
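This Service follows the same pattern as the Ollama one, exposing an external port (81, the port used in the next section) and targeting container port 8080 of the Open WebUI pod. Again, a sketch rather than the exact repository file:
```
apiVersion: v1
kind: Service
metadata:
  name: openwebui
spec:
  type: LoadBalancer      # public IP used to reach the web interface
  selector:
    app: openwebui
  ports:
  - port: 81              # browse to http://<EXTERNAL-IP>:81
    targetPort: 8080      # containerPort defined in openwebui-deployment.yaml
```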
182179

183180
## Testing the platform
184181

185-
Check that everything is running:
182+
An easy way to check that everything is running is to run the following command:
186183
```
187184
kubectl get all
188185
```
189-
Go to `http://xxx.xxx.xxx.xxx:81` where xxx.xxx.xxx.xxx is the external IP address of the Open WebUI load balancer and click on `Get started` and create admin account (local).
186+
Go to `http://XXX.XXX.XXX.XXX:81` where XXX.XXX.XXX.XXX is the external IP address of the Open WebUI load balancer and click on `Get started` and create admin account (local).
190187

191-
If no model can be found, go to `Profile > Settings > Admin Settings > Connections > Manage Ollama API Connections`. Verify that the Ollama address matches the Ollama service load balancer external IP address and check the connection by clicking on the `Configure icon > Verify Connection`.
188+
If no model can be found, go to `Profile > Settings > Admin Settings > Connections > Manage Ollama API Connections` and verify that the Ollama address matches the Ollama service load balancer external IP address and check the connection by clicking on the `Configure icon > Verify Connection`.
192189

193190
You can now start chatting with the model.
194191

195192
![Open WebUI workspace illustration](assets/images/open-webui-workspace.png "Open WebUI workspace")
196193

194+
## Deleting the platform
195+
196+
If you want to delete all the platform, first delete all the resources deployed in the OKE cluster:
197+
```
198+
kubectl delete all --all
199+
```
200+
Then, the OKE cluster can be deleted from the OCI console.
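If you prefer the command line, the cluster itself can also be removed with the OCI CLI, where the cluster OCID is the one shown on the cluster details page:
```
oci ce cluster delete --cluster-id <cluster-OCID>
```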
 
 ## External links
 
 * [Mistral AI official website](https://mistral.ai/)
2 image files updated (−20.6 KB and −307 KB)

cloud-infrastructure/ai-infra-gpu/ai-infrastructure/ollama-openwebui-mistral/assets/scripts/openwebui-deployment.yaml

Lines changed: 1 addition & 1 deletion
@@ -19,4 +19,4 @@ spec:
         - containerPort: 8080
         env:
         - name: OLLAMA_BASE_URL
-          value: "http://129.159.241.81:80"
+          value: "http://XXX.XXX.XXX.XXX:80"
