
Commit e88ae75

Merge pull request #1750 from jasonrandrews/review

Reviewing Ollama on GKE

2 parents 45a55ea + 1280ec6

File tree

1 file changed: 33 additions, 32 deletions

  • content/learning-paths/servers-and-cloud-computing/multiarch_ollama_on_gke/1-deploy-amd64.md
@@ -6,14 +6,11 @@ weight: 3
 layout: learningpathall
 ---
 
-## Overview
+In this section, you'll bootstrap the cluster with Ollama on amd64, to simulate an "existing" K8s cluster running Ollama. In the next section, you will add arm64 nodes alongside the amd64 nodes so you can compare them.
 
-An easy way to experiment with Arm64 nodes in your K8s cluster is to deploy Arm64 nodes and pods alongside your existing amd64 node and pods. In this section of the tutorial, you'll bootstrap the cluster with Ollama on amd64, to simulate an "existing" K8s cluster running Ollama.
+### Deployment and service
 
-### Deployment and Service
-
-
-1. Copy the following YAML, and save it to a file called *namespace.yaml*:
+1. Use a text editor to copy the following YAML and save it to a file called `namespace.yaml`:
 
 ```yaml
 apiVersion: v1
@@ -22,9 +19,9 @@ metadata:
   name: ollama
 ```
 
-When the above is applied, a new K8s namespace named *ollama* will be created. This is where all the K8s objects created under this tutorial will live.
+When the above is applied, a new K8s namespace named `ollama` is created. This is where all the K8s objects will live.
 
-2. Copy the following YAML, and save it to a file called *amd64_ollama.yaml*:
+2. Use a text editor to copy the following YAML and save it to a file called `amd64_ollama.yaml`:
 
 ```yaml
 apiVersion: apps/v1
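
For orientation, the diff shows `namespace.yaml` only in fragments. A complete manifest consistent with the visible lines is only a few lines long; the sketch below is an assumption, with the `kind: Namespace` line inferred from context rather than taken from the diff.

```yaml
# Sketch of the full namespace.yaml; kind is inferred, the rest appears in the diff.
apiVersion: v1
kind: Namespace
metadata:
  name: ollama
```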
@@ -80,45 +77,46 @@ spec:
 
 When the above is applied:
 
-* A new Deployment called *ollama-amd64-deployment* is created. This deployment pulls a multi-architectural (both amd64 and arm64) [ollama image from Dockerhub](https://hub.docker.com/layers/ollama/ollama/0.6.1/images/sha256-28b909914d4e77c96b1c57dea199c60ec12c5050d08ed764d9c234ba2944be63).
+* A new Deployment called `ollama-amd64-deployment` is created. This deployment pulls a multi-architecture [Ollama image](https://hub.docker.com/layers/ollama/ollama/0.6.1/images/sha256-28b909914d4e77c96b1c57dea199c60ec12c5050d08ed764d9c234ba2944be63) from Docker Hub.
 
-Of particular interest is the *nodeSelector* *kubernetes.io/arch*, with the value of *amd64*. This will ensure that this deployment only runs on amd64-based nodes, utilizing the amd64 version of the Ollama container image.
+Of particular interest is the `nodeSelector` `kubernetes.io/arch`, with the value of `amd64`. This ensures that the deployment only runs on amd64 nodes, utilizing the amd64 version of the Ollama container image.
 
-* A new load balancer Service *ollama-amd64-svc* is created, which targets all pods with the *arch: amd64* label (our amd64 deployment creates these pods).
+* A new load balancer Service `ollama-amd64-svc` is created, which targets all pods with the `arch: amd64` label (the amd64 deployment creates these pods).
 
-A *sessionAffinity* tag was added to this Service to remove sticky connections to the target pods; this removes persistent connections to the same pod on each request.
+A `sessionAffinity` tag is added to this Service to remove sticky connections to the target pods. This removes persistent connections to the same pod on each request.
 
-### Apply the amd64 Deployment and Service
+### Apply the amd64 deployment and service
 
-1. Run the following command to apply the namespace, deployment, and service definitions:
+1. Run the following commands to apply the namespace, deployment, and service definitions:
 
 ```bash
 kubectl apply -f namespace.yaml
 kubectl apply -f amd64_ollama.yaml
 ```
 
-You should get the following responses back:
+You see the following responses:
 
-```bash
+```output
 namespace/ollama created
 deployment.apps/ollama-amd64-deployment created
 service/ollama-amd64-svc created
 ```
-2. Optionally, set the *default Namespace* to *ollama* so you don't need to specify the namespace each time, by entering the following:
+
+2. Optionally, set the default namespace to `ollama` so you don't need to specify the namespace each time, by entering the following:
 
 ```bash
 kubectl config set-context --current --namespace=ollama
 ```
 
-3. Get the status of the pods, and the services, by running the following:
+3. Get the status of the pods and the services by running the following:
 
-```commandline
+```bash
 kubectl get nodes,pods,svc -nollama
 ```
 
-Your output should be similar to the following, showing one node, one pod, and one service:
+Your output is similar to the following, showing one node, one pod, and one service:
 
-```commandline
+```output
 NAME                                              STATUS   ROLES    AGE   VERSION
 node/gke-ollama-on-arm-amd64-pool-62c0835c-93ht   Ready    <none>   77m   v1.31.6-gke.1020000
 
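
The hunks above elide the body of `amd64_ollama.yaml`, so the fields the prose highlights are worth seeing in shape. The following is a hypothetical reconstruction based only on what the text states (the deployment name, the `nodeSelector`, the `arch: amd64` label, the LoadBalancer Service, and the `sessionAffinity` setting); the replica count, image tag, and port numbers are assumptions, not taken from the diff.

```yaml
# Hypothetical fragments of amd64_ollama.yaml, reconstructed from the prose.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: ollama-amd64-deployment
  namespace: ollama
spec:
  replicas: 1                        # assumption; not stated in the text
  selector:
    matchLabels:
      arch: amd64
  template:
    metadata:
      labels:
        arch: amd64                  # the Service targets this label
        app: ollama-multiarch        # used later by kubectl logs -l app=ollama-multiarch
    spec:
      nodeSelector:
        kubernetes.io/arch: amd64    # pin pods to amd64 nodes
      containers:
      - name: ollama-multiarch
        image: ollama/ollama:0.6.1   # multi-architecture image from Docker Hub
        ports:
        - containerPort: 11434       # Ollama's default port (assumption)
---
apiVersion: v1
kind: Service
metadata:
  name: ollama-amd64-svc
  namespace: ollama
spec:
  type: LoadBalancer
  selector:
    arch: amd64
  sessionAffinity: None              # no sticky connections to a single pod
  ports:
  - port: 80                         # matches the 80:30668/TCP mapping shown above
    targetPort: 11434
```

The `sessionAffinity: None` choice matches the prose: without stickiness, each request can be served by a different pod.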
@@ -129,16 +127,19 @@ NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S)
 service/ollama-amd64-svc   LoadBalancer   1.2.2.3   1.2.3.4   80:30668/TCP   16m
 ```
 
-When the pods show *Running* and the service shows a valid *External IP*, we're ready to test the Ollama amd64 service!
+When the pods show `Running` and the service shows a valid `External IP`, you are ready to test the Ollama amd64 service!
 
-### Test the Ollama on amd64 web service
+### Test the Ollama web service on amd64
 
 {{% notice Note %}}
-The following utility, model_util.sh, is provided as a convenient utility to accompany this learning path. It's simply a shell wrapper for kubectl, utilizing the utilities [curl](https://curl.se/), [jq](https://jqlang.org/), [bc](https://www.gnu.org/software/bc/), and [stdbuf](https://www.gnu.org/software/coreutils/manual/html_node/stdbuf-invocation.html). Make sure you have these shell utilities installed before running.
-{{% /notice %}}
+The following utility `model_util.sh` is provided for convenience.
 
+It's a wrapper for kubectl, utilizing the utilities [curl](https://curl.se/), [jq](https://jqlang.org/), [bc](https://www.gnu.org/software/bc/), and [stdbuf](https://www.gnu.org/software/coreutils/manual/html_node/stdbuf-invocation.html).
 
-4. Copy the following shell script, and save it to a file called *model_util.sh*:
+Make sure you have these shell utilities installed before running.
+{{% /notice %}}
+
+4. Use a text editor to copy the following shell script and save it to a file called `model_util.sh`:
 
 ```bash
 #!/bin/bash
@@ -233,23 +234,23 @@ echo;kubectl logs --timestamps -l app=ollama-multiarch -nollama --prefix | sor
 echo
 ```
 
-5. Make it executable with the following command:
+5. Make the script executable with the following command:
 
 ```bash
 chmod 755 model_util.sh
 ```
 
-This shell script conveniently bundles many test and logging commands into a single place, making it easy to test, troubleshoot, and view the services we expose in this tutorial.
+The script conveniently bundles many test and logging commands into a single place, making it easy to test, troubleshoot, and view services.
 
 6. Run the following to make an HTTP request to the amd64 Ollama service on port 80:
 
 ```commandline
 ./model_util.sh amd64 hello
 ```
 
-You should get back the HTTP response, as well as the logline from the pod that served it:
+You get back the HTTP response, as well as the logline from the pod that served it:
 
-```commandline
+```output
 Server response:
 Using service endpoint 34.55.25.101 for hello on amd64
 Ollama is running
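
The diff elides nearly all of `model_util.sh`. As an illustration of the wrapper pattern the note describes, a stripped-down sketch might look like the following; the argument names and the `hello` action are taken from the usage shown above, while everything else is an assumption (the real script also uses `jq`, `bc`, and `stdbuf`, which this sketch omits).

```bash
#!/bin/bash
# Hypothetical, stripped-down sketch of the model_util.sh pattern: resolve the
# external IP of the per-architecture Service, curl the Ollama root endpoint,
# and echo the most recent log line from the pods behind it.
arch="$1"     # e.g. amd64
action="$2"   # e.g. hello

# Look up the LoadBalancer IP of the matching Service.
endpoint=$(kubectl get svc "ollama-${arch}-svc" -nollama \
  -o jsonpath='{.status.loadBalancer.ingress[0].ip}')
echo "Using service endpoint ${endpoint} for ${action} on ${arch}"

if [ "${action}" = "hello" ]; then
  # Ollama answers "Ollama is running" on its root path.
  curl -s "http://${endpoint}/"
  echo
fi

echo "Pod log output:"
kubectl logs --timestamps -l app=ollama-multiarch -nollama --prefix --tail=1
```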
@@ -259,6 +260,6 @@ Pod log output:
 [pod/ollama-amd64-deployment-cbfc4b865-msftf/ollama-multiarch] 2025-03-25T21:13:49.022522588Z
 ```
 
-Success is defined specifically by seeing the words "Ollama is running". If you see this in your output, then congrats, you've successfully bootstrapped your GKE cluster with an amd64 node, running a Deployment with the Ollama multi-architecture container instance!
+If you see the output `Ollama is running`, you have successfully bootstrapped your GKE cluster with an amd64 node, running a deployment with the Ollama multi-architecture container instance!
 
-Next, we'll do the same thing, but with an Arm node.
+Continue to the next section to do the same thing, but with an Arm node.
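
One small addition for readers following along: after step 2 sets the default namespace, you can confirm the change took effect with a standard kubectl query (this check is not part of the commit).

```bash
# Print the namespace recorded in the current kubectl context.
kubectl config view --minify --output 'jsonpath={..namespace}'
echo
```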
