File changed: `content/learning-paths/servers-and-cloud-computing/multiarch_ollama_on_gke/1-deploy-amd64.md` (33 additions, 32 deletions)
weight: 3
layout: learningpathall
---
## Overview
In this section, you'll bootstrap the cluster with Ollama on amd64 to simulate an existing Kubernetes cluster running Ollama. In the next section, you'll add arm64 nodes alongside the amd64 nodes so you can compare the two.
### Deployment and service
1. Use a text editor to copy the following YAML and save it to a file called `namespace.yaml`:
```yaml
apiVersion: v1
kind: Namespace
metadata:
  name: ollama
```
When the above is applied, a new K8s namespace named `ollama` is created. This is where all the K8s objects created in this Learning Path will live.
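If you want to confirm the namespace exists once you run the apply step later in this section, a quick optional check (not part of the original steps) is:

```bash
# Verify the ollama namespace was created (run after the apply step below)
kubectl get namespace ollama
```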
2. Use a text editor to copy the following YAML and save it to a file called `amd64_ollama.yaml`:
```yaml
apiVersion: apps/v1
# NOTE: the diff omits most of this manifest. The body below is a
# reconstructed sketch based on the description that follows; the names,
# labels, image, and ports match the text, but minor details may differ
# from the original file.
kind: Deployment
metadata:
  name: ollama-amd64-deployment
  namespace: ollama
spec:
  replicas: 1
  selector:
    matchLabels:
      arch: amd64
  template:
    metadata:
      labels:
        arch: amd64
    spec:
      nodeSelector:
        kubernetes.io/arch: amd64
      containers:
      - name: ollama
        image: ollama/ollama:0.6.1
        ports:
        - containerPort: 11434   # Ollama's default listen port
---
apiVersion: v1
kind: Service
metadata:
  name: ollama-amd64-svc
  namespace: ollama
spec:
  type: LoadBalancer
  sessionAffinity: None   # avoid pinning requests to a single pod
  selector:
    arch: amd64
  ports:
  - protocol: TCP
    port: 80
    targetPort: 11434
```

When the above is applied:
* A new Deployment called `ollama-amd64-deployment` is created. This deployment pulls a multi-architecture [Ollama image](https://hub.docker.com/layers/ollama/ollama/0.6.1/images/sha256-28b909914d4e77c96b1c57dea199c60ec12c5050d08ed764d9c234ba2944be63) from Docker Hub.
Of particular interest is the `nodeSelector` `kubernetes.io/arch`, with the value `amd64`. This ensures that the deployment only runs on amd64 nodes, using the amd64 version of the Ollama container image. You can inspect your nodes' architecture labels with the first check shown after this list.
* A new load balancer Service `ollama-amd64-svc` is created, which targets all pods with the `arch: amd64` label (the amd64 deployment creates these pods).
A `sessionAffinity` setting is added to this Service to remove sticky connections to the target pods, so consecutive requests are not pinned to the same pod.
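To see how these selectors line up on a live cluster, the following optional checks (not part of the original steps) list the architecture label on each node and, once the deployment has been applied, the pods the Service targets:

```bash
# Show each node with its kubernetes.io/arch label
kubectl get nodes -L kubernetes.io/arch

# After applying, list the pods carrying the arch=amd64 label
kubectl get pods -n ollama -l arch=amd64 -o wide
```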
### Apply the amd64 deployment and service
1. Run the following commands to apply the namespace, deployment, and service definitions:
```bash
kubectl apply -f namespace.yaml
kubectl apply -f amd64_ollama.yaml
```
You see the following responses:
```output
namespace/ollama created
deployment.apps/ollama-amd64-deployment created
service/ollama-amd64-svc created
```
2. Optionally, set the default namespace to `ollama` so you don't need to specify the namespace each time, by entering the following:
```bash
kubectl config set-context --current --namespace=ollama
```
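To confirm the context change took effect, one optional way (assuming a standard kubectl setup) is to print the namespace recorded in your current context:

```bash
# Print the namespace set on the current kubectl context
kubectl config view --minify --output 'jsonpath={..namespace}'
```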
3. Get the status of the nodes, pods, and services by running the following:
```bash
kubectl get nodes,pods,svc -nollama
```
Your output shows one node, one pod, and one service.

When the pods show `Running` and the service shows a valid `External IP`, you are ready to test the Ollama amd64 service!
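If you'd rather block until the pod is ready than re-run the status command, a `kubectl wait` one-liner (an optional convenience, not part of the original steps) does this:

```bash
# Block until all pods in the ollama namespace are Ready (up to 5 minutes)
kubectl wait --for=condition=Ready pods --all -n ollama --timeout=300s
```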
### Test the Ollama web service on amd64
{{% notice Note %}}
The following utility, `model_util.sh`, is provided for convenience.
It's a shell wrapper for `kubectl`, utilizing the utilities [curl](https://curl.se/), [jq](https://jqlang.org/), [bc](https://www.gnu.org/software/bc/), and [stdbuf](https://www.gnu.org/software/coreutils/manual/html_node/stdbuf-invocation.html).
Make sure you have these shell utilities installed before running.
{{% /notice %}}
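If any of these utilities are missing, one way to install them on a Debian or Ubuntu machine (an assumption; use your platform's package manager otherwise) is:

```bash
# curl, jq, and bc are separate packages; stdbuf ships with GNU coreutils
sudo apt-get update
sudo apt-get install -y curl jq bc coreutils
```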
4. Use a text editor to copy the following shell script and save it to a file called `model_util.sh`:
5. Make the script executable with the following command:
```bash
chmod 755 model_util.sh
```
The script conveniently bundles many test and logging commands into a single place, making it easy to test, troubleshoot, and view the services exposed in this Learning Path.
6. Run the following to make an HTTP request to the amd64 Ollama service on port 80:
```bash
./model_util.sh amd64 hello
```
You get back the HTTP response, as well as the log line from the pod that served it:
```output
Server response:
Using service endpoint 34.55.25.101 for hello on amd64
Ollama is running
```

If you see the output `Ollama is running`, you have successfully bootstrapped your GKE cluster with an amd64 node, running a deployment with the Ollama multi-architecture container instance!
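As a cross-check without the helper script, you can make the same request directly. The sketch below assumes the Service's external IP is reachable from your machine:

```bash
# Look up the Service's external IP, then hit Ollama's root endpoint,
# which responds with "Ollama is running"
EXTERNAL_IP=$(kubectl get svc ollama-amd64-svc -n ollama \
  -o jsonpath='{.status.loadBalancer.ingress[0].ip}')
curl "http://${EXTERNAL_IP}/"
```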
Continue to the next section to do the same thing, but with an Arm node.