Commit 02ffc00

Merge pull request #1752 from jasonrandrews/review
Reviewing Ollama on GKE
2 parents e88ae75 + c93dd28 commit 02ffc00

File tree

5 files changed

+92
-83
lines changed


content/learning-paths/servers-and-cloud-computing/multiarch_ollama_on_gke/0-spin_up_gke_cluster.md

Lines changed: 10 additions & 6 deletions
@@ -8,11 +8,11 @@ layout: learningpathall

## Project overview

-Arm CPUs are widely used in traditional AI/ML use cases. In this Learning Path, you learn how to run [Ollama](https://ollama.com/) on Arm-based CPUs in a hybrid architecture (amd64 and arm64) K8s cluster.
+Arm CPUs are widely used in AI/ML use cases. In this Learning Path, you will learn how to run [Ollama](https://ollama.com/) on Arm-based CPUs in a hybrid architecture (amd64 and arm64) K8s cluster.

To demonstrate this, you can bring up an initial Kubernetes cluster (depicted as "*1. Initial Cluster (amd64)*" in the image below) with an amd64 node running an Ollama Deployment and Service.

-Next, as depicted by "*2. Hybrid Cluster amd64/arm64*", you'll add the arm64 node, and apply an arm64 Deployment and Service to it, so that you can now test both architectures together, and separately, to investigate performance.
+Next, as depicted by "*2. Hybrid Cluster amd64/arm64*", you'll add the arm64 node, and apply an arm64 deployment and service to it, so that you can now test both architectures together, and separately, to investigate performance.

When you are satisfied with the arm64 performance over amd64, it's easy to delete the amd64-specific node, deployment, and service, to complete the migration, as depicted in "*3. Migrated Cluster (arm64)*".

@@ -52,14 +52,14 @@ Although this will work in all regions and zones where C4 and C4a instance types

10. For *Machine Type*, select *c4-standard-4*

{{% notice Note %}}
-The chosen node types support only one pod per node. If you wish to run multiple pods per node, assume each node should provide about 10GB memory per pod.
+The chosen node types support only one pod per node. If you wish to run multiple pods per node, each node should provide about 10GB memory per pod.
{{% /notice %}}
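The sizing rule in the note above can be sketched as a quick back-of-the-envelope calculation. This is a hypothetical helper, not part of the commit; the 10GB-per-pod budget comes from the note, and the memory values in the example are placeholders, so check the actual memory of your chosen machine type:

```python
# Hypothetical sizing helper: the note above budgets roughly
# 10 GB of node memory per Ollama pod.
def max_pods_per_node(node_memory_gb: float, gb_per_pod: float = 10.0) -> int:
    """How many pods fit on a node under the per-pod memory budget."""
    return max(1, int(node_memory_gb // gb_per_pod))

# Example values only -- not the real memory of c4-standard-4.
print(max_pods_per_node(16))  # -> 1
print(max_pods_per_node(32))  # -> 3
```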

![Configure amd64 node type](images/configure-x86-note-type.png)

11. *Click* the *Create* button at the bottom of the screen.

-It will take a few moments, but when the green checkmark is showing next to the *ollama-on-multiarch* cluster, you're ready to continue to test your connection to the cluster.
+It will take a few moments, but when the green checkmark is showing next to the `ollama-on-multiarch` cluster, you're ready to continue to test your connection to the cluster.

### Connect to the cluster

@@ -75,19 +75,23 @@ export CLUSTER_NAME=ollama-on-multiarch
export PROJECT_ID=YOUR_PROJECT_ID
gcloud container clusters get-credentials $CLUSTER_NAME --zone $ZONE --project $PROJECT_ID
```
+
If you get the message:

-```commandline
+```output
CRITICAL: ACTION REQUIRED: gke-gcloud-auth-plugin, which is needed for continued use of kubectl, was not found or is not executable. Install gke-gcloud-auth-plugin for use with kubectl by following https://cloud.google.com/kubernetes-engine/docs/how-to/cluster-access-for-kubectl#install_plugin
```
+
This command should help resolve it:

```bash
gcloud components install gke-gcloud-auth-plugin
```
+
Finally, test the connection to the cluster with this command:

```commandline
kubectl cluster-info
```
-If you receive a non-error response, you're successfully connected to the k8s cluster!
+
+If you receive a non-error response, you're successfully connected to the K8s cluster.
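The connection test can also be scripted. The sketch below is hypothetical and not part of the commit; it shows the kind of check a wrapper might apply to captured `kubectl cluster-info` output, and the sample strings are illustrative:

```python
# Hypothetical success check for `kubectl cluster-info` output:
# a reachable cluster prints "... is running at <URL>" lines,
# while a failure prints a connection error instead.
def cluster_reachable(cluster_info_output: str) -> bool:
    lowered = cluster_info_output.lower()
    return "is running at" in lowered and "unable to connect" not in lowered

ok = "Kubernetes control plane is running at https://34.0.0.1"
bad = "Unable to connect to the server: dial tcp: no such host"
print(cluster_reachable(ok))   # -> True
print(cluster_reachable(bad))  # -> False
```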

content/learning-paths/servers-and-cloud-computing/multiarch_ollama_on_gke/1-deploy-amd64.md

Lines changed: 6 additions & 6 deletions
@@ -77,13 +77,13 @@ spec:

When the above is applied:

-* A new Deployment called `ollama-amd64-deployment` is created. This deployment pulls a multi-architecture [Ollama image](https://hub.docker.com/layers/ollama/ollama/0.6.1/images/sha256-28b909914d4e77c96b1c57dea199c60ec12c5050d08ed764d9c234ba2944be63) from DockerHub.
+* A new deployment called `ollama-amd64-deployment` is created. This deployment pulls a multi-architecture [Ollama image](https://hub.docker.com/layers/ollama/ollama/0.6.1/images/sha256-28b909914d4e77c96b1c57dea199c60ec12c5050d08ed764d9c234ba2944be63) from DockerHub.

Of particular interest is the `nodeSelector` `kubernetes.io/arch`, with the value of `amd64`. This ensures that the deployment only runs on amd64 nodes, utilizing the amd64 version of the Ollama container image.

-* A new load balancer Service `ollama-amd64-svc` is created, which targets all pods with the `arch: amd64` label (the amd64 deployment creates these pods).
+* A new load balancer service `ollama-amd64-svc` is created, which targets all pods with the `arch: amd64` label (the amd64 deployment creates these pods).

-A `sessionAffinity` tag is added to this Service to remove sticky connections to the target pods. This removes persistent connections to the same pod on each request.
+A `sessionAffinity` tag is added to this service to remove sticky connections to the target pods. This removes persistent connections to the same pod on each request.

### Apply the amd64 deployment and service

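The effect of the `sessionAffinity` setting described above can be pictured with a toy routing model. This is illustrative only and not part of the commit; real Service load balancing is more involved than round-robin:

```python
import itertools

# Toy model of Service routing: with session affinity, repeat requests
# from one client stay pinned to a single pod; without it, requests
# spread across pods (modeled here as simple round-robin).
def route_requests(pods, n_requests, sticky):
    if sticky:
        return [pods[0]] * n_requests
    cycle = itertools.cycle(pods)
    return [next(cycle) for _ in range(n_requests)]

pods = ["ollama-amd64-pod-a", "ollama-amd64-pod-b"]
print(route_requests(pods, 4, sticky=True))   # every request hits pod-a
print(route_requests(pods, 4, sticky=False))  # requests alternate pods
```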
@@ -134,7 +134,7 @@ When the pods show `Running` and the service shows a valid `External IP`, you ar

{{% notice Note %}}
The following utility `model_util.sh` is provided for convenience.

-It's a wrapper for kubectl, utilizing the utilities [curl](https://curl.se/), [jq](https://jqlang.org/), [bc](https://www.gnu.org/software/bc/), and [stdbuf](https://www.gnu.org/software/coreutils/manual/html_node/stdbuf-invocation.html).
+It's a wrapper for kubectl, utilizing [curl](https://curl.se/), [jq](https://jqlang.org/), [bc](https://www.gnu.org/software/bc/), and [stdbuf](https://www.gnu.org/software/coreutils/manual/html_node/stdbuf-invocation.html).

Make sure you have these shell utilities installed before running.
{{% /notice %}}
@@ -248,7 +248,7 @@ The script conveniently bundles many test and logging commands into a single pla
./model_util.sh amd64 hello
```

-You get back the HTTP response, as well as the logline from the pod that served it:
+You get back the HTTP response, as well as the log line from the pod that served it:

```output
Server response:
@@ -260,6 +260,6 @@ Pod log output:

[pod/ollama-amd64-deployment-cbfc4b865-msftf/ollama-multiarch] 2025-03-25T21:13:49.022522588Z
```

-If you see the output `Ollama is running` you have successfully bootstrapped your GKE cluster with an amd64 node, running a deployment with the Ollama multi-architecture container instance!
+If you see the output `Ollama is running`, you have successfully bootstrapped your GKE cluster with an amd64 node, running a deployment with the Ollama multi-architecture container instance.

Continue to the next section to do the same thing, but with an Arm node.

content/learning-paths/servers-and-cloud-computing/multiarch_ollama_on_gke/2-deploy-arm64.md

Lines changed: 32 additions & 29 deletions
@@ -6,10 +6,9 @@ weight: 4
layout: learningpathall
---

-## Overview
-At this point you have a what many people in their K8s Arm journey start with -- a workload running on an amd64 cluster. As mentioned earlier, the easiest way to experiment with Arm in your K8s cluster is to run both architectures simultaneously, not just for the sake of learning how to do it, but also to see first-hand the price/performance advantages of running Arm-based nodes.
+You have reached the point from which most projects start investigating migration to Arm. You have a workload running on an amd64 cluster and you want to evaluate the benefits of Arm.

-Next, you'll add an Arm-based node pool to the cluster, and from there, apply an ollama Arm deployment and service to mimic what we did in the last chapter.
+In this section, you will add an Arm-based node pool to the cluster, and apply an Ollama Arm deployment and service to mimic what you did in the previous section.

### Adding the arm64-pool node pool

@@ -27,29 +26,30 @@ To add Arm nodes to the cluster:
7. Select *C4A* : *c4a-standard-4* for Machine *Configuration/Type*.

{{% notice Note %}}
-To make an apples-to-apples comparison of amd64 and arm64 performance, the c4a-standard-4 is spun up as the arm64 "equivalent" of the previously deployed c4-standard-4 in the amd64 node pool.
+To compare amd64 and arm64 performance, the c4a-standard-4 is used as the arm64 equivalent of the previously deployed c4-standard-4 in the amd64 node pool.
{{% /notice %}}

![YAML Overview](images/arm_node_config-2.png)

8. Select *Create*
9. After provisioning completes, select the newly created *arm64-pool* from the *Clusters* screen to take you to the *Node pool details* page.

-Note the taint GKE applies by default to the Arm Node of *NoSchedule* if arch=arm64:
+Notice the taint below that GKE applies by default to the Arm node of `NoSchedule` if `arch=arm64`:

![arm node taint](images/taint_on_arm_node.png)

-Without a toleration for this taint, we won't be able to schedule any workloads on it! But do not fear, as the nodeSelector in the amd64 (and as you will shortly see, the arm64) Deployment YAMLs not only defines which architecture to target, [but in the arm64 use case](https://cloud.google.com/kubernetes-engine/docs/how-to/prepare-arm-workloads-for-deployment#schedule-with-node-selector-arm), it also adds the required toleration automatically.
+Without a toleration for this taint, you won't be able to schedule any workloads on it. The nodeSelector in the amd64 (and as you will shortly see, the arm64) deployment YAMLs not only defines which architecture to target, [but in the arm64 use case](https://cloud.google.com/kubernetes-engine/docs/how-to/prepare-arm-workloads-for-deployment#schedule-with-node-selector-arm), it also adds the required toleration automatically.

```yaml
nodeSelector:
-  kubernetes.io/arch: arm64 # or amd64
+  kubernetes.io/arch: arm64
```
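The taint-and-toleration mechanics described above can be sketched as a toy scheduling check. This is illustrative only and not part of the commit; the real Kubernetes scheduler evaluates far more (resources, affinity rules, taint effects), but the core match is label selection plus taint toleration:

```python
# Toy scheduling check: a pod fits a node when its nodeSelector matches
# the node's labels and it tolerates every taint on the node.
def can_schedule(pod: dict, node: dict) -> bool:
    selector_ok = all(node["labels"].get(k) == v
                      for k, v in pod["nodeSelector"].items())
    taints_ok = all(t in pod["tolerations"] for t in node["taints"])
    return selector_ok and taints_ok

arm_node = {
    "labels": {"kubernetes.io/arch": "arm64"},
    "taints": [("kubernetes.io/arch", "arm64", "NoSchedule")],
}

# With the toleration (which GKE adds automatically for an arm64
# nodeSelector), the pod schedules; without it, the taint blocks it.
tolerant = {"nodeSelector": {"kubernetes.io/arch": "arm64"},
            "tolerations": [("kubernetes.io/arch", "arm64", "NoSchedule")]}
intolerant = {"nodeSelector": {"kubernetes.io/arch": "arm64"},
              "tolerations": []}

print(can_schedule(tolerant, arm_node))    # -> True
print(can_schedule(intolerant, arm_node))  # -> False
```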

-### Deployment and Service
-We can now apply the arm64-based deployment.
+### Deployment and service

-1. Copy the following YAML, and save it to a file called arm64_ollama.yaml:
+You can now apply the arm64-based deployment.
+
+1. Use a text editor to copy the following YAML, and save it to a file called `arm64_ollama.yaml`:

```yaml
apiVersion: apps/v1
@@ -121,40 +121,40 @@ spec:

When the above is applied:

-* A new Deployment called *ollama-arm64-deployment* is created. Like the amd64 deployment, it pulls the same multi-architectural (both amd64 and arm64) image from Dockerhub [ollama image from Dockerhub](https://hub.docker.com/layers/ollama/ollama/0.6.1/images/sha256-28b909914d4e77c96b1c57dea199c60ec12c5050d08ed764d9c234ba2944be63).
+* A new deployment called `ollama-arm64-deployment` is created. Like the amd64 deployment, it pulls the same multi-architecture image from DockerHub.

-Of particular interest is the *nodeSelector* *kubernetes.io/arch*, with the value of *arm64*. This will ensure that this deployment only runs on arm64-based nodes, utilizing the arm64 layer of the ollama multi-architecture container image. As mentioned earlier, this *nodeSelector* triggers the automatic creation of the toleration for the arm64 nodes.
+Of particular interest is the `nodeSelector` `kubernetes.io/arch`, with the value of `arm64`. This ensures that the deployment runs on arm64-based nodes, utilizing the arm64 layer of the Ollama multi-architecture container image. The `nodeSelector` triggers the automatic creation of the toleration for the arm64 nodes.

-* Two new load balancer Services are created. The first, *ollama-arm64-svc* is created, analogous to the existing service, targets all pods with the *arch: arm64* label (our arm64 deployment creates these pods.) The second service, *ollama-multiarch-svc*, target ALL Pods, regardless of the architecture they are running. This service will show us how we can mix and match pods in production to serve the same app regardless of node/pod architecture.
+* Two new load balancer services are created. The first, `ollama-arm64-svc`, is analogous to the existing service and targets all pods with the `arch: arm64` label (the arm64 deployment creates these pods). The second service, `ollama-multiarch-svc`, targets all pods, regardless of the architecture. This service shows how you can mix and match pods in production to serve the same application regardless of node/pod architecture.

-You may also notice that a *sessionAffinity* tag was added to this Service to remove sticky connections to the target pods; this removes persistent connections to the same pod on each request.
+A `sessionAffinity` tag is added to this service to remove sticky connections to the target pods. This removes persistent connections to the same pod on each request.

133133
### Apply the arm64 Deployment and Service
134134

135-
1. Run the following command to apply the arm64 deployment, and service definitions:
135+
1. Run the following command to apply the arm64 deployment and service definitions:
136136

137137
```bash
138138
kubectl apply -f arm64_ollama.yaml
139139
```
140140

141-
You should get the following responses back:
141+
You see the following responses:
142142

143-
```bash
143+
```output
144144
deployment.apps/ollama-arm64-deployment created
145145
service/ollama-arm64-svc created
146146
service/ollama-multiarch-svc created
147147
```
148148

149-
2. Get the status of the pods, and the services, by running the following:
149+
2. Get the status of the pods and the services by running the following:
150150

151-
```commandline
151+
```bash
152152
kubectl get nodes,pods,svc -nollama
153153
```
154154

155-
Your output should be similar to the following, showing two nodes, two pods, and three services:
155+
Your output is similar to the following, showing two nodes, two pods, and three services:
156156

157-
```commandline
157+
```output
158158
NAME STATUS ROLES AGE VERSION
159159
node/gke-ollama-on-arm-amd64-pool-62c0835c-93ht Ready <none> 91m v1.31.6-gke.1020000
160160
node/gke-ollama-on-arm-arm64-pool-2ae0d1f0-pqrf Ready <none> 4m11s v1.31.6-gke.1020000
@@ -169,21 +169,23 @@ service/ollama-arm64-svc LoadBalancer 1.2.3.4 1.2.3.4
service/ollama-multiarch-svc LoadBalancer 1.2.3.4 1.2.3.4 80:30667/TCP 2m52s
```

-When the pods show *Running* and the service shows a valid *External IP*, we're ready to test the ollama arm64 service!
+When the pods show `Running` and the service shows a valid `External IP`, you are ready to test the Ollama arm64 service.

-### Test the ollama on arm web service
+### Test the Ollama web service on arm64

-To test the service, use the previously created model_util.sh from the last section; instead of the *amd64* parameter, replace it with *arm64*:
+To test the service, use the previously created `model_util.sh` from the previous section.
+
+Instead of the `amd64` parameter, replace it with `arm64`:

3. Run the following to make an HTTP request to the arm64 Ollama service on port 80:

-```commandline
+```bash
./model_util.sh arm64 hello
```

-You should get back the HTTP response, as well as the logline from the pod that served it:
+You get back the HTTP response, as well as the log line from the pod that served it:

-```commandline
+```output
Server response:
Using service endpoint 34.44.135.90 for hello on arm64
Ollama is running
@@ -192,6 +194,7 @@ Pod log output:

[pod/ollama-arm64-deployment-678dc8556f-956d6/ollama-multiarch] 2025-03-25T21:25:21.547384356Z
```

-Once again, we're looking for "Ollama is running". If you see that, congrats, you've successfully setup your GKE cluster with both amd64 and arm64 nodes and pods running a Deployment with the ollama multi-architecture container!
-Next, let's do some simple analysis of the cluster's performance.
+Once again, if you see "Ollama is running" then you have successfully set up your GKE cluster with both amd64 and arm64 nodes and pods running a deployment with the Ollama multi-architecture container.
+
+Continue to the next section to analyze the performance.
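As a preview of that analysis, comparing the two architectures often reduces to summarizing per-service request latencies. A minimal sketch follows; the service names match this Learning Path, but the numbers are placeholders (not measurements) and the aggregation is a hypothetical example:

```python
from statistics import mean

# Placeholder latency samples in seconds -- not real measurements.
latencies = {
    "ollama-amd64-svc": [0.92, 0.88, 0.95],
    "ollama-arm64-svc": [0.90, 0.86, 0.93],
}

# Summarize the mean latency per service for a side-by-side view.
for svc, samples in latencies.items():
    print(f"{svc}: mean {mean(samples):.2f}s over {len(samples)} requests")
```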
