
Commit d13829e

Merge pull request #2552 from madeline-underwood/kafka
Kafka_JA to sign off
2 parents f898cf3 + fdeb95f commit d13829e

File tree: 6 files changed (+106, -95 lines)


content/learning-paths/servers-and-cloud-computing/kafka-azure/_index.md

Lines changed: 8 additions & 12 deletions
@@ -1,23 +1,19 @@
 ---
-title: Deploy Kafka on the Microsoft Azure Cobalt 100 processors
-
-draft: true
-cascade:
-draft: true
+title: Deploy Apache Kafka on Arm-based Microsoft Azure Cobalt 100 virtual machines
 
 minutes_to_complete: 30
 
-who_is_this_for: This is an advanced topic designed for software developers looking to migrate their Kafka workloads from x86_64 to Arm-based platforms, specifically on the Microsoft Azure Cobalt 100 processors.
+who_is_this_for: This is an advanced topic for developers looking to migrate their Apache Kafka workloads from x86_64 to Arm-based platforms, specifically on Microsoft Azure Cobalt 100 (arm64) virtual machines.
 
 learning_objectives:
-- Provision an Azure Arm64 virtual machine using Azure console, with Ubuntu Pro 24.04 LTS as the base image.
-- Deploy Kafka on the Ubuntu virtual machine.
-- Perform Kafka baseline testing and benchmarking on Arm64 virtual machines.
+- Provision an Azure Arm64 virtual machine using Azure console, with Ubuntu Pro 24.04 LTS as the base image
+- Deploy Kafka on an Ubuntu virtual machine
+- Perform Kafka baseline testing and benchmarking on Arm64 virtual machines
 
 prerequisites:
-- A [Microsoft Azure](https://azure.microsoft.com/) account with access to Cobalt 100 based instances (Dpsv6).
-- Basic understanding of Linux command line.
-- Familiarity with the [Apache Kafka architecture](https://kafka.apache.org/) and deployment practices on Arm64 platforms.
+- A [Microsoft Azure](https://azure.microsoft.com/) account with access to Cobalt 100 based instances (Dpsv6)
+- Basic understanding of Linux command line
+- Familiarity with the [Apache Kafka architecture](https://kafka.apache.org/) and deployment practices on Arm64 platforms
 
 author: Pareena Verma
 
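The first learning objective is provisioning an Azure Arm64 virtual machine. Before deploying Kafka, it is worth confirming that the provisioned instance really is arm64; a minimal sketch (the `is_arm64` helper is hypothetical, not part of the learning path):

```shell
# is_arm64: succeeds when the given machine type is Arm64 (hypothetical helper).
is_arm64() { [ "$1" = "aarch64" ]; }

# On a Cobalt 100 (Dpsv6) instance, `uname -m` should report aarch64.
if is_arm64 "$(uname -m)"; then
  echo "Arm64 VM confirmed"
else
  echo "Not an Arm64 host: $(uname -m)"
fi
```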
content/learning-paths/servers-and-cloud-computing/kafka-azure/background.md

Lines changed: 2 additions & 2 deletions
@@ -8,7 +8,7 @@ layout: "learningpathall"
 
 ## Cobalt 100 Arm-based processor
 
-Azure’s Cobalt 100 is built on Microsoft's first-generation, in-house Arm-based processor: the Cobalt 100. Designed entirely by Microsoft and based on Arm’s Neoverse N2 architecture, this 64-bit CPU delivers improved performance and energy efficiency across a broad spectrum of cloud-native, scale-out Linux workloads. These include web and application servers, data analytics, open-source databases, caching systems, and more. Running at 3.4 GHz, the Cobalt 100 processor allocates a dedicated physical core for each vCPU, ensuring consistent and predictable performance.
+Azure’s Cobalt 100 is built on Microsoft's first-generation, in-house Arm-based processor: the Cobalt 100. Designed entirely by Microsoft and based on Arm’s Neoverse N2 architecture, this 64-bit CPU delivers improved performance and energy efficiency across a broad spectrum of cloud-native, scale-out Linux workloads. These include web and application servers, data analytics, open-source databases, caching systems, and more. Running at 3.4 GHz, the Cobalt 100 processor allocates a dedicated physical core for each virtual CPU (vCPU), ensuring consistent and predictable performance.
 
 To learn more about Cobalt 100, refer to the blog [Announcing the preview of new Azure virtual machine based on the Azure Cobalt 100 processor](https://techcommunity.microsoft.com/blog/azurecompute/announcing-the-preview-of-new-azure-vms-based-on-the-azure-cobalt-100-processor/4146353).
 
@@ -17,4 +17,4 @@ Apache Kafka is a high-performance, open-source distributed event streaming plat
 
 It allows you to publish, subscribe to, store, and process streams of records in a fault-tolerant and scalable manner. Kafka stores data in topics, which are partitioned and replicated across a cluster to ensure durability and high availability.
 
-Kafka is widely used for messaging, log aggregation, event sourcing, real-time analytics, and integrating large-scale data systems. Learn more from the [Apache Kafka official website](https://kafka.apache.org/) and its [official documentation](https://kafka.apache.org/documentation).
+Kafka is widely used for messaging, log aggregation, event sourcing, real-time analytics, and integrating large-scale data systems. Learn more from the [Apache Kafka official website](https://kafka.apache.org/) and the [Apache Kafka documentation](https://kafka.apache.org/documentation).

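The topic model described in this background page (partitioned, replicated logs) is expressed at topic-creation time. A hedged sketch using Kafka's standard `kafka-topics.sh` tool; the topic name and partition count are illustrative, and a single-node lab like this one only supports a replication factor of 1:

```shell
# Illustrative only: partitions and replication are chosen when a topic is created.
KAFKA_HOME=/opt/kafka          # install path used later in this learning path
TOPIC=demo-events              # hypothetical topic name
PARTITIONS=3
REPLICATION=1                  # single-node cluster => replication factor 1

# Guarded so this sketch is a no-op on machines without Kafka installed.
if [ -x "$KAFKA_HOME/bin/kafka-topics.sh" ]; then
  "$KAFKA_HOME/bin/kafka-topics.sh" --create --topic "$TOPIC" \
    --partitions "$PARTITIONS" --replication-factor "$REPLICATION" \
    --bootstrap-server localhost:9092
fi
```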
content/learning-paths/servers-and-cloud-computing/kafka-azure/baseline.md

Lines changed: 65 additions & 48 deletions
@@ -1,5 +1,5 @@
 ---
-title: Baseline Testing
+title: Run baseline testing with Kafka on Azure Arm VM
 weight: 5
 
 ### FIXED, DO NOT MODIFY
@@ -12,60 +12,64 @@ After installing Apache Kafka 4.1.0 on your Azure Cobalt 100 Arm64 virtual machi
 Kafka 4.1.0 introduces KRaft mode (Kafka Raft Metadata mode), which integrates the control and data planes, eliminating the need for ZooKeeper.
 This simplifies deployment, reduces latency, and provides a unified, self-managed Kafka cluster architecture.
 
-To perform this baseline test, you will use four terminal sessions:
-Terminal 1: Start the Kafka broker (in KRaft mode).
-Terminal 2: Create a topic.
-Terminal 3: Send messages (Producer).
-Terminal 4: Read messages (Consumer).
+To run this baseline test, open four terminal sessions:
 
-### Initial Setup: Configure & Format KRaft
-KRaft (Kafka Raft) replaces ZooKeeper by embedding metadata management directly into the Kafka broker.
-This improves scalability, reduces external dependencies, and speeds up controller failover in distributed clusters.
-Before starting Kafka in KRaft mode, configure and initialize the storage directory. These steps are required only once per broker.
+- **Terminal 1:** Start the Kafka broker in KRaft mode.
+- **Terminal 2:** Create a topic.
+- **Terminal 3:** Send messages as the producer.
+- **Terminal 4:** Read messages as the consumer.
 
-1. Edit the Configuration File
-Open the Kafka configuration file in an editor:
+Each terminal has a specific role, helping you verify that Kafka works end-to-end on your Arm64 VM.
+## Configure and format KRaft
 
-```console
-vi /opt/kafka/config/server.properties
-```
+KRaft (Kafka Raft) mode replaces ZooKeeper by managing metadata directly within the Kafka broker. This change improves scalability, reduces external dependencies, and speeds up controller failover in distributed clusters.
 
-2. Add or Modify KRaft Properties
-Ensure the following configuration entries are present for a single-node KRaft setup:
+Before you start Kafka in KRaft mode, you need to configure the broker and initialize the storage directory. You only need to do this once for each broker.
 
-```java
-process.roles=controller,broker
-node.id=1
-controller.quorum.voters=1@localhost:9093
-listeners=PLAINTEXT://:9092,CONTROLLER://:9093
-advertised.listeners=PLAINTEXT://localhost:9092
-log.dirs=/tmp/kraft-combined-logs
-```
-This configuration file sets up a single Kafka server to act as both a controller (managing cluster metadata) and a broker (handling data), running in KRaft mode. It defines the node's unique ID and specifies the local host as the sole participant in the controller quorum.
 
-3. Format the Storage Directory
-Format the metadata storage directory using the kafka-storage.sh tool. This initializes KRaft’s internal Raft logs with a unique cluster ID.
+## Edit the configuration file
+Open the Kafka configuration file in an editor:
 
-```console
-bin/kafka-storage.sh format -t $(bin/kafka-storage.sh random-uuid) -c config/server.properties
-```
-You should see output similar to:
+```console
+vi /opt/kafka/config/server.properties
+```
 
-```output
-Formatting metadata directory /tmp/kraft-combined-logs with metadata.version 4.1-IV1.
-```
-This confirms that the Kafka storage directory has been successfully formatted and that the broker is ready to start in KRaft mode.
+## Add or modify KRaft properties
+Ensure the following configuration entries are present for a single-node KRaft setup:
+
+```java
+process.roles=controller,broker
+node.id=1
+controller.quorum.voters=1@localhost:9093
+listeners=PLAINTEXT://:9092,CONTROLLER://:9093
+advertised.listeners=PLAINTEXT://localhost:9092
+log.dirs=/tmp/kraft-combined-logs
+```
+This configuration file sets up a single Kafka server to act as both a controller (managing cluster metadata) and a broker (handling data), running in KRaft mode. It defines the node's unique ID and specifies the local host as the sole participant in the controller quorum.
+
+## Format the storage directory
+Format the metadata storage directory using the kafka-storage.sh tool. This initializes KRaft’s internal Raft logs with a unique cluster ID.
+
+```console
+bin/kafka-storage.sh format -t $(bin/kafka-storage.sh random-uuid) -c config/server.properties
+```
+You should see output similar to:
+
+```output
+Formatting metadata directory /tmp/kraft-combined-logs with metadata.version 4.1-IV1.
+```
+This confirms that the Kafka storage directory has been successfully formatted and that the broker is ready to start in KRaft mode.
 
-## Perform the Baseline Test
+## Perform the baseline test
 With Kafka 4.1.0 installed and configured in KRaft mode, you’re now ready to run a baseline test to verify that the Kafka broker starts correctly, topics can be created, and message flow works as expected.
 
 You’ll use multiple terminals for this test:
-Terminal 1: Start the Kafka broker.
-Terminal 2: Create and verify a topic.
-Terminal 3: Send messages (Producer).
-Terminal 4: Read messages (Consumer).
+Terminal 1: start the Kafka broker
+Terminal 2: create and verify a topic
+Terminal 3: send messages (Producer)
+Terminal 4: read messages (Consumer)
 
-### Terminal 1 – Start Kafka Broker
+## Terminal 1 - start Kafka broker
 Start the Kafka broker (the main server process responsible for managing topics and handling messages) in KRaft mode:
 
 ```console
@@ -74,7 +78,7 @@ bin/kafka-server-start.sh config/server.properties
 ```
 Keep this terminal open and running. The broker process must stay active for all subsequent commands.
 
-### Terminal 2 – Create a Topic
+## Terminal 2 - create a topic
 Open a new terminal window. Create a topic named test-topic-kafka, which acts as a logical channel where producers send and consumers receive messages:
 
 ```console
@@ -87,8 +91,21 @@ You should see output similar to:
 Created topic test-topic-kafka.
 ```
 
-**Verify Topic Creation**
-List available topics to confirm that your new topic was created successfully:
+## Verify topic creation
+List available topics to confirm that your new topic was created successfully. Run the following command:
+
+```console
+bin/kafka-topics.sh --list --bootstrap-server localhost:9092
+```
+
+The expected output is:
+
+```output
+__consumer_offsets
+test-topic-kafka
+```
+
+If you see `test-topic-kafka` in the list, your topic was created and is ready for use.
 
 ```console
 bin/kafka-topics.sh --list --bootstrap-server localhost:9092
@@ -102,7 +119,7 @@ test-topic-kafka
 Kafka is now running, and you’ve successfully created and verified a topic.
 Next, you’ll use Terminal 3 to produce messages and Terminal 4 to consume messages, completing the baseline functional test on your Arm64 environment.
 
-### Terminal 3 – Console Producer (Write Message)
+## Terminal 3 - console producer (write message)
 In this step, you’ll start the Kafka Producer, which publishes messages to the topic test-topic-kafka. The producer acts as the data source, sending messages to the Kafka broker.
 
 ```console
@@ -117,8 +134,8 @@ hello from azure arm vm
 ```
 Each line you type is sent as a message to the Kafka topic and stored on disk by the broker.
 
-### Terminal 4 – Console Consumer (Read Message)
-Next, open another terminal and start the Kafka Consumer, which subscribes to the same topic (test-topic-kafka) and reads messages from the beginning of the log.
+## Terminal 4 - console consumer (read message)
+Next, open another terminal and start the Kafka Consumer, which subscribes to the same topic (test-topic-kafka) and reads messages from the beginning of the log:
 
 ```console
 cd /opt/kafka

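The interactive producer/consumer walkthrough in this baseline page can also be scripted. A hedged sketch of a non-interactive smoke test, assuming the broker from Terminal 1 is still running and Kafka is installed under `/opt/kafka` as in the steps above:

```shell
# Non-interactive baseline check (assumes a running broker on localhost:9092).
KAFKA_HOME=/opt/kafka
TOPIC=test-topic-kafka

# Guarded so the sketch is a no-op on machines without Kafka installed.
if [ -x "$KAFKA_HOME/bin/kafka-console-producer.sh" ]; then
  # Publish one message, then read one message back from the start of the log.
  echo "hello from azure arm vm" | "$KAFKA_HOME/bin/kafka-console-producer.sh" \
    --topic "$TOPIC" --bootstrap-server localhost:9092
  "$KAFKA_HOME/bin/kafka-console-consumer.sh" --topic "$TOPIC" \
    --from-beginning --bootstrap-server localhost:9092 \
    --max-messages 1 --timeout-ms 10000
fi
```

The `--max-messages` and `--timeout-ms` flags make the consumer exit on its own instead of blocking, which is what lets the check run unattended.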
content/learning-paths/servers-and-cloud-computing/kafka-azure/benchmarking.md

Lines changed: 4 additions & 7 deletions
@@ -1,5 +1,5 @@
 ---
-title: Benchmarking with Official Kafka Tools
+title: Benchmark with official Kafka tools
 weight: 6
 
 ### FIXED, DO NOT MODIFY
@@ -13,7 +13,7 @@ Apache Kafka includes official performance testing utilities that allow you to m
 ## Steps for Kafka Benchmarking
 
 Before running the benchmarks, make sure your Kafka broker is already active in a separate terminal (as configured in the previous section).
-Now open two new terminal sessions, one for running the producer benchmark and another for the consumer benchmark.
+Now open two new terminal sessions: one for running the producer benchmark and the other for the consumer benchmark.
 
 ### Terminal A - Producer Benchmark
 
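Kafka ships `kafka-producer-perf-test.sh` for this purpose. The exact flags used in this learning path are not shown in this diff, so the invocation below is an illustrative sketch with assumed values (one million 100-byte records, unthrottled):

```shell
KAFKA_HOME=/opt/kafka
NUM_RECORDS=1000000
RECORD_SIZE=100   # bytes per record (assumed; not shown in this diff)

# Guarded so the sketch is a no-op on machines without Kafka installed.
if [ -x "$KAFKA_HOME/bin/kafka-producer-perf-test.sh" ]; then
  "$KAFKA_HOME/bin/kafka-producer-perf-test.sh" \
    --topic test-topic-kafka \
    --num-records "$NUM_RECORDS" \
    --record-size "$RECORD_SIZE" \
    --throughput -1 \
    --producer-props bootstrap.servers=localhost:9092
fi

# Total payload such a run would push through the broker, in MiB.
echo "payload: $(( NUM_RECORDS * RECORD_SIZE / 1024 / 1024 )) MiB"
```

Setting `--throughput -1` removes the rate limit, so the tool reports the maximum sustained records/sec the producer path can achieve.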
@@ -107,9 +107,6 @@ The producer sustained a throughput of ~257,500 records/sec (~24.5 MB/sec) with
 The 95th percentile latency (1168 ms) and 99th percentile (1220 ms) show predictable network and I/O performance.
 Kafka maintained consistent throughput, even under full-speed production, with no message loss or broker errors reported.
 
-### Benchmark Comparison Insights
-When analyzing performance on Azure Cobalt 100 Arm64 virtual machines:
-**Producer efficiency**: The producer reached ~23–25 MB/sec throughput with average latencies below 900 ms, demonstrating stable delivery rates for high-volume workloads.
-**Consumer scalability**: The consumer maintained ~262K messages/sec throughput with near-linear scaling of fetch performance — exceeding 1.85M messages/sec internally.
-**Performance stability**: Both producer and consumer benchmarks showed low jitter and consistent latency distribution across iterations, confirming Kafka’s predictable behavior on Arm-based VMs.
+### Benchmark comparison insights
+When analyzing performance on Azure Cobalt 100 Arm64 virtual machines, you’ll notice that Kafka delivers stable and predictable results for both producers and consumers. The producer consistently achieves throughput between 23 MB/sec and 25 MB/sec, with average latencies below 900 ms. This means you can rely on efficient message delivery, even when handling high-volume workloads. On the consumer side, throughput remains strong at around 262,000 messages per second, and fetch performance scales nearly linearly, often exceeding 1.85 million messages per second internally. Throughout multiple benchmark runs, both producer and consumer tests demonstrate low jitter and consistent latency distribution, confirming that Kafka maintains reliable performance on Arm-based virtual machines.
 
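The headline numbers in the benchmark summary are mutually consistent: at an assumed record size of 100 bytes (the record-size flag is not shown in this diff), ~257,500 records/sec works out to roughly the reported ~24.5 MB/sec:

```shell
# Cross-check records/sec against MB/sec (the record size is an assumption).
records_per_sec=257500
record_size_bytes=100
awk -v r="$records_per_sec" -v s="$record_size_bytes" \
  'BEGIN { printf "%.1f MiB/sec\n", r * s / (1024 * 1024) }'
# prints "24.6 MiB/sec"
```

The same arithmetic, run against your own benchmark output, is a quick way to confirm that the reported throughput and record size agree.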