---
title: Deploy Apache Kafka on Arm-based Microsoft Azure Cobalt 100 virtual machines

minutes_to_complete: 30

who_is_this_for: This is an advanced topic for developers looking to migrate their Apache Kafka workloads from x86_64 to Arm-based platforms, specifically on Microsoft Azure Cobalt 100 (arm64) virtual machines.

learning_objectives:
- Provision an Azure Arm64 virtual machine using Azure console, with Ubuntu Pro 24.04 LTS as the base image
- Deploy Kafka on an Ubuntu virtual machine
- Perform Kafka baseline testing and benchmarking on Arm64 virtual machines

prerequisites:
- A [Microsoft Azure](https://azure.microsoft.com/) account with access to Cobalt 100 based instances (Dpsv6)
- Basic understanding of Linux command line
- Familiarity with the [Apache Kafka architecture](https://kafka.apache.org/) and deployment practices on Arm64 platforms

author: Pareena Verma

---

## Cobalt 100 Arm-based processor

Azure’s Cobalt 100 is built on Microsoft's first-generation, in-house Arm-based processor: the Cobalt 100. Designed entirely by Microsoft and based on Arm’s Neoverse N2 architecture, this 64-bit CPU delivers improved performance and energy efficiency across a broad spectrum of cloud-native, scale-out Linux workloads. These include web and application servers, data analytics, open-source databases, caching systems, and more. Running at 3.4 GHz, the Cobalt 100 processor allocates a dedicated physical core for each virtual CPU (vCPU), ensuring consistent and predictable performance.
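
Before installing anything, you can confirm that your VM is really running on Arm64. This is an optional check using standard Linux commands, assuming you have an SSH session open on the VM:

```console
uname -m
```

On a Cobalt 100 instance the expected output is:

```output
aarch64
```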

To learn more about Cobalt 100, refer to the blog [Announcing the preview of new Azure virtual machine based on the Azure Cobalt 100 processor](https://techcommunity.microsoft.com/blog/azurecompute/announcing-the-preview-of-new-azure-vms-based-on-the-azure-cobalt-100-processor/4146353).

Apache Kafka is a high-performance, open-source distributed event streaming platform.

It allows you to publish, subscribe to, store, and process streams of records in a fault-tolerant and scalable manner. Kafka stores data in topics, which are partitioned and replicated across a cluster to ensure durability and high availability.

Kafka is widely used for messaging, log aggregation, event sourcing, real-time analytics, and integrating large-scale data systems. Learn more from the [Apache Kafka official website](https://kafka.apache.org/) and the [Apache Kafka documentation](https://kafka.apache.org/documentation).
---
title: Run baseline testing with Kafka on Azure Arm VM
weight: 5

### FIXED, DO NOT MODIFY
layout: "learningpathall"
---

After installing Apache Kafka 4.1.0 on your Azure Cobalt 100 Arm64 virtual machine, you can run a quick baseline test to verify that the installation works correctly.
Kafka 4.1.0 introduces KRaft mode (Kafka Raft Metadata mode), which integrates the control and data planes, eliminating the need for ZooKeeper.
This simplifies deployment, reduces latency, and provides a unified, self-managed Kafka cluster architecture.

To run this baseline test, open four terminal sessions:

- **Terminal 1:** Start the Kafka broker in KRaft mode.
- **Terminal 2:** Create a topic.
- **Terminal 3:** Send messages as the producer.
- **Terminal 4:** Read messages as the consumer.

Each terminal has a specific role, helping you verify that Kafka works end-to-end on your Arm64 VM.

## Configure and format KRaft

KRaft (Kafka Raft) mode replaces ZooKeeper by managing metadata directly within the Kafka broker. This change improves scalability, reduces external dependencies, and speeds up controller failover in distributed clusters.

Before you start Kafka in KRaft mode, you need to configure the broker and initialize the storage directory. You only need to do this once for each broker.

## Edit the configuration file
Open the Kafka configuration file in an editor:

```console
vi /opt/kafka/config/server.properties
```

## Add or modify KRaft properties
Ensure the following configuration entries are present for a single-node KRaft setup:

```java
process.roles=controller,broker
node.id=1
controller.quorum.voters=1@localhost:9093
listeners=PLAINTEXT://:9092,CONTROLLER://:9093
advertised.listeners=PLAINTEXT://localhost:9092
log.dirs=/tmp/kraft-combined-logs
```
This configuration file sets up a single Kafka server to act as both a controller (managing cluster metadata) and a broker (handling data), running in KRaft mode. It defines the node's unique ID and specifies the local host as the sole participant in the controller quorum.
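
To confirm the entries are in place after editing, you can filter the file for them. This optional check assumes the /opt/kafka installation path used throughout this Learning Path:

```console
grep -E '^(process\.roles|node\.id|controller\.quorum\.voters|listeners|advertised\.listeners|log\.dirs)' /opt/kafka/config/server.properties
```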

## Format the storage directory
Format the metadata storage directory using the kafka-storage.sh tool. This initializes KRaft’s internal Raft logs with a unique cluster ID.

```console
bin/kafka-storage.sh format -t $(bin/kafka-storage.sh random-uuid) -c config/server.properties
```
You should see output similar to:

```output
Formatting metadata directory /tmp/kraft-combined-logs with metadata.version 4.1-IV1.
```
This confirms that the Kafka storage directory has been successfully formatted and that the broker is ready to start in KRaft mode.
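
As an optional sanity check, the formatter also writes a meta.properties file containing the generated cluster ID into the configured log directory. Assuming the log.dirs value shown above, you can inspect it with:

```console
cat /tmp/kraft-combined-logs/meta.properties
```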

## Perform the baseline test
With Kafka 4.1.0 installed and configured in KRaft mode, you’re now ready to run a baseline test to verify that the Kafka broker starts correctly, topics can be created, and message flow works as expected.

You’ll use multiple terminals for this test:
- Terminal 1: start the Kafka broker
- Terminal 2: create and verify a topic
- Terminal 3: send messages (producer)
- Terminal 4: read messages (consumer)

## Terminal 1 - start Kafka broker
Start the Kafka broker (the main server process responsible for managing topics and handling messages) in KRaft mode:

```console
cd /opt/kafka
bin/kafka-server-start.sh config/server.properties
```
Keep this terminal open and running. The broker process must stay active for all subsequent commands.
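
Optionally, from another terminal, you can confirm that the broker is listening on the client and controller ports configured earlier (9092 and 9093):

```console
ss -ltn | grep -E '9092|9093'
```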

## Terminal 2 - create a topic
Open a new terminal window. Create a topic named test-topic-kafka, which acts as a logical channel where producers send and consumers receive messages:

```console
cd /opt/kafka
bin/kafka-topics.sh --create --topic test-topic-kafka --bootstrap-server localhost:9092
```
You should see output similar to:
```output
Created topic test-topic-kafka.
```

## Verify topic creation
List available topics to confirm that your new topic was created successfully. Run the following command:

```console
bin/kafka-topics.sh --list --bootstrap-server localhost:9092
```

The expected output is:

```output
__consumer_offsets
test-topic-kafka
```

If you see `test-topic-kafka` in the list, your topic was created and is ready for use.
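
You can also view the partition count, replication factor, and leader assignment for the topic. This optional check uses the same kafka-topics.sh tool:

```console
bin/kafka-topics.sh --describe --topic test-topic-kafka --bootstrap-server localhost:9092
```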

Kafka is now running, and you’ve successfully created and verified a topic.
Next, you’ll use Terminal 3 to produce messages and Terminal 4 to consume messages, completing the baseline functional test on your Arm64 environment.

## Terminal 3 - console producer (write message)
In this step, you’ll start the Kafka Producer, which publishes messages to the topic test-topic-kafka. The producer acts as the data source, sending messages to the Kafka broker.

```console
cd /opt/kafka
bin/kafka-console-producer.sh --topic test-topic-kafka --bootstrap-server localhost:9092
hello from azure arm vm
```
Each line you type is sent as a message to the Kafka topic and stored on disk by the broker.
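
If you prefer a non-interactive test, you can pipe a batch of generated messages into the producer instead of typing them. This sketch assumes the same topic and broker address:

```console
seq 1 100 | sed 's/^/test-message-/' | bin/kafka-console-producer.sh --topic test-topic-kafka --bootstrap-server localhost:9092
```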

## Terminal 4 - console consumer (read message)
Next, open another terminal and start the Kafka Consumer, which subscribes to the same topic (test-topic-kafka) and reads messages from the beginning of the log:

```console
cd /opt/kafka
bin/kafka-console-consumer.sh --topic test-topic-kafka --from-beginning --bootstrap-server localhost:9092
```

You should see the messages you typed in the producer terminal, confirming end-to-end message flow on your Arm64 VM.
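
Press Ctrl+C to stop the consumer when you are done. For scripted verification, you can also read a fixed number of messages and exit automatically using the --max-messages option:

```console
bin/kafka-console-consumer.sh --topic test-topic-kafka --from-beginning --bootstrap-server localhost:9092 --max-messages 5
```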
---
title: Benchmark with official Kafka tools
weight: 6

### FIXED, DO NOT MODIFY
layout: "learningpathall"
---

Apache Kafka includes official performance testing utilities that allow you to measure producer and consumer throughput and latency directly from the command line.
## Steps for Kafka Benchmarking

Before running the benchmarks, make sure your Kafka broker is already active in a separate terminal (as configured in the previous section).
Now open two new terminal sessions: one for the producer benchmark and one for the consumer benchmark.

### Terminal A - Producer Benchmark
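
The producer benchmark uses Kafka's bundled kafka-producer-perf-test.sh tool. The record count, record size, and throughput settings below are illustrative values, not necessarily the exact parameters behind the results that follow:

```console
bin/kafka-producer-perf-test.sh --topic test-topic-kafka --num-records 1000000 --record-size 100 --throughput -1 --producer-props bootstrap.servers=localhost:9092
```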

The producer sustained a throughput of ~257,500 records/sec (~24.5 MB/sec).
The 95th percentile latency (1168 ms) and 99th percentile (1220 ms) show predictable network and I/O performance.
Kafka maintained consistent throughput, even under full-speed production, with no message loss or broker errors reported.
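
The consumer side is measured with the matching kafka-consumer-perf-test.sh tool; again, the message count here is illustrative:

```console
bin/kafka-consumer-perf-test.sh --topic test-topic-kafka --bootstrap-server localhost:9092 --messages 1000000
```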

### Benchmark comparison insights
When analyzing performance on Azure Cobalt 100 Arm64 virtual machines, you’ll notice that Kafka delivers stable and predictable results for both producers and consumers. The producer consistently achieves throughput between 23 MB/sec and 25 MB/sec, with average latencies below 900 ms. This means you can rely on efficient message delivery, even when handling high-volume workloads. On the consumer side, throughput remains strong at around 262,000 messages per second, and fetch performance scales nearly linearly, often exceeding 1.85 million messages per second internally. Throughout multiple benchmark runs, both producer and consumer tests demonstrate low jitter and consistent latency distribution, confirming that Kafka maintains reliable performance on Arm-based virtual machines.
