Commit 5528327

committed
kafka_tutorial
1 parent cadbfaf commit 5528327

2 files changed

+204
-0
lines changed

articles/hdinsight/TOC.yml

Lines changed: 2 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -550,6 +550,8 @@
550550
href: ./kafka/apache-kafka-quickstart-resource-manager-template.md
551551
- name: Tutorials
552552
items:
553+
- name: Create Apache Kafka cluster - Azure CLI
554+
href: ./kafka/tutorial-cli-rest-proxy.md
553555
- name: Use Apache Kafka Producer and Consumer API
554556
href: ./kafka/apache-kafka-producer-consumer-api.md
555557
- name: Develop app with Apache Kafka Streams API
Lines changed: 202 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,202 @@
1+
---
2+
title: 'Tutorial: Create an Apache Kafka REST proxy enabled cluster in HDInsight using Azure CLI'
3+
description: Learn how to perform Apache Kafka operations using a Kafka REST proxy on Azure HDInsight.
4+
author: hrasheed-msft
5+
ms.author: hrasheed
6+
ms.reviewer: hrasheed
7+
ms.service: hdinsight
8+
ms.topic: tutorial
9+
ms.date: 02/27/2020
10+
---
11+
12+
# Tutorial: Create an Apache Kafka REST proxy enabled cluster in HDInsight using Azure CLI
13+
14+
In this tutorial, you learn how to create an Apache Kafka [REST proxy enabled](./rest-proxy.md) cluster in Azure HDInsight using the Azure command-line interface (CLI). Azure HDInsight is a managed, full-spectrum, open-source analytics service for enterprises. Apache Kafka is an open-source, distributed streaming platform. It's often used as a message broker, as it provides functionality similar to a publish-subscribe message queue. The Kafka REST proxy enables you to interact with your Kafka cluster via a [REST API](https://docs.microsoft.com/rest/api/hdinsight-kafka-rest-proxy/) over HTTP. The Azure CLI is Microsoft's cross-platform command-line experience for managing Azure resources.
15+
16+
The Apache Kafka API can only be accessed by resources inside the same virtual network. You can access the cluster directly using SSH. To connect other services, networks, or virtual machines to Apache Kafka, you must first create a virtual network and then create the resources within the network. For more information, see [Connect to Apache Kafka using a virtual network](./apache-kafka-connect-vpn-gateway.md).
17+
18+
In this tutorial, you learn:
19+
20+
> [!div class="checklist"]
21+
> * Prerequisites for Kafka REST proxy
22+
> * Create an Apache Kafka cluster using Azure CLI
23+
24+
If you don’t have an Azure subscription, create a [free account](https://azure.microsoft.com/free/?WT.mc_id=A261C142F) before you begin.
25+
26+
## Prerequisites
27+
28+
* An application registered with Azure AD. The client applications that you write to interact with the Kafka REST proxy will use this application's ID and secret to authenticate to Azure. For more information, see [Register an application with the Microsoft identity platform](../../active-directory/develop/quickstart-register-app.md).
29+
30+
* An Azure AD security group with your registered application as a member. This security group will be used to control which applications are allowed to interact with the REST proxy. For more information on creating Azure AD groups, see [Create a basic group and add members using Azure Active Directory](../../active-directory/fundamentals/active-directory-groups-create-azure-portal.md).
31+
32+
* Azure CLI. Ensure you have at least version 2.0.79. See [Install the Azure CLI](https://docs.microsoft.com/cli/azure/install-azure-cli).
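
The version requirement above can be checked from a Bash prompt. The following is a minimal sketch, not part of the original tutorial: the helper name `version_ge` is hypothetical, it relies on `sort -V` from GNU coreutils, and it assumes that the first line of `az --version` output carries the core CLI version.

```bash
#!/usr/bin/env bash
# Returns success (0) when version $1 is greater than or equal to version $2.
# Relies on `sort -V` (version sort), which is available in GNU coreutils.
version_ge() {
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

requiredVersion="2.0.79"   # minimum version stated in the prerequisites

if command -v az >/dev/null 2>&1; then
    # Assumption: the first line of `az --version` contains the core CLI version.
    installedVersion=$(az --version 2>/dev/null | head -n 1 | grep -oE '[0-9]+(\.[0-9]+)+' | head -n 1)
    if version_ge "$installedVersion" "$requiredVersion"; then
        echo "Azure CLI $installedVersion meets the minimum ($requiredVersion)."
    else
        echo "Azure CLI '$installedVersion' is older than $requiredVersion; consider upgrading."
    fi
else
    echo "Azure CLI not found on PATH."
fi
```

Comparing with a version sort rather than a plain string comparison matters here because, for example, `2.10.0` is newer than `2.0.79` even though it sorts earlier as text.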
33+
34+
## Create an Apache Kafka cluster
35+
36+
1. Sign in to your Azure subscription.
37+
38+
    ```azurecli
    az login

    # If you have multiple subscriptions, set the one to use
    # az account set --subscription "SUBSCRIPTIONID"
    ```
44+
45+
1. Set environment variables. The use of variables in this tutorial is based on Bash. Slight variations will be needed for other environments.
46+
47+
    |Variable | Description |
    |---|---|
    |resourceGroupName|Replace RESOURCEGROUPNAME with the name for your new resource group.|
    |location|Replace LOCATION with a region where the cluster will be created. For a list of valid locations, use the `az account list-locations` command.|
    |clusterName|Replace CLUSTERNAME with a globally unique name for your new cluster.|
    |storageAccount|Replace STORAGEACCOUNTNAME with a name for your new storage account.|
    |httpPassword|Replace PASSWORD with a password for the cluster login, **admin**.|
    |sshPassword|Replace PASSWORD with a password for the secure shell username, **sshuser**.|
    |securityGroupName|Replace SECURITYGROUPNAME with the name of the client Azure AD security group for the Kafka REST proxy. The variable will be passed to the `--kafka-client-group-name` parameter of `az hdinsight create`.|
    |securityGroupID|Replace SECURITYGROUPID with the ID of the client Azure AD security group for the Kafka REST proxy. The variable will be passed to the `--kafka-client-group-id` parameter of `az hdinsight create`.|
    |storageContainer|The storage container the cluster will use; leave as-is for this tutorial. This variable is set to the lowercase form of the cluster name.|
    |workernodeCount|The number of worker nodes in the cluster; leave as-is for this tutorial. To guarantee high availability, Kafka requires a minimum of 3 worker nodes.|
    |clusterType|The type of HDInsight cluster; leave as-is for this tutorial.|
    |clusterVersion|The HDInsight cluster version; leave as-is for this tutorial. The Kafka REST proxy requires a minimum cluster version of 4.0.|
    |componentVersion|The Kafka version; leave as-is for this tutorial. The Kafka REST proxy requires a minimum component version of 2.1.|
62+
63+
Update the variables with desired values. Then enter the CLI commands to set the environment variables.
64+
65+
    ```azurecli
    export resourceGroupName=RESOURCEGROUPNAME
    export location=LOCATION
    export clusterName=CLUSTERNAME
    export storageAccount=STORAGEACCOUNTNAME
    export httpPassword='PASSWORD'
    export sshPassword='PASSWORD'
    export securityGroupName=SECURITYGROUPNAME
    export securityGroupID=SECURITYGROUPID

    # Container names must be lowercase, so derive one from the cluster name
    export storageContainer=$(echo "$clusterName" | tr "[:upper:]" "[:lower:]")
    export workernodeCount=3
    export clusterType=kafka
    export clusterVersion=4.0
    export componentVersion=kafka=2.1
    ```
81+
82+
1. [Create the resource group](https://docs.microsoft.com/cli/azure/group?view=azure-cli-latest#az-group-create) by entering the command below:
83+
84+
    ```azurecli
    az group create \
        --location $location \
        --name $resourceGroupName
    ```
89+
90+
1. [Create an Azure Storage account](https://docs.microsoft.com/cli/azure/storage/account?view=azure-cli-latest#az-storage-account-create) by entering the command below:
91+
92+
    ```azurecli
    # Note: kind BlobStorage can't be used for the cluster's default storage account.
    az storage account create \
        --name $storageAccount \
        --resource-group $resourceGroupName \
        --https-only true \
        --kind StorageV2 \
        --location $location \
        --sku Standard_LRS
    ```
102+
103+
1. [Extract the primary key](https://docs.microsoft.com/cli/azure/storage/account/keys?view=azure-cli-latest#az-storage-account-keys-list) from the Azure Storage account and store it in a variable by entering the command below:
104+
105+
    ```azurecli
    # Quote the JMESPath query so the shell doesn't treat [0] as a glob pattern
    export storageAccountKey=$(az storage account keys list \
        --account-name $storageAccount \
        --resource-group $resourceGroupName \
        --query "[0].value" -o tsv)
    ```
111+
112+
1. [Create an Azure Storage container](https://docs.microsoft.com/cli/azure/storage/container?view=azure-cli-latest#az-storage-container-create) by entering the command below:
113+
114+
    ```azurecli
    az storage container create \
        --name $storageContainer \
        --account-key $storageAccountKey \
        --account-name $storageAccount
    ```
120+
121+
1. [Create the HDInsight cluster](https://docs.microsoft.com/cli/azure/hdinsight?view=azure-cli-latest#az-hdinsight-create). Before entering the command, note the following parameters:
122+
123+
1. Required parameters for Kafka clusters:
124+
125+
        |Parameter | Description|
        |---|---|
        |--type|The value must be **Kafka**.|
        |--workernode-data-disks-per-node|The number of data disks to use per worker node. HDInsight Kafka is only supported with data disks. This tutorial uses a value of **2**.|
129+
130+
1. Required parameters for Kafka REST proxy:
131+
132+
        |Parameter | Description|
        |---|---|
        |--kafka-management-node-size|The size of the Kafka management node. This tutorial uses the value **Standard_D4_v2**.|
        |--kafka-client-group-id|The ID of the client Azure AD security group for the Kafka REST proxy. The value is passed from the variable **$securityGroupID**.|
        |--kafka-client-group-name|The name of the client Azure AD security group for the Kafka REST proxy. The value is passed from the variable **$securityGroupName**.|
        |--version|The HDInsight cluster version must be at least 4.0. The value is passed from the variable **$clusterVersion**.|
        |--component-version|The Kafka version must be at least 2.1. The value is passed from the variable **$componentVersion**.|
139+
140+
If you would like to create the cluster without REST proxy, eliminate `--kafka-management-node-size`, `--kafka-client-group-id`, and `--kafka-client-group-name` from the `az hdinsight create` command.
141+
142+
1. If you have an existing virtual network, add the parameters `--vnet-name` and `--subnet`, and their values.
143+
144+
Enter the following command to create the cluster:
145+
146+
    ```azurecli
    # Passwords are quoted so that special characters aren't split or expanded by the shell
    az hdinsight create \
        --name $clusterName \
        --resource-group $resourceGroupName \
        --type $clusterType \
        --component-version $componentVersion \
        --http-password "$httpPassword" \
        --http-user admin \
        --location $location \
        --ssh-password "$sshPassword" \
        --ssh-user sshuser \
        --storage-account $storageAccount \
        --storage-account-key $storageAccountKey \
        --storage-container $storageContainer \
        --version $clusterVersion \
        --workernode-count $workernodeCount \
        --workernode-data-disks-per-node 2 \
        --kafka-management-node-size "Standard_D4_v2" \
        --kafka-client-group-id $securityGroupID \
        --kafka-client-group-name "$securityGroupName"
    ```
167+
168+
    Cluster creation may take several minutes to complete, usually around 15.
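
Because creation takes a while, it can be convenient to poll until the cluster is ready. The following is a rough sketch, not part of the original tutorial: `poll_until` and `get_cluster_state` are hypothetical helper names, and the sketch assumes that `az hdinsight show` exposes the cluster state at `properties.clusterState` and reports `Running` once the cluster is ready.

```bash
#!/usr/bin/env bash
# poll_until EXPECTED INTERVAL MAX_TRIES COMMAND [ARGS...]
# Runs COMMAND repeatedly until its output equals EXPECTED.
# Returns 0 on a match, 1 if MAX_TRIES attempts pass without one.
poll_until() {
    local expected=$1 interval=$2 maxTries=$3
    shift 3
    local try current
    for ((try = 1; try <= maxTries; try++)); do
        current=$("$@" 2>/dev/null)
        echo "Attempt $try: state is '${current:-unknown}'" >&2
        if [ "$current" = "$expected" ]; then
            return 0
        fi
        sleep "$interval"
    done
    return 1
}

# Hypothetical helper. Assumption: the state is at properties.clusterState.
get_cluster_state() {
    az hdinsight show \
        --name "$clusterName" \
        --resource-group "$resourceGroupName" \
        --query "properties.clusterState" -o tsv
}

# Usage (check once a minute, for up to 30 minutes):
# poll_until Running 60 30 get_cluster_state && echo "Cluster is ready."
```

Keeping the polling loop separate from the `az` call means the same loop can be reused for other long-running resources by swapping in a different state command.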
169+
170+
## Clean up resources
171+
172+
After you complete the tutorial, you may want to delete the cluster. With HDInsight, your data is stored in Azure Storage, so you can safely delete a cluster when it isn't in use. You're charged for an HDInsight cluster even when it isn't in use, and the charges for the cluster are many times more than the charges for storage, so it makes economic sense to delete clusters you aren't using.
173+
174+
Enter all or some of the following commands to remove resources:
175+
176+
```azurecli
# Remove cluster
az hdinsight delete \
    --name $clusterName \
    --resource-group $resourceGroupName

# Remove storage container
az storage container delete \
    --account-name $storageAccount \
    --name $storageContainer

# Remove storage account
az storage account delete \
    --name $storageAccount \
    --resource-group $resourceGroupName

# Remove resource group
az group delete \
    --name $resourceGroupName
```
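
If the resource group was created only for this tutorial, deleting the group removes everything in it, including the cluster and the storage account. The following sketch, with the hypothetical helpers `run` and `delete_tutorial_resource_group`, wraps that single command in a dry-run guard so nothing is deleted by accident. It uses the `--yes` and `--no-wait` options of `az group delete` to skip the confirmation prompt and return without waiting for the deletion to finish.

```bash
#!/usr/bin/env bash
# By default the command is only printed. Set DRY_RUN=0 to actually run it.
DRY_RUN=${DRY_RUN:-1}

# Prints the command in dry-run mode; otherwise executes it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "[dry-run] $*"
    else
        "$@"
    fi
}

# Hypothetical helper: removes the resource group and everything it contains.
delete_tutorial_resource_group() {
    run az group delete --name "$1" --yes --no-wait
}

delete_tutorial_resource_group "${resourceGroupName:-RESOURCEGROUPNAME}"
```

Run it once as written to review the command, then rerun with `DRY_RUN=0` once you're sure the group holds nothing you want to keep.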
196+
197+
## Next steps
198+
199+
Now that you've successfully created an Apache Kafka REST proxy enabled cluster in Azure HDInsight using Azure CLI, use Python code to interact with the REST proxy:
200+
201+
> [!div class="nextstepaction"]
202+
> [Create sample application](./rest-proxy.md#client-application-sample)
