Commit eec80e3

Merge pull request #79753 from dagiro/mvc20
mvc20
2 parents 2e962d9 + a92106f commit eec80e3

4 files changed

+114
-0
lines changed

articles/hdinsight/storm/TOC.yml

Lines changed: 5 additions & 0 deletions
@@ -4,6 +4,11 @@
   items:
   - name: What is Apache Storm in HDInsight?
     href: apache-storm-overview.md
+- name: Quickstarts
+  expanded: true
+  items:
+  - name: Create Apache Storm topology in HDInsight
+    href: ./apache-storm-quickstart.md
 - name: Get started
   items:
   - name: Create an Apache Storm cluster
Lines changed: 109 additions & 0 deletions
@@ -0,0 +1,109 @@
---
title: 'Create and monitor Apache Storm topology in Azure HDInsight'
description: In this quickstart, learn how to create and monitor an Apache Storm topology in Azure HDInsight.
author: hrasheed-msft
ms.reviewer: jasonh

ms.service: hdinsight
ms.topic: quickstart
ms.date: 06/14/2019
ms.author: hrasheed
ms.custom: mvc

#Customer intent: I want to learn how to create Apache Storm topologies and deploy them to a Storm cluster in Azure HDInsight.
---

# Quickstart: Create and monitor an Apache Storm topology in Azure HDInsight

Apache Storm is a scalable, fault-tolerant, distributed, real-time computation system for processing streams of data. With Storm on Azure HDInsight, you can create a cloud-based Storm cluster that performs big data analytics in real time.

In this quickstart, you use an example from the Apache [storm-starter](https://github.com/apache/storm/tree/v2.0.0/examples/storm-starter) project to create and monitor an Apache Storm topology on an existing Apache Storm cluster.

## Prerequisites

* An Apache Storm cluster on HDInsight. See [Create Apache Hadoop clusters using the Azure portal](../hdinsight-hadoop-create-linux-clusters-portal.md) and select **Storm** for **Cluster type**.

* An SSH client. For more information, see [Connect to HDInsight (Apache Hadoop) using SSH](../hdinsight-hadoop-linux-use-ssh-unix.md).

## Create the topology

1. Connect to your Storm cluster. Edit the command below by replacing `CLUSTERNAME` with the name of your Storm cluster, and then enter the command:

    ```cmd
    ssh sshuser@CLUSTERNAME-ssh.azurehdinsight.net
    ```

2. The **WordCount** example is included on your HDInsight cluster at `/usr/hdp/current/storm-client/contrib/storm-starter/`. The topology generates random sentences and counts how many times each word occurs. Use the following command to start the **wordcount** topology on the cluster:

    ```bash
    storm jar /usr/hdp/current/storm-client/contrib/storm-starter/storm-starter-topologies-*.jar org.apache.storm.starter.WordCountTopology wordcount
    ```
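
To confirm that the submission worked, you can list the running topologies from the same SSH session with `storm list`. The following is a minimal sketch rather than part of the original quickstart: `topology_is_listed` is a hypothetical helper name, and it assumes that each topology row in the `storm list` output begins with the topology name.

```shell
#!/bin/sh
# Hypothetical helper (not part of Storm): succeeds when any line read from
# stdin has NAME as its first whitespace-separated field.
topology_is_listed() {
    awk -v name="$1" '$1 == name { found = 1 } END { exit (found ? 0 : 1) }'
}

# Call the Storm CLI only where it is installed, for example on the head node.
if command -v storm >/dev/null 2>&1; then
    if storm list | topology_is_listed wordcount; then
        echo "wordcount is running"
    else
        echo "wordcount was not found in the storm list output"
    fi
fi
```

Matching on the first field keeps the check independent of the other columns, which may differ between Storm versions.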

## Monitor the topology

Storm provides a web interface for working with running topologies. The interface is included on your HDInsight cluster.

Use the following steps to monitor the topology using the Storm UI:

1. To display the Storm UI, open a web browser to `https://CLUSTERNAME.azurehdinsight.net/stormui`. Replace `CLUSTERNAME` with the name of your cluster.

2. Under **Topology Summary**, select the **wordcount** entry in the **Name** column. Information about the topology is displayed.

    ![Storm Dashboard with storm-starter WordCount topology information.](./media/apache-storm-quickstart/topology-summary.png)

    The new page provides the following information:

    |Property | Description |
    |---|---|
    |Topology stats|Basic information on the topology performance, organized into time windows. Selecting a specific time window changes the time window for information displayed in other sections of the page.|
    |Spouts|Basic information about spouts, including the last error returned by each spout.|
    |Bolts|Basic information about bolts.|
    |Topology configuration|Detailed information about the topology configuration.|
    |Activate|Resumes processing of a deactivated topology.|
    |Deactivate|Pauses a running topology.|
    |Rebalance|Adjusts the parallelism of the topology. Rebalance running topologies after you change the number of nodes in the cluster, so that parallelism compensates for the increased or decreased number of nodes. For more information, see [Understanding the parallelism of an Apache Storm topology](https://storm.apache.org/documentation/Understanding-the-parallelism-of-a-Storm-topology.html).|
    |Kill|Terminates a Storm topology after the specified timeout.|

3. From this page, select an entry from the **Spouts** or **Bolts** section. Information about the selected component is displayed.

    ![Storm Dashboard with information about selected components.](./media/apache-storm-quickstart/component-summary.png)

    The new page displays the following information:

    |Property | Description |
    |---|---|
    |Spout/Bolt stats|Basic information on the component performance, organized into time windows. Selecting a specific time window changes the time window for information displayed in other sections of the page.|
    |Input stats (bolt only)|Information on components that produce data consumed by the bolt.|
    |Output stats|Information on data emitted by this component.|
    |Executors|Information on instances of this component.|
    |Errors|Errors produced by this component.|

4. When viewing the details of a spout or bolt, select an entry from the **Port** column in the **Executors** section to view details for a specific instance of the component.

        2015-01-27 14:18:02 b.s.d.task [INFO] Emitting: split default ["with"]
        2015-01-27 14:18:02 b.s.d.task [INFO] Emitting: split default ["nature"]
        2015-01-27 14:18:02 b.s.d.executor [INFO] Processing received message source: split:21, stream: default, id: {}, [snow]
        2015-01-27 14:18:02 b.s.d.task [INFO] Emitting: count default [snow, 747293]
        2015-01-27 14:18:02 b.s.d.executor [INFO] Processing received message source: split:21, stream: default, id: {}, [white]
        2015-01-27 14:18:02 b.s.d.task [INFO] Emitting: count default [white, 747293]
        2015-01-27 14:18:02 b.s.d.executor [INFO] Processing received message source: split:21, stream: default, id: {}, [seven]
        2015-01-27 14:18:02 b.s.d.task [INFO] Emitting: count default [seven, 1493957]

    In this example, the word **seven** has occurred 1493957 times. This count is how many times the word has been encountered since this topology was started.
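
If you copy log text like the sample above into a file or a pipe, the per-word counts can be pulled out with standard shell tools. The sketch below is illustrative only: `extract_counts` is a hypothetical helper name, and the pattern assumes the exact `Emitting: count default [word, N]` layout shown in the sample, which may differ in other Storm versions.

```shell
#!/bin/sh
# Hypothetical helper: read log lines on stdin and print "word count" for
# every line where the count bolt emits a tuple. Other lines are ignored.
extract_counts() {
    sed -n 's/.*Emitting: count default \[\([^,]*\), \([0-9][0-9]*\)].*/\1 \2/p'
}

# Example input taken from the sample log lines in this quickstart.
extract_counts <<'EOF'
2015-01-27 14:18:02 b.s.d.task [INFO] Emitting: split default ["nature"]
2015-01-27 14:18:02 b.s.d.task [INFO] Emitting: count default [snow, 747293]
2015-01-27 14:18:02 b.s.d.task [INFO] Emitting: count default [seven, 1493957]
EOF
```

For the three sample lines, this prints `snow 747293` and `seven 1493957`; the `split` line is skipped because it does not come from the count bolt.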

## Stop the topology

Return to the **Topology summary** page for the **wordcount** topology, and then select the **Kill** button from the **Topology actions** section. When prompted, enter `10` for the number of seconds to wait before stopping the topology. After the timeout period, the topology no longer appears when you visit the **Storm UI** section of the dashboard.
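
The same action is also available from an SSH session through the Storm command-line client, whose `kill` command accepts a wait time with `-w`. The following sketch only builds and prints the command so that it can be reviewed first; `storm_kill_cmd` is a hypothetical helper name, and the default of 10 seconds simply mirrors the value used in the step above.

```shell
#!/bin/sh
# Hypothetical helper: validate the wait time, then print the storm CLI
# command that stops TOPOLOGY after WAIT_SECS seconds (default: 10).
storm_kill_cmd() {
    topology=$1
    wait_secs=${2:-10}
    case "$wait_secs" in
        *[!0-9]*)
            echo "wait time must be a non-negative integer: $wait_secs" >&2
            return 1
            ;;
    esac
    echo "storm kill $topology -w $wait_secs"
}

# Prints: storm kill wordcount -w 10
storm_kill_cmd wordcount 10
```

Run the printed command on the cluster to stop the topology.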

## Clean up resources

After you complete the quickstart, you may want to delete the cluster. With HDInsight, your data is stored in Azure Storage, so you can safely delete a cluster when it is not in use. You are also charged for an HDInsight cluster, even when it is not in use. Since the charges for the cluster are many times more than the charges for storage, it makes economic sense to delete clusters when they are not in use.

To delete a cluster, see [Delete an HDInsight cluster using your browser, PowerShell, or the Azure CLI](../hdinsight-delete-cluster.md).

## Next steps

In this quickstart, you used an example from the Apache [storm-starter](https://github.com/apache/storm/tree/v2.0.0/examples/storm-starter) project to create and monitor an Apache Storm topology on an existing Apache Storm cluster. Advance to the next article to learn the basics of managing and monitoring Apache Storm topologies.

> [!div class="nextstepaction"]
> [Deploy and manage Apache Storm topologies on Azure HDInsight](./apache-storm-deploy-monitor-topology-linux.md)