| 1 | +--- |
| 2 | +title: 'Create and monitor Apache Storm topology in Azure HDInsight' |
| 3 | +description: In this quickstart, learn how to create and monitor an Apache Storm topology in Azure HDInsight. |
| 4 | +author: hrasheed-msft |
| 5 | +ms.reviewer: jasonh |
| 6 | + |
| 7 | +ms.service: hdinsight |
| 8 | +ms.topic: quickstart |
| 9 | +ms.date: 06/14/2019 |
| 10 | +ms.author: hrasheed |
| 11 | +ms.custom: mvc |
| 12 | + |
| 13 | +#Customer intent: I want to learn how to create Apache Storm topologies and deploy them to a Storm cluster in Azure HDInsight. |
| 14 | +--- |
| 15 | + |
| 16 | +# Quickstart: Create and monitor an Apache Storm topology in Azure HDInsight |
| 17 | + |
| 18 | +Apache Storm is a scalable, fault-tolerant, distributed, real-time computation system for processing streams of data. With Storm on Azure HDInsight, you can create a cloud-based Storm cluster that performs big data analytics in real time. |
| 19 | + |
| 20 | +In this quickstart, you use an example from the Apache [storm-starter](https://github.com/apache/storm/tree/v2.0.0/examples/storm-starter) project to create and monitor an Apache Storm topology on an existing Apache Storm cluster. |
| 21 | + |
| 22 | +## Prerequisites |
| 23 | + |
| 24 | +* An Apache Storm cluster on HDInsight. See [Create Apache Hadoop clusters using the Azure portal](../hdinsight-hadoop-create-linux-clusters-portal.md) and select **Storm** for **Cluster type**. |
| 25 | + |
| 26 | +* An SSH client. For more information, see [Connect to HDInsight (Apache Hadoop) using SSH](../hdinsight-hadoop-linux-use-ssh-unix.md). |
| 27 | + |
| 28 | +## Create the topology |
| 29 | + |
| 30 | +1. Connect to your Storm cluster. Edit the command below by replacing `CLUSTERNAME` with the name of your Storm cluster, and then enter the command: |
| 31 | + |
| 32 | + ```cmd |
| 33 | + |
| 34 | + ``` |
| 35 | +
|
| 36 | +2. The **WordCount** example is included on your HDInsight cluster at `/usr/hdp/current/storm-client/contrib/storm-starter/`. The topology generates random sentences and counts how many times each word occurs. Use the following command to start the **wordcount** topology on the cluster: |
| 37 | +
|
| 38 | + ```bash |
| 39 | + storm jar /usr/hdp/current/storm-client/contrib/storm-starter/storm-starter-topologies-*.jar org.apache.storm.starter.WordCountTopology wordcount |
| 40 | + ``` |
| 41 | +
|
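To confirm that the submission worked before you open the Storm UI, you can list the running topologies from the same SSH session. The following is a minimal sketch, assuming the `storm` CLI is on the PATH of the cluster node you are connected to; the `topology` variable is only an illustrative name for the last argument passed to `storm jar`.

```shell
# Assumption: this runs in the SSH session on the cluster, where `storm` is on the PATH.
topology="wordcount"   # the name given as the last argument to `storm jar`

if command -v storm >/dev/null 2>&1; then
    # `storm list` prints the running topologies and their status.
    storm list | grep -w "$topology" || echo "$topology is not listed yet"
else
    echo "storm CLI not found; run this from an SSH session on the cluster"
fi
```

If the topology does not appear right away, wait a few seconds and run the check again.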
| 42 | +## Monitor the topology |
| 43 | +
|
| 44 | +Storm provides a web interface, the Storm UI, for working with running topologies. The Storm UI is included on your HDInsight cluster. |
| 45 | +
|
| 46 | +Use the following steps to monitor the topology using the Storm UI: |
| 47 | +
|
| 48 | +1. To display the Storm UI, open a web browser to `https://CLUSTERNAME.azurehdinsight.net/stormui`. Replace `CLUSTERNAME` with the name of your cluster. |
| 49 | +
|
| 50 | +2. Under **Topology Summary**, select the **wordcount** entry in the **Name** column. Information about the topology is displayed. |
| 51 | +
|
| 52 | +  |
| 53 | +
|
| 54 | + The new page provides the following information: |
| 55 | +
|
| 56 | + |Property | Description | |
| 57 | + |---|---| |
| 58 | + |Topology stats|Basic information on the topology performance, organized into time windows. Selecting a specific time window changes the time window for information displayed in other sections of the page.| |
| 59 | + |Spouts|Basic information about spouts, including the last error returned by each spout.| |
| 60 | + |Bolts|Basic information about bolts.| |
| 61 | + |Topology configuration|Detailed information about the topology configuration.| |
| 62 | + |Activate|Resumes processing of a deactivated topology.| |
| 63 | + |Deactivate|Pauses a running topology.| |
| 64 | + |Rebalance|Adjusts the parallelism of the topology. Rebalance running topologies after you change the number of nodes in the cluster, so that parallelism compensates for the increased or decreased number of nodes. For more information, see [Understanding the parallelism of an Apache Storm topology](https://storm.apache.org/documentation/Understanding-the-parallelism-of-a-Storm-topology.html).| |
| 65 | + |Kill|Terminates a Storm topology after the specified timeout.| |
| 66 | +
|
| 67 | +3. From this page, select an entry from the **Spouts** or **Bolts** section. Information about the selected component is displayed. |
| 68 | +
|
| 69 | +  |
| 70 | +
|
| 71 | + The new page displays the following information: |
| 72 | +
|
| 73 | + |Property | Description | |
| 74 | + |---|---| |
| 75 | + |Spout/Bolt stats|Basic information on the component performance, organized into time windows. Selecting a specific time window changes the time window for information displayed in other sections of the page.| |
| 76 | + |Input stats (bolt only)|Information on components that produce data consumed by the bolt.| |
| 77 | + |Output stats|Information on data emitted by this component.| |
| 78 | + |Executors|Information on instances of this component.| |
| 79 | + |Errors|Errors produced by this component.| |
| 80 | +
|
| 81 | +4. When viewing the details of a spout or bolt, select an entry from the **Port** column in the **Executors** section to view details for a specific instance of the component. |
| 82 | +
|
| 83 | + 2015-01-27 14:18:02 b.s.d.task [INFO] Emitting: split default ["with"] |
| 84 | + 2015-01-27 14:18:02 b.s.d.task [INFO] Emitting: split default ["nature"] |
| 85 | + 2015-01-27 14:18:02 b.s.d.executor [INFO] Processing received message source: split:21, stream: default, id: {}, [snow] |
| 86 | + 2015-01-27 14:18:02 b.s.d.task [INFO] Emitting: count default [snow, 747293] |
| 87 | + 2015-01-27 14:18:02 b.s.d.executor [INFO] Processing received message source: split:21, stream: default, id: {}, [white] |
| 88 | + 2015-01-27 14:18:02 b.s.d.task [INFO] Emitting: count default [white, 747293] |
| 89 | + 2015-01-27 14:18:02 b.s.d.executor [INFO] Processing received message source: split:21, stream: default, id: {}, [seven] |
| 90 | + 2015-01-27 14:18:02 b.s.d.task [INFO] Emitting: count default [seven, 1493957] |
| 91 | +
|
| 92 | + In this example, the word **seven** has occurred 1493957 times. This count is how many times the word has been encountered since this topology was started. |
| 93 | +
|
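If you save log lines like the ones shown in step 4 to a file, you can pull the most recent count for a word out of them with standard tools. The following is a rough sketch, not part of the Storm tooling: `latest_count` is a hypothetical helper, and the sample lines are copied from the excerpt in step 4. It assumes each count tuple is logged in the `Emitting: count default [word, N]` form shown there.

```shell
# Sample lines copied from the log excerpt in step 4; on a cluster you would
# read saved worker log output instead.
log_sample='2015-01-27 14:18:02 b.s.d.task [INFO] Emitting: count default [snow, 747293]
2015-01-27 14:18:02 b.s.d.task [INFO] Emitting: count default [white, 747293]
2015-01-27 14:18:02 b.s.d.task [INFO] Emitting: count default [seven, 1493957]'

# latest_count WORD: read log lines on stdin and print the last count emitted for WORD.
latest_count() {
    awk -v word="$1" '
        /Emitting: count default/ {
            # The tuple is the last two fields: "[word," and "count]".
            w = $(NF - 1); c = $NF
            gsub(/^\[|,$/, "", w)   # strip the leading "[" and trailing ","
            gsub(/\]$/, "", c)      # strip the trailing "]"
            if (w == word) last = c
        }
        END { if (last != "") print last }
    '
}

printf '%s\n' "$log_sample" | latest_count seven   # prints 1493957
```

Because the count bolt emits a running total, the last matching line carries the current count for that word.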
| 94 | +## Stop the topology |
| 95 | +
|
| 96 | +Return to the **Topology summary** page for the **wordcount** topology, and then select the **Kill** button from the **Topology actions** section. When prompted, enter 10 for the number of seconds to wait before stopping the topology. After the timeout period, the topology no longer appears when you visit the **Storm UI** section of the dashboard. |
| 97 | +
|
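You can also stop the topology from the SSH session instead of the Storm UI. The following is a minimal sketch, assuming the `storm` CLI is on the PATH of the cluster node; the variable names are illustrative. The `-w` option of `storm kill` sets the number of seconds to wait before the topology is stopped.

```shell
# Assumption: this runs in the SSH session on the cluster, where `storm` is on the PATH.
topology="wordcount"
wait_secs=10   # same wait time as entered in the Storm UI prompt

if command -v storm >/dev/null 2>&1; then
    # Deactivates the spouts, waits $wait_secs seconds, then shuts down the workers.
    storm kill "$topology" -w "$wait_secs"
else
    echo "storm CLI not found; run this from an SSH session on the cluster"
fi
```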
| 98 | +## Clean up resources |
| 99 | +
|
| 100 | +After you complete the quickstart, you may want to delete the cluster. With HDInsight, your data is stored in Azure Storage, so you can safely delete a cluster when it is not in use. You are charged for an HDInsight cluster even when it is not in use. Because the charges for the cluster are many times more than the charges for storage, it makes economic sense to delete clusters when they are not in use. |
| 101 | +
|
| 102 | +To delete a cluster, see [Delete an HDInsight cluster using your browser, PowerShell, or the Azure CLI](../hdinsight-delete-cluster.md). |
| 103 | +
|
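As one illustration of the Azure CLI route, the following sketch deletes a cluster with `az hdinsight delete`. The values `CLUSTERNAME` and `RESOURCEGROUP` are placeholders that you must replace, and the guard around the command is an assumption added here so that the snippet does nothing until you set them. The linked article remains the reference for the full procedure.

```shell
# Placeholders: replace both values before running.
cluster_name="CLUSTERNAME"
resource_group="RESOURCEGROUP"

if command -v az >/dev/null 2>&1 && [ "$cluster_name" != "CLUSTERNAME" ]; then
    # Without --yes, the Azure CLI asks for confirmation before deleting.
    az hdinsight delete --name "$cluster_name" --resource-group "$resource_group"
else
    echo "Set cluster_name and resource_group, and install the Azure CLI, before running."
fi
```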
| 104 | +## Next steps |
| 105 | +
|
| 106 | +In this quickstart, you used an example from the Apache [storm-starter](https://github.com/apache/storm/tree/v2.0.0/examples/storm-starter) project to create and monitor an Apache Storm topology on an existing Apache Storm cluster. Advance to the next article to learn the basics of managing and monitoring Apache Storm topologies. |
| 107 | +
|
| 108 | +> [!div class="nextstepaction"] |
| 109 | +> [Deploy and manage Apache Storm topologies on Azure HDInsight](./apache-storm-deploy-monitor-topology-linux.md) |