-Learn how to create a Java-based topology for [Apache Storm](https://storm.apache.org/). Here, you create a Storm topology that implements a word-count application. You use [Apache Maven](https://maven.apache.org/) to build and package the project. Then, you learn how to define the topology using the [Apache Storm Flux](https://storm.apache.org/releases/2.0.0/flux.html) framework.
+Learn how to create a Java-based topology for Apache Storm. You create a Storm topology that implements a word-count application. You use Apache Maven to build and package the project. Then, you learn how to define the topology using the Apache Storm Flux framework.

 After completing the steps in this document, you can deploy the topology to Apache Storm on HDInsight.

@@ -192,7 +192,7 @@ This section is used to add plug-ins, resources, and other build configuration o

 * **Apache Maven Compiler Plugin**

-Another useful plug-in is the [Apache Maven Compiler Plugin](https://maven.apache.org/plugins/maven-compiler-plugin/), which is used to change compilation options. Change the Java version that Maven uses for the source and target for your application.
+Another useful plug-in is the [`Apache Maven Compiler Plugin`](https://maven.apache.org/plugins/maven-compiler-plugin/), which is used to change compilation options. Change the Java version that Maven uses for the source and target for your application.

 * For HDInsight __3.4 or earlier__, set the source and target Java version to __1.7__.

@@ -234,13 +234,13 @@ A Java-based Apache Storm topology consists of three components that you must au

 * **Spouts**: Reads data from external sources and emits streams of data into the topology.

-* **Bolts**: Performs processing on streams emitted by spouts or other bolts, and emits one or more streams.
+* **Bolts**: Does processing on streams emitted by spouts or other bolts, and emits one or more streams.

 * **Topology**: Defines how the spouts and bolts are arranged, and provides the entry point for the topology.

 ### Create the spout

-To reduce requirements for setting up external data sources, the following spout simply emits random sentences. It's a modified version of a spout that is provided with the [Storm-Starter examples](https://github.com/apache/storm/blob/0.10.x-branch/examples/storm-starter/src/jvm/storm/starter). Although this topology uses only one spout, others may have several that feed data from different sources into the topology.
+To reduce requirements for setting up external data sources, the following spout simply emits random sentences. It's a modified version of a spout that is provided with the [Storm-Starter examples](https://github.com/apache/storm/blob/0.10.x-branch/examples/storm-starter/src/jvm/storm/starter). Although this topology uses one spout, others may have several that feed data from different sources into the topology`.`

 Enter the command below to create and open a new file `RandomSentenceSpout.java`:

@@ -476,7 +476,7 @@ public class WordCount extends BaseBasicBolt {

 ### Define the topology

-The topology ties the spouts and bolts together into a graph, which defines how data flows between the components. It also provides parallelism hints that Storm uses when creating instances of the components within the cluster.
+The topology ties the spouts and bolts together into a graph. The graph defines how data flows between the components. It also provides parallelism hints that Storm uses when creating instances of the components within the cluster.

 The following image is a basic diagram of the graph of components for this topology.

@@ -608,15 +608,15 @@ As it runs, the topology displays startup information. The following text is an
 17:33:27 [Thread-30-count] INFO com.microsoft.example.WordCount - Emitting a count of 57 for word dwarfs
 17:33:27 [Thread-12-count] INFO com.microsoft.example.WordCount - Emitting a count of 57 for word snow

-This example log indicates that the word 'and' has been emitted 113 times. The count continues to go up as long as the topology runs because the spout continuously emits the same sentences.
+This example log indicates that the word 'and' has been emitted 113 times. The count continues to increase as long as the topology runs. This increase is because the spout continuously emits the same sentences.

 There's a 5-second interval between emission of words and counts. The **WordCount** component is configured to only emit information when a tick tuple arrives. It requests that tick tuples are only delivered every five seconds.
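
The tick-driven behavior described in this hunk can be modeled without a Storm dependency. The following is a minimal plain-Java sketch, not the `WordCount` bolt from the article: the class `TickCountSketch` and its methods `onWord` and `onTick` are hypothetical names, and the timer that would deliver a tick every five seconds is left to the caller. It only illustrates why counts are reported in bursts and why they keep growing while the same sentences repeat.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Plain-Java model of a tick-driven counter. It does not use the Storm API;
// the class and method names are hypothetical.
final class TickCountSketch {
    // Running totals per word, kept sorted so the output order is deterministic.
    private final Map<String, Integer> counts = new TreeMap<>();

    // Called for every ordinary tuple: update the total but emit nothing.
    void onWord(String word) {
        counts.merge(word, 1, Integer::sum);
    }

    // Called when a tick arrives: report every running total seen so far.
    // Totals are not reset, so they only grow between ticks.
    List<String> onTick() {
        List<String> emitted = new ArrayList<>();
        for (Map.Entry<String, Integer> e : counts.entrySet()) {
            emitted.add("Emitting a count of " + e.getValue() + " for word " + e.getKey());
        }
        return emitted;
    }

    public static void main(String[] args) {
        TickCountSketch bolt = new TickCountSketch();
        String sentence = "snow white and the seven dwarfs";
        // Feed the same sentence twice, with a simulated tick after each pass.
        for (int round = 0; round < 2; round++) {
            for (String word : sentence.split(" ")) {
                bolt.onWord(word);
            }
            bolt.onTick().forEach(System.out::println);
        }
    }
}
```

In this model, nothing is reported between ticks, which mirrors the interval between log bursts noted above.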

 ## Convert the topology to Flux

-[Flux](https://storm.apache.org/releases/2.0.0/flux.html) is a new framework available with Storm 0.10.0 and higher, which allows you to separate configuration from implementation. Your components are still defined in Java, but the topology is defined using a YAML file. You can package a default topology definition with your project, or use a standalone file when submitting the topology. When submitting the topology to Storm, you can use environment variables or configuration files to populate values in the YAML topology definition.
+[Flux](https://storm.apache.org/releases/2.0.0/flux.html) is a new framework available with Storm 0.10.0 and higher. Flux allows you to separate configuration from implementation. Your components are still defined in Java, but the topology is defined using a YAML file. You can package a default topology definition with your project, or use a standalone file when submitting the topology. When submitting the topology to Storm, use environment variables or configuration files to populate YAML topology definition values.

-The YAML file defines the components to use for the topology and the data flow between them. You can include a YAML file as part of the jar file or you can use an external YAML file.
+The YAML file defines the components to use for the topology and the data flow between them. You can include a YAML file as part of the jar file. Or you can use an external YAML file.

 For more information on Flux, see [Flux framework (https://storm.apache.org/releases/current/flux.html)](https://storm.apache.org/releases/current/flux.html).

@@ -813,7 +813,7 @@ For more information on these and other features of the Flux framework, see [Flu

 ## Trident

-[Trident](https://storm.apache.org/releases/current/Trident-API-Overview.html) is a high-level abstraction that is provided by Storm. It supports stateful processing. The primary advantage of Trident is that it can guarantee that every message that enters the topology is processed only once. Without using Trident, your topology can only guarantee that messages are processed at least once. There are also other differences, such as built-in components that can be used instead of creating bolts. In fact, bolts are replaced by less-generic components, such as filters, projections, and functions.
+[Trident](https://storm.apache.org/releases/current/Trident-API-Overview.html) is a high-level abstraction that is provided by Storm. It supports stateful processing. The primary advantage of Trident is that it guarantees that every message that enters the topology is processed only once. Without using Trident, your topology can only guarantee that messages are processed at least once. There are also other differences, such as built-in components that can be used instead of creating bolts. Bolts are replaced by less-generic components, such as filters, projections, and functions.

 Trident applications can be created by using Maven projects. You use the same basic steps as presented earlier in this article—only the code is different. Trident also can't (currently) be used with the Flux framework.

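
The difference between "at least once" and "only once" in the Trident paragraph above can be illustrated with a small model. The following plain-Java sketch does not use the Trident API, and every name in it (`ExactlyOnceSketch`, `applyAtLeastOnce`, `applyExactlyOnce`) is hypothetical. It assumes a simplified scheme in which each batch carries a transaction id and the state remembers the last id applied per key, so a replayed batch can be detected and skipped. Real Trident state handling has more cases than this.

```java
import java.util.HashMap;
import java.util.Map;

// Plain-Java model of a transaction-id check. Not the Trident API;
// all names are hypothetical.
final class ExactlyOnceSketch {
    private final Map<String, Long> counts = new HashMap<>();
    private final Map<String, Long> lastTxId = new HashMap<>();

    // At-least-once style: a replayed batch is simply counted again.
    void applyAtLeastOnce(String word, long delta) {
        counts.merge(word, delta, Long::sum);
    }

    // Exactly-once style: skip the update when this batch id was already applied.
    // Assumes batches for a key arrive in order, so only the last id is stored.
    void applyExactlyOnce(long txId, String word, long delta) {
        Long seen = lastTxId.get(word);
        if (seen != null && seen == txId) {
            return; // replay of a batch that is already reflected in the count
        }
        counts.merge(word, delta, Long::sum);
        lastTxId.put(word, txId);
    }

    long count(String word) {
        return counts.getOrDefault(word, 0L);
    }

    public static void main(String[] args) {
        ExactlyOnceSketch naive = new ExactlyOnceSketch();
        naive.applyAtLeastOnce("snow", 3);
        naive.applyAtLeastOnce("snow", 3); // same batch replayed after a failure
        System.out.println("at-least-once count: " + naive.count("snow"));

        ExactlyOnceSketch tx = new ExactlyOnceSketch();
        tx.applyExactlyOnce(1, "snow", 3);
        tx.applyExactlyOnce(1, "snow", 3); // replay of batch 1 is ignored
        tx.applyExactlyOnce(2, "snow", 2); // new batch is applied
        System.out.println("exactly-once count: " + tx.count("snow"));
    }
}
```

In this model the replayed batch inflates the first count, while the transaction-id check keeps the second count equal to the sum of distinct batches.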
@@ -825,6 +825,6 @@ You've learned how to create an Apache Storm topology by using Java. Now learn h

 * [Deploy and manage Apache Storm topologies on HDInsight](apache-storm-deploy-monitor-topology-linux.md)

-* [Develop C# topologies for Apache Storm on HDInsight using Visual Studio](apache-storm-develop-csharp-visual-studio-topology.md)
+* [Develop topologies using Python](apache-storm-develop-python-topology.md)

 You can find more example Apache Storm topologies by visiting [Example topologies for Apache Storm on HDInsight](apache-storm-example-topology.md).