Commit a49da56

Merge pull request #112950 from dagiro/freshness_c52
freshness_c52
2 parents fac6bf5 + cba8447 commit a49da56

1 file changed

+12
-12
lines changed

articles/hdinsight/storm/apache-storm-develop-java-topology.md

Lines changed: 12 additions & 12 deletions
@@ -6,13 +6,13 @@ ms.author: hrasheed
 ms.reviewer: jasonh
 ms.service: hdinsight
 ms.topic: conceptual
-ms.date: 03/14/2019
-ms.custom: H1Hack27Feb2017,hdinsightactive,hdiseo17may2017
+ms.custom: H1Hack27Feb2017,hdinsightactive,hdiseo17may2017,seoapr2020
+ms.date: 04/27/2020
 ---

 # Create an Apache Storm topology in Java

-Learn how to create a Java-based topology for [Apache Storm](https://storm.apache.org/). Here, you create a Storm topology that implements a word-count application. You use [Apache Maven](https://maven.apache.org/) to build and package the project. Then, you learn how to define the topology using the [Apache Storm Flux](https://storm.apache.org/releases/2.0.0/flux.html) framework.
+Learn how to create a Java-based topology for Apache Storm. You create a Storm topology that implements a word-count application. You use Apache Maven to build and package the project. Then, you learn how to define the topology using the Apache Storm Flux framework.

 After completing the steps in this document, you can deploy the topology to Apache Storm on HDInsight.

@@ -192,7 +192,7 @@ This section is used to add plug-ins, resources, and other build configuration o

 * **Apache Maven Compiler Plugin**

-Another useful plug-in is the [Apache Maven Compiler Plugin](https://maven.apache.org/plugins/maven-compiler-plugin/), which is used to change compilation options. Change the Java version that Maven uses for the source and target for your application.
+Another useful plug-in is the [`Apache Maven Compiler Plugin`](https://maven.apache.org/plugins/maven-compiler-plugin/), which is used to change compilation options. Change the Java version that Maven uses for the source and target for your application.

 * For HDInsight __3.4 or earlier__, set the source and target Java version to __1.7__.

@@ -234,13 +234,13 @@ A Java-based Apache Storm topology consists of three components that you must au

 * **Spouts**: Reads data from external sources and emits streams of data into the topology.

-* **Bolts**: Performs processing on streams emitted by spouts or other bolts, and emits one or more streams.
+* **Bolts**: Does processing on streams emitted by spouts or other bolts, and emits one or more streams.

 * **Topology**: Defines how the spouts and bolts are arranged, and provides the entry point for the topology.
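
The three roles in the list above can be pictured with a minimal plain-Java sketch. It deliberately does not use the Storm API: the class name `WordCountPipelineSketch`, its method names, and the two sample sentences are all assumptions made for illustration, and everything runs in a single thread rather than on a cluster.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Plain-Java illustration only; none of these names come from the Storm API.
class WordCountPipelineSketch {

    // "Spout" role: the source of data. Here it is a fixed list of sample sentences.
    static List<String> emitSentences() {
        List<String> sentences = new ArrayList<>();
        sentences.add("the cow jumped over the moon");
        sentences.add("snow white and the seven dwarfs");
        return sentences;
    }

    // First "bolt" role: split one sentence into words, skipping empty tokens.
    static List<String> splitSentence(String sentence) {
        List<String> words = new ArrayList<>();
        for (String word : sentence.trim().split("\\s+")) {
            if (!word.isEmpty()) {
                words.add(word);
            }
        }
        return words;
    }

    // Second "bolt" role: keep a running count for each word.
    static void countWord(Map<String, Integer> counts, String word) {
        counts.merge(word, 1, Integer::sum);
    }

    // "Topology" role: wire the components together and act as the entry point.
    static Map<String, Integer> runOnce() {
        Map<String, Integer> counts = new HashMap<>();
        for (String sentence : emitSentences()) {
            for (String word : splitSentence(sentence)) {
                countWord(counts, word);
            }
        }
        return counts;
    }

    public static void main(String[] args) {
        System.out.println(runOnce());
    }
}
```

In a real topology each role would be a separate class that Storm instantiates and connects through streams; the sketch only shows which responsibility belongs to which component.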

 ### Create the spout

-To reduce requirements for setting up external data sources, the following spout simply emits random sentences. It's a modified version of a spout that is provided with the [Storm-Starter examples](https://github.com/apache/storm/blob/0.10.x-branch/examples/storm-starter/src/jvm/storm/starter). Although this topology uses only one spout, others may have several that feed data from different sources into the topology.
+To reduce requirements for setting up external data sources, the following spout simply emits random sentences. It's a modified version of a spout that is provided with the [Storm-Starter examples](https://github.com/apache/storm/blob/0.10.x-branch/examples/storm-starter/src/jvm/storm/starter). Although this topology uses one spout, others may have several that feed data from different sources into the topology`.`

 Enter the command below to create and open a new file `RandomSentenceSpout.java`:

@@ -476,7 +476,7 @@ public class WordCount extends BaseBasicBolt {

 ### Define the topology

-The topology ties the spouts and bolts together into a graph, which defines how data flows between the components. It also provides parallelism hints that Storm uses when creating instances of the components within the cluster.
+The topology ties the spouts and bolts together into a graph. The graph defines how data flows between the components. It also provides parallelism hints that Storm uses when creating instances of the components within the cluster.

 The following image is a basic diagram of the graph of components for this topology.

@@ -608,15 +608,15 @@ As it runs, the topology displays startup information. The following text is an
 17:33:27 [Thread-30-count] INFO com.microsoft.example.WordCount - Emitting a count of 57 for word dwarfs
 17:33:27 [Thread-12-count] INFO com.microsoft.example.WordCount - Emitting a count of 57 for word snow

-This example log indicates that the word 'and' has been emitted 113 times. The count continues to go up as long as the topology runs because the spout continuously emits the same sentences.
+This example log indicates that the word 'and' has been emitted 113 times. The count continues to increase as long as the topology runs. This increase is because the spout continuously emits the same sentences.

 There's a 5-second interval between emission of words and counts. The **WordCount** component is configured to only emit information when a tick tuple arrives. It requests that tick tuples are only delivered every five seconds.
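
The tick-driven behaviour described above (count on every word, report only when a tick arrives) can be sketched in plain Java. This is an illustration under stated assumptions, not the article's `WordCount` bolt: the class `TickDrivenCounterSketch` and its methods `onWord` and `onTick` are hypothetical, and the Storm wiring that delivers tick tuples is left out entirely. Only the message text mirrors the log lines shown above.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical stand-in for a counting component; not part of the Storm API.
class TickDrivenCounterSketch {
    private final Map<String, Integer> counts = new HashMap<>();

    // Ordinary input: update the running count and emit nothing.
    void onWord(String word) {
        counts.merge(word, 1, Integer::sum);
    }

    // Tick input: report every count gathered so far, sorted by word so the
    // output order is stable. Counts are NOT reset, so they keep growing.
    List<String> onTick() {
        List<String> lines = new ArrayList<>();
        for (Map.Entry<String, Integer> entry : new TreeMap<>(counts).entrySet()) {
            lines.add("Emitting a count of " + entry.getValue()
                    + " for word " + entry.getKey());
        }
        return lines;
    }

    public static void main(String[] args) {
        TickDrivenCounterSketch counter = new TickDrivenCounterSketch();
        for (String word : "snow white and the seven dwarfs".split(" ")) {
            counter.onWord(word);
        }
        // In a running topology this call would be triggered by a tick tuple
        // every five seconds; here it is invoked once by hand.
        counter.onTick().forEach(System.out::println);
    }
}
```

Because the counts are never cleared on a tick, each report shows the running total, which matches the steadily increasing numbers in the example log.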

 ## Convert the topology to Flux

-[Flux](https://storm.apache.org/releases/2.0.0/flux.html) is a new framework available with Storm 0.10.0 and higher, which allows you to separate configuration from implementation. Your components are still defined in Java, but the topology is defined using a YAML file. You can package a default topology definition with your project, or use a standalone file when submitting the topology. When submitting the topology to Storm, you can use environment variables or configuration files to populate values in the YAML topology definition.
+[Flux](https://storm.apache.org/releases/2.0.0/flux.html) is a new framework available with Storm 0.10.0 and higher. Flux allows you to separate configuration from implementation. Your components are still defined in Java, but the topology is defined using a YAML file. You can package a default topology definition with your project, or use a standalone file when submitting the topology. When submitting the topology to Storm, use environment variables or configuration files to populate YAML topology definition values.

-The YAML file defines the components to use for the topology and the data flow between them. You can include a YAML file as part of the jar file or you can use an external YAML file.
+The YAML file defines the components to use for the topology and the data flow between them. You can include a YAML file as part of the jar file. Or you can use an external YAML file.

 For more information on Flux, see [Flux framework (https://storm.apache.org/releases/current/flux.html)](https://storm.apache.org/releases/current/flux.html).

@@ -813,7 +813,7 @@ For more information on these and other features of the Flux framework, see [Flu

 ## Trident

-[Trident](https://storm.apache.org/releases/current/Trident-API-Overview.html) is a high-level abstraction that is provided by Storm. It supports stateful processing. The primary advantage of Trident is that it can guarantee that every message that enters the topology is processed only once. Without using Trident, your topology can only guarantee that messages are processed at least once. There are also other differences, such as built-in components that can be used instead of creating bolts. In fact, bolts are replaced by less-generic components, such as filters, projections, and functions.
+[Trident](https://storm.apache.org/releases/current/Trident-API-Overview.html) is a high-level abstraction that is provided by Storm. It supports stateful processing. The primary advantage of Trident is that it guarantees that every message that enters the topology is processed only once. Without using Trident, your topology can only guarantee that messages are processed at least once. There are also other differences, such as built-in components that can be used instead of creating bolts. Bolts are replaced by less-generic components, such as filters, projections, and functions.

 Trident applications can be created by using Maven projects. You use the same basic steps as presented earlier in this article—only the code is different. Trident also can't (currently) be used with the Flux framework.

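
One way to picture the difference between "at least once" and "only once" in the Trident paragraph above is an update that remembers which batch it last applied, so a replayed batch changes nothing. The following plain-Java sketch illustrates that idea only. It is not the Trident API: the class `IdempotentCountSketch`, the method `applyBatch`, and the `txId` parameter are hypothetical, and the sketch assumes that batches arrive in order and that a replayed batch carries the same contents as the original.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical illustration of replay-safe counting; not part of Storm or Trident.
class IdempotentCountSketch {

    // Stored value for one word: the count plus the id of the last batch applied.
    private static final class Entry {
        long count;
        long lastTxId = -1; // assumption: real batch ids are never negative
    }

    private final Map<String, Entry> state = new HashMap<>();

    // Adds delta to the word's count unless this batch id was already applied.
    // Returns true when the state changed, false when the batch was a replay.
    boolean applyBatch(long txId, String word, long delta) {
        Entry entry = state.computeIfAbsent(word, key -> new Entry());
        if (entry.lastTxId == txId) {
            return false; // replayed batch: skip so the word is not double counted
        }
        entry.count += delta;
        entry.lastTxId = txId;
        return true;
    }

    long countOf(String word) {
        Entry entry = state.get(word);
        return entry == null ? 0 : entry.count;
    }

    public static void main(String[] args) {
        IdempotentCountSketch counts = new IdempotentCountSketch();
        counts.applyBatch(1, "and", 2);
        counts.applyBatch(1, "and", 2); // simulated replay after a failure
        counts.applyBatch(2, "and", 3);
        System.out.println(counts.countOf("and"));
    }
}
```

With at-least-once delivery alone, the replayed batch would be added a second time; recording the batch id next to the value is what lets the update be applied safely more than once.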
@@ -825,6 +825,6 @@ You've learned how to create an Apache Storm topology by using Java. Now learn h

 * [Deploy and manage Apache Storm topologies on HDInsight](apache-storm-deploy-monitor-topology-linux.md)

-* [Develop C# topologies for Apache Storm on HDInsight using Visual Studio](apache-storm-develop-csharp-visual-studio-topology.md)
+* [Develop topologies using Python](apache-storm-develop-python-topology.md)

 You can find more example Apache Storm topologies by visiting [Example topologies for Apache Storm on HDInsight](apache-storm-example-topology.md).