Commit 41404a4

Merge pull request #112324 from dagiro/freshness_c26
freshness_c26
2 parents 1030607 + 3b98e24 commit 41404a4

File tree

1 file changed

+7
-7
lines changed

articles/hdinsight/hdinsight-apache-kafka-spark-structured-streaming.md

Lines changed: 7 additions & 7 deletions
@@ -7,7 +7,7 @@ ms.reviewer: jasonh
 ms.service: hdinsight
 ms.topic: tutorial
 ms.custom: hdinsightactive,seodec18
-ms.date: 03/11/2020
+ms.date: 04/22/2020
 
 #Customer intent: As a developer, I want to learn how to use Spark Structured Streaming with Kafka on HDInsight.
 ---
@@ -32,7 +32,7 @@ When you're done with the steps in this document, remember to delete the cluster
 
 * Familiarity with using [Jupyter Notebooks](https://jupyter.org/) with Spark on HDInsight. For more information, see the [Load data and run queries with Apache Spark on HDInsight](spark/apache-spark-load-data-run-query.md) document.
 
-* Familiarity with the [Scala](https://www.scala-lang.org/) programming language. The code used in this tutorial is written in Scala.
+* Familiarity with the Scala programming language. The code used in this tutorial is written in Scala.
 
 * Familiarity with creating Kafka topics. For more information, see the [Apache Kafka on HDInsight quickstart](kafka/apache-kafka-get-started.md) document.
 
@@ -45,7 +45,7 @@ When you're done with the steps in this document, remember to delete the cluster
 
 ## Structured Streaming with Apache Kafka
 
-Spark Structured Streaming is a stream processing engine built on the Spark SQL engine. When using Structured Streaming, you can write streaming queries the same way that you write batch queries.
+Spark Structured Streaming is a stream processing engine built on the Spark SQL engine. When using Structured Streaming, you can write streaming queries the same way you write batch queries.
 
 The following code snippets demonstrate reading from Kafka and storing to file. The first one is a batch operation, while the second one is a streaming operation:

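The snippets the paragraph above refers to fall outside this hunk. As a hedged sketch of the batch-versus-streaming contrast (the broker address, topic name, and output paths below are hypothetical, and the `spark-sql-kafka-0-10` package must be on the classpath), the two operations might look like:

```scala
import org.apache.spark.sql.SparkSession

val spark = SparkSession.builder.appName("kafka-sketch").getOrCreate()

val kafkaBrokers = "broker1:9092" // hypothetical broker address
val kafkaTopic = "tripdata"       // hypothetical topic name

// Batch: read whatever is in the topic right now, write once, and finish.
spark.read
  .format("kafka")
  .option("kafka.bootstrap.servers", kafkaBrokers)
  .option("subscribe", kafkaTopic)
  .load()
  .write
  .format("parquet")
  .save("/example/batchtripdata")

// Streaming: the same query shape, but readStream/writeStream keep running
// and continuously append new Kafka records to the output directory.
spark.readStream
  .format("kafka")
  .option("kafka.bootstrap.servers", kafkaBrokers)
  .option("subscribe", kafkaTopic)
  .load()
  .writeStream
  .format("parquet")
  .option("checkpointLocation", "/example/checkpoint") // required by streaming sinks
  .start("/example/streamtripdata")
```

Note how the only structural difference is `read`/`write` versus `readStream`/`writeStream` (plus the checkpoint location a streaming sink requires), which is the point the tutorial is making.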
@@ -179,7 +179,7 @@ This example demonstrates how to use Spark Structured Streaming with Kafka on HD
 
 1. Select **New > Spark** to create a notebook.
 
-1. Spark streaming has microbatching, which means data comes as batches and executers run on the batches of data. If the executor has idle timeout less than the time it takes to process the batch, then the executors would be constantly added and removed. If the executors idle timeout is greater than the batch duration, the executor never gets removed. Hence **we recommend that you disable dynamic allocation by setting spark.dynamicAllocation.enabled to false when running streaming applications.**
+1. Spark Streaming uses microbatching: data arrives in batches, and executors run on each batch. If an executor's idle timeout is shorter than the time it takes to process a batch, executors are constantly added and removed. If the idle timeout is longer than the batch duration, executors are never removed. So **we recommend that you disable dynamic allocation by setting spark.dynamicAllocation.enabled to false when running streaming applications.**
 
    Load packages used by the Notebook by entering the following information in a Notebook cell. Run the command by using **CTRL + ENTER**.
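One way to apply the dynamic-allocation recommendation above from a Jupyter notebook is the sparkmagic `%%configure` magic, run before any Spark code so the session is created with the setting already in place. The cell below is a sketch under that assumption; other session settings are left untouched:

```
%%configure -f
{
    "conf": {
        "spark.dynamicAllocation.enabled": "false"
    }
}
```

The `-f` flag forces the current Spark session to be dropped and re-created with the new configuration.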
@@ -274,7 +274,7 @@ This example demonstrates how to use Spark Structured Streaming with Kafka on HD
     println("Schema declared")
     ```
 
-1. Select data and start the stream. The following command demonstrates how to retrieve data from kafka using a batch query, and then write the results out to HDFS on the Spark cluster. In this example, the `select` retrieves the message (value field) from Kafka and applies the schema to it. The data is then written to HDFS (WASB or ADL) in parquet format. Enter the command in your next Jupyter cell.
+1. Select data and start the stream. The following command demonstrates how to retrieve data from Kafka using a batch query, and then write the results out to HDFS on the Spark cluster. In this example, the `select` retrieves the message (value field) from Kafka and applies the schema to it. The data is then written to HDFS (WASB or ADL) in parquet format. Enter the command in your next Jupyter cell.
 
    ```scala
    // Read a batch from Kafka
@@ -313,7 +313,7 @@ This example demonstrates how to use Spark Structured Streaming with Kafka on HD
 
 ## Clean up resources
 
-To clean up the resources created by this tutorial, you can delete the resource group. Deleting the resource group also deletes the associated HDInsight cluster, and any other resources associated with the resource group.
+To clean up the resources created by this tutorial, you can delete the resource group. Deleting the resource group also deletes the associated HDInsight cluster and any other resources associated with the resource group.
 
 To remove the resource group using the Azure portal:
 
@@ -328,7 +328,7 @@ To remove the resource group using the Azure portal:
 
 ## Next steps
 
-In this tutorial, you learned how to use [Apache Spark Structured Streaming](https://spark.apache.org/docs/latest/structured-streaming-programming-guide.html) to write and read data from [Apache Kafka](./kafka/apache-kafka-introduction.md) on HDInsight. Use the following link to learn how to use [Apache Storm](./storm/apache-storm-overview.md) with Kafka.
+In this tutorial, you learned how to use Apache Spark Structured Streaming to write and read data from Apache Kafka on HDInsight. Use the following link to learn how to use Apache Storm with Kafka.
 
 > [!div class="nextstepaction"]
 > [Use Apache Storm with Apache Kafka](hdinsight-apache-storm-with-kafka.md)
