
Commit 6752706

Merge branch 'release-ignite-arcadia' into 20200410_ria_dates
2 parents 29defdd + 266a5ad


42 files changed: +253 -220 lines

articles/synapse-analytics/data-integration/data-integration-sql-pool.md

Lines changed: 1 addition & 1 deletion
@@ -1,6 +1,6 @@
 ---
 title: Ingest into SQL pool in Azure Synapse Analytics
-description: Learn how to ingest data into a SQL analytics pool in Azure Synapse Analytics
+description: Learn how to ingest data into a SQL pool in Azure Synapse Analytics
 services: synapse-analytics
 author: djpmsft
 ms.service: synapse-analytics

articles/synapse-analytics/data-integration/linked-service.md

Lines changed: 1 addition & 1 deletion
@@ -63,6 +63,6 @@ You have now established a secure and private connection between Synapse and you
 
 ## Next steps
 
-For more understanding of Managed private endpoint in Synapse, see the [Concept around Synapse Managed private endpoint](data-integration-data-lake.md) article.
+To develop further understanding of Managed private endpoint in Synapse Analytics, see the [Concept around Synapse Managed private endpoint](data-integration-data-lake.md) article.
 
 For more information on data integration for Synapse Analytics, see the [Ingesting data into a Data Lake](data-integration-data-lake.md) article.

articles/synapse-analytics/monitoring/how-to-monitor-pipeline-runs.md

Lines changed: 1 addition & 6 deletions
@@ -46,9 +46,4 @@ To view details about your pipeline run, select the pipeline run. Then view the
 
 ## Next steps
 
-This article showed you how to monitor pipeline runs in your Azure Synapse workspace. You learned how to:
-
-> [!div class="checklist"]
-> * View the list of pipeline runs in your workspace
-> * Filter the list of pipeline runs to find the pipeline you'd like to monitor
-> * Monitor your selected pipeline run in detail.
+To learn more about monitoring applications, see the [Monitor Apache Spark applications](how-to-monitor-spark-applications.md) article.

articles/synapse-analytics/monitoring/how-to-monitor-spark-applications.md

Lines changed: 1 addition & 7 deletions
@@ -52,10 +52,4 @@ To view the details about one of your Spark applications, select the Spark appli
 
 ## Next steps
 
-This article showed you how to monitor Spark applications in your Azure Synapse workspace. You learned how to:
-
-> [!div class="checklist"]
->
-> * View the list of Spark applications in your workspace
-> * Filter the list of Spark applications to find the Spark applications you'd like to monitor
-> * Monitor your selected Spark application in detail.
+For more information on monitoring pipeline runs, see the [Monitor pipeline runs Azure Synapse Studio](how-to-monitor-pipeline-runs.md) article.

articles/synapse-analytics/security/how-to-grant-workspace-managed-identity-permissions.md

Lines changed: 1 addition & 1 deletion
@@ -114,4 +114,4 @@ You should see your managed identity listed under the **Storage Blob Data Contri
 
 ## Next steps
 
-[Workspace managed identity](./synapse-workspace-managed-identity.md)
+Learn more about [Workspace managed identity](./synapse-workspace-managed-identity.md)

articles/synapse-analytics/spark/apache-spark-concepts.md

Lines changed: 11 additions & 2 deletions
@@ -13,7 +13,9 @@ ms.reviewer: euang
 
 # Apache Spark in Azure Synapse Analytics Core Concepts
 
-Apache Spark is a parallel processing framework that supports in-memory processing to boost the performance of big-data analytic applications. Apache Spark in Azure Synapse Analytics is one of Microsoft's implementations of Apache Spark in the cloud. Azure Synapse makes it easy to create and configure Spark capabilities in Azure. Azure Synapse provides a different implementation of these Spark capabilities that are documented here.
+Apache Spark is a parallel processing framework that supports in-memory processing to boost the performance of big-data analytic applications. Apache Spark in Azure Synapse Analytics is one of Microsoft's implementations of Apache Spark in the cloud.
+
+Azure Synapse makes it easy to create and configure Spark capabilities in Azure. Azure Synapse provides a different implementation of these Spark capabilities that are documented here.
 
 ## Spark pools (preview)
 
@@ -27,7 +29,9 @@ You can read how to create a Spark pool and see all their properties here [Get s
 
 ## Spark instances
 
-Spark instances are created when you connect to a Spark pool, create a session, and run a job. As multiple users may have access to a single Spark pool, a new Spark instance is created for each user that connects. When you submit a second job, then if there is capacity in the pool, the existing Spark instance also has capacity then the existing instance will process the job; if not and there is capacity at the pool level, then a new Spark instance will be created.
+Spark instances are created when you connect to a Spark pool, create a session, and run a job. As multiple users may have access to a single Spark pool, a new Spark instance is created for each user that connects.
+
+When you submit a second job, then if there is capacity in the pool, the existing Spark instance also has capacity then the existing instance will process the job; if not and there is capacity at the pool level, then a new Spark instance will be created.
 
 ## Examples
 
@@ -50,3 +54,8 @@
 - You submit a notebook job, J1 that uses 10 nodes, a Spark instance, SI1 is created to process the job.
 - Another user, U2, submits a Job, J3, that uses 10 nodes, a new Spark instance, SI2, is created to process the job.
 - You now submit another job, J2, that uses 10 nodes because there is still capacity in the pool and the instance, J2, is processed by SI1.
+
+## Next steps
+
+- [Azure Synapse Analytics](https://docs.microsoft.com/azure/synapse-analytics)
+- [Apache Spark Documentation](https://spark.apache.org/docs/2.4.4/)
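
The instance-allocation rules that this hunk splits into two paragraphs are easier to follow as a worked model. The sketch below is a hypothetical Python illustration, not Synapse source code; the pool size, per-instance capacity, and node counts are assumptions chosen to mirror the J1/J2/J3 example above.

```python
# Hypothetical model of the allocation rules in apache-spark-concepts.md.
# Assumptions: a 40-node pool, and each Spark instance may grow to 20 nodes.
from dataclasses import dataclass, field
from typing import List

@dataclass
class SparkInstance:
    owner: str      # a new instance is created per connecting user
    capacity: int   # maximum nodes this instance can use
    used: int = 0   # nodes consumed by its running jobs

@dataclass
class SparkPool:
    max_nodes: int
    instance_size: int
    instances: List[SparkInstance] = field(default_factory=list)

    def submit(self, user: str, nodes: int) -> SparkInstance:
        # Reuse the user's existing instance while it still has capacity.
        for inst in self.instances:
            if inst.owner == user and inst.used + nodes <= inst.capacity:
                inst.used += nodes
                return inst
        # Otherwise create a new instance if the pool itself has capacity.
        reserved = sum(i.capacity for i in self.instances)
        if reserved + self.instance_size <= self.max_nodes:
            inst = SparkInstance(user, self.instance_size, used=nodes)
            self.instances.append(inst)
            return inst
        raise RuntimeError("No capacity left in the pool")

pool = SparkPool(max_nodes=40, instance_size=20)
si1 = pool.submit("you", 10)          # J1: SI1 is created
si2 = pool.submit("U2", 10)           # J3, another user: SI2 is created
assert pool.submit("you", 10) is si1  # J2: SI1 has capacity, so it is reused
```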

articles/synapse-analytics/spark/apache-spark-history-server.md

Lines changed: 2 additions & 2 deletions
@@ -233,5 +233,5 @@ Input/output data using Resilient Distributed Datasets (RDDs) does not show in d
 
 ## Next steps
 
-* [.NET for Apache Spark documentation](https://docs.microsoft.com/dotnet/spark)
-* [Azure Synapse Analytics](https://docs.microsoft.com/azure/synapse-analytics)
+- [.NET for Apache Spark documentation](https://docs.microsoft.com/dotnet/spark)
+- [Azure Synapse Analytics](../overview-what-is.md)

articles/synapse-analytics/spark/apache-spark-job-definitions.md

Lines changed: 4 additions & 1 deletion
@@ -168,4 +168,7 @@ After creating a Spark job definition, you can submit it to a Synapse Spark pool
 
 ## Next steps
 
-This tutorial demonstrates how to use the Azure Synapse Analytics to create Spark job definitions, and then submit them to a Synapse Spark pool. Next you can use the Azure Synapse Analytics to create Power BI datasets and manage Power BI data.
+This tutorial demonstrated how to use the Azure Synapse Analytics to create Spark job definitions, and then submit them to a Synapse Spark pool. Next you can use Azure Synapse Analytics to create Power BI datasets and manage Power BI data.
+
+- [Connect to data in Power BI Desktop](https://docs.microsoft.com/power-bi/desktop-quickstart-connect-to-data)
+- [Visualize with Power BI](/sql-data-warehouse/sql-data-warehouse-get-started-visualize-with-power-bi)

articles/synapse-analytics/spark/apache-spark-notebook-create-spark-use-sql.md

Lines changed: 3 additions & 3 deletions
@@ -129,8 +129,8 @@ To ensure the Spark instance is shut down, end any connected sessions(notebooks)
 In this quickstart, you learned how to create a Synapse Analytics Apache Spark pool and run a basic Spark SQL query.
 
 - [.NET for Apache Spark documentation](https://docs.microsoft.com/dotnet/spark)
-- [Azure Synapse Analytics](https://docs.microsoft.com/azure/synapse-analytics)
+- [Azure Synapse Analytics](../overview-what-is.md)
 - [Apache Spark official documentation](https://spark.apache.org/docs/latest/)
 
-> [!NOTE]
-> Some of the official Apache Spark documentation relies on using the spark console, this is not available on Azure Synapse Spark, use the notebook or IntelliJ experiences instead
+>[!NOTE]
+> Some of the official Apache Spark documentation relies on using the Spark console, which is not available on Azure Synapse Spark. Use the [notebook](../spark/apache-spark-notebook-create-spark-use-sql.md) or [IntelliJ](../spark/intellij-tool-synapse.md) experiences instead.
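
For orientation, the "basic Spark SQL query" this quickstart refers to looks roughly like the notebook cell below. This is an invented illustration, not the quickstart's own dataset; `spark` is the SparkSession a Synapse notebook provides.

```python
# Illustrative notebook cell; the data here is made up for the example.
df = spark.createDataFrame(
    [("NYC", 42), ("Seattle", 17), ("London", 8)],
    ["city", "orders"],
)
df.createOrReplaceTempView("demo")

# A basic Spark SQL query against the temporary view.
spark.sql("SELECT city, orders FROM demo ORDER BY orders DESC").show()
```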

articles/synapse-analytics/spark/apache-spark-performance.md

Lines changed: 9 additions & 5 deletions
@@ -153,22 +153,26 @@ When running concurrent queries, consider the following:
 * Distribute queries across parallel applications.
 * Modify size based both on trial runs and on the preceding factors such as GC overhead.
 
-Monitor your query performance for outliers or other performance issues, by looking at the timeline view, SQL graph, job statistics, and so forth. Sometimes one or a few of the executors are slower than the others, and tasks take much longer to execute. This frequently happens on larger clusters (> 30 nodes). In this case, divide the work into a larger number of tasks so the scheduler can compensate for slow tasks. For example, have at least twice as many tasks as the number of executor cores in the application. You can also enable speculative execution of tasks with `conf: spark.speculation = true`.
+Monitor your query performance for outliers or other performance issues, by looking at the timeline view, SQL graph, job statistics, and so forth. Sometimes one or a few of the executors are slower than the others, and tasks take much longer to execute. This frequently happens on larger clusters (> 30 nodes). In this case, divide the work into a larger number of tasks so the scheduler can compensate for slow tasks.
+
+For example, have at least twice as many tasks as the number of executor cores in the application. You can also enable speculative execution of tasks with `conf: spark.speculation = true`.
 
 ## Optimize job execution
 
 * Cache as necessary, for example if you use the data twice, then cache it.
 * Broadcast variables to all executors. The variables are only serialized once, resulting in faster lookups.
 * Use the thread pool on the driver, which results in faster operation for many tasks.
 
-Key to Spark 2.x query performance is the Tungsten engine, which depends on whole-stage code generation. In some cases, whole-stage code generation may be disabled. For example, if you use a non-mutable type (`string`) in the aggregation expression, `SortAggregate` appears instead of `HashAggregate`. For example, for better performance, try the following and then re-enable code generation:
+Key to Spark 2.x query performance is the Tungsten engine, which depends on whole-stage code generation. In some cases, whole-stage code generation may be disabled.
+
+For example, if you use a non-mutable type (`string`) in the aggregation expression, `SortAggregate` appears instead of `HashAggregate`. For example, for better performance, try the following and then re-enable code generation:
 
 ```sql
 MAX(AMOUNT) -> MAX(cast(AMOUNT as DOUBLE))
 ```
 
 ## Next steps
 
-* [Tuning Apache Spark](https://spark.apache.org/docs/latest/tuning.html)
-* [How to Actually Tune Your Apache Spark Jobs So They Work](https://www.slideshare.net/ilganeli/how-to-actually-tune-your-spark-jobs-so-they-work)
-* [Kryo Serialization](https://github.com/EsotericSoftware/kryo)
+- [Tuning Apache Spark](https://spark.apache.org/docs/latest/tuning.html)
+- [How to Actually Tune Your Apache Spark Jobs So They Work](https://www.slideshare.net/ilganeli/how-to-actually-tune-your-spark-jobs-so-they-work)
+- [Kryo Serialization](https://github.com/EsotericSoftware/kryo)
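
Taken together, the tuning advice in this hunk (speculative execution, at least twice as many tasks as executor cores, caching reused data, broadcasting small lookups, and casting string amounts so `HashAggregate` is used) can be sketched in PySpark. This is a minimal, assumption-laden illustration, not code from the article: the storage path, the `AMOUNT` column, and the lookup table are placeholders.

```python
# A PySpark sketch of the tuning advice above; paths and names are assumed.
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = (
    SparkSession.builder
    .appName("tuning-sketch")
    # Re-run straggler tasks speculatively on other executors.
    .config("spark.speculation", "true")
    .getOrCreate()
)

df = spark.read.parquet("abfss://data@account.dfs.core.windows.net/events")

# Aim for at least twice as many tasks as executor cores.
df = df.repartition(2 * spark.sparkContext.defaultParallelism)

# Cache data that is used more than once.
df.cache()

# Broadcast small lookup data so it is serialized to executors only once.
rates = spark.sparkContext.broadcast({"USD": 1.0, "EUR": 1.08})

# Cast the string AMOUNT to a numeric type so the plan uses HashAggregate
# (whole-stage code generation) instead of SortAggregate.
df.agg(F.max(F.col("AMOUNT").cast("double"))).show()
```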
