Commit b670059

Merge branch 'release-ignite-arcadia' of https://github.com/MicrosoftDocs/azure-docs-pr into release-ignite-arcadia
2 parents e80a9cc + 3ac8da4 commit b670059

16 files changed: +82 −94 lines

articles/synapse-analytics/security/synapse-workspace-managed-identity.md

Lines changed: 1 addition & 1 deletion
@@ -17,7 +17,7 @@ In this article, you'll learn about managed identity in Azure Synapse workspace.
 
 Managed identity for Azure resources is a feature of Azure Active Directory. The feature provides Azure services with an automatically managed identity in Azure AD. You can use the Managed Identity capability to authenticate to any service that support Azure AD authentication.
 
-Managed identities for Azure resources are the new name for the service formerly known as Managed Service Identity (MSI). See [Managed Identities](https://docs.microsoft.com/azure/active-directory/managed-identities-azure-resources/overview) to learn more.
+Managed identities for Azure resources are the new name for the service formerly known as Managed Service Identity (MSI). See [Managed Identities](../../active-directory/managed-identities-azure-resources/overview.md?toc=/azure/synapse-analytics/toc.json&bc=/azure/synapse-analytics/breadcrumb/toc.json) to learn more.
 
 ## Azure Synapse workspace managed identity
 
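Every link fix in this commit follows the same pattern: the absolute `https://docs.microsoft.com/...` URL is replaced with a relative (or site-root-relative) path, and `toc`/`bc` query parameters are appended. A before/after sketch of the pattern, using paths taken from the diff above (the stated purpose of the parameters — keeping the target page under the Synapse Analytics table of contents and breadcrumb — is an inference from their names, not something the diff itself states):

```markdown
<!-- Before: absolute URL -->
[Managed Identities](https://docs.microsoft.com/azure/active-directory/managed-identities-azure-resources/overview)

<!-- After: relative .md path plus toc/bc query parameters -->
[Managed Identities](../../active-directory/managed-identities-azure-resources/overview.md?toc=/azure/synapse-analytics/toc.json&bc=/azure/synapse-analytics/breadcrumb/toc.json)
```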

articles/synapse-analytics/spark/apache-spark-development-using-notebooks.md

Lines changed: 1 addition & 1 deletion
@@ -387,5 +387,5 @@ Using the following keystroke shortcuts, you can more easily navigate and run co
 
 ## Next steps
 
-- [.NET for Apache Spark documentation](https://docs.microsoft.com/dotnet/spark)
+- [.NET for Apache Spark documentation](/dotnet/spark?toc=/azure/synapse-analytics/toc.json&bc=/azure/synapse-analytics/breadcrumb/toc.json)
 - [Azure Synapse Analytics](https://docs.microsoft.com/azure/synapse-analytics)

articles/synapse-analytics/spark/apache-spark-history-server.md

Lines changed: 3 additions & 2 deletions
@@ -233,5 +233,6 @@ Input/output data using Resilient Distributed Datasets (RDDs) does not show in d
 
 ## Next steps
 
-- [.NET for Apache Spark documentation](https://docs.microsoft.com/dotnet/spark)
-- [Azure Synapse Analytics](../overview-what-is.md)
+- [Azure Synapse Analytics](../overview-what-is.md)
+- [.NET for Apache Spark documentation](/dotnet/spark?toc=/azure/synapse-analytics/toc.json&bc=/azure/synapse-analytics/breadcrumb/toc.json)
+

articles/synapse-analytics/spark/apache-spark-job-definitions.md

Lines changed: 15 additions & 17 deletions
@@ -18,29 +18,31 @@ This tutorial demonstrates how to use the Azure Synapse Analytics to create Spar
 * View job details after submission.
 
 In this tutorial, you learn how to:
+
 > [!div class="checklist"]
+>
 > * Develop and submit a Spark job definition on a Synapse Spark pool.
 > * View job details after submission.
 
 ## Prerequisites
 
-* An Azure Synapse Analytics workspace. For instructions, see [Create an Azure Synapse Analytics workspace](https://docs.microsoft.com/azure/machine-learning/how-to-manage-workspace#create-a-workspace).
+* An Azure Synapse Analytics workspace. For instructions, see [Create an Azure Synapse Analytics workspace](../../machine-learning/how-to-manage-workspace.md?toc=/azure/synapse-analytics/toc.json&bc=/azure/synapse-analytics/breadcrumb/toc.json#create-a-workspace).
 
 ## Get started
 
-Before submitting a Spark job definition, you need to be the Storage Blob Data Owner of the ADLS Gen2 filesystem you want to work with. If you aren't, you need to add the permission manually.
+Before submitting a Spark job definition, you need to be the Storage Blob Data Owner of the ADLS Gen2 filesystem you want to work with. If you aren't, you need to add the permission manually.
 
 ### Scenario 1: Add permission
 
-1. Open [Microsoft Azure](https://ms.portal.azure.com), then open Storage account.
+1. Open [Microsoft Azure](https://ms.portal.azure.com), then open Storage account.
 
 2. Click **Containers**, then create a **File system**. This tutorial uses `sparkjob`.
 
 ![Click submit button to submit spark job definition](./media/apache-spark-job-definitions/open-azure-container.png)
 
-![The Spark Submission dialog box](./media/apache-spark-job-definitions/create-new-filesystem.png)
+![The Spark Submission dialog box](./media/apache-spark-job-definitions/create-new-filesystem.png)
 
-3. Open `sparkjob`, click **Access Control(IAM)**, then click **Add** and select **Add role assignment**.
+3. Open `sparkjob`, click **Access Control(IAM)**, then click **Add** and select **Add role assignment**.
 
 ![Click submit button to submit spark job definition](./media/apache-spark-job-definitions/add-role-assignment-01.png)
 
@@ -50,7 +52,6 @@ Before submitting a Spark job definition, you need to be the Storage Blob Data O
 
 ![Click submit button to submit spark job definition](./media/apache-spark-job-definitions/verify-user-role.png)
 
-
 ### Scenario 2: Prepare folder structure
 
 Before submitting a Spark job definition, one job you need to do is uploading files to ADLS Gen2 and preparing folder structure there. We use Storage node in Synapse Studio to store files.
@@ -86,7 +87,7 @@ Before submitting a Spark job definition, one job you need to do is uploading fi
 |Main class name| The fully qualified identifier or the main class that is in the main definition file.|
 |Command-line arguments| Optional arguments to the job.|
 |Reference files| Additional files used for reference in the main definition file. You can select **Upload file** to upload the file to a storage account.|
-|Spark pool| The job will be submitted to the selected Spark pool.|
+|Spark pool| The job will be submitted to the selected Spark pool.|
 |Spark version| Version of Spark that the Spark pool is running.|
 |Executors| Number of executors to be given in the specified Spark pool for the job.|
 |Executor size| Number of cores and memory to be used for executors given in the specified Spark pool for the job.|
@@ -102,7 +103,7 @@ Before submitting a Spark job definition, one job you need to do is uploading fi
 |Main definition file| The main file used for the job. Select a PY file from your storage. You can select **Upload file** to upload the file to a storage account.|
 |Command-line arguments| Optional arguments to the job.|
 |Reference files| Additional files used for reference in the main definition file. You can select **Upload file** to upload the file to a storage account.|
-|Spark pool| The job will be submitted to the selected Spark pool.|
+|Spark pool| The job will be submitted to the selected Spark pool.|
 |Spark version| Version of Spark that the Spark pool is running.|
 |Executors| Number of executors to be given in the specified Spark pool for the job.|
 |Executor size| Number of cores and memory to be used for executors given in the specified Spark pool for the job.|
@@ -119,7 +120,7 @@ Before submitting a Spark job definition, one job you need to do is uploading fi
 |Main executable file| The main executable file in the main definition ZIP file.|
 |Command-line arguments| Optional arguments to the job.|
 |Reference files| Additional files needed by the worker nodes for executing the .NET for Spark application that isn't included in the main definition ZIP file(that is, dependent jars, additional user-defined function DLLs, and other config files). You can select **Upload file** to upload the file to a storage account.|
-|Spark pool| The job will be submitted to the selected Spark pool.|
+|Spark pool| The job will be submitted to the selected Spark pool.|
 |Spark version| Version of Spark that the Spark pool is running.|
 |Executors| Number of executors to be given in the specified Spark pool for the job.|
 |Executor size| Number of cores and memory to be used for executors given in the specified Spark pool for the job.|
@@ -135,18 +136,17 @@ Before submitting a Spark job definition, one job you need to do is uploading fi
 
 After creating a Spark job definition, you can submit it to a Synapse Spark pool. Make sure you've gone through steps in **Get-started** before trying samples in this part.
 
-
-### Scenario 1: Submit Spark job definition
+### Scenario 1: Submit Spark job definition
 
 1. Open a spark job definition window by clicking it.
 
-![Open spark job definition to submit ](./media/apache-spark-job-definitions/open-spark-definition.png)
+![Open spark job definition to submit ](./media/apache-spark-job-definitions/open-spark-definition.png)
 
 2. Click **submit** icon to submit your project to the selected Spark Pool. You can click **Spark monitoring URL** tab to see the LogQuery of the Spark application.
 
 ![Click submit button to submit spark job definition](./media/apache-spark-job-definitions/submit-spark-definition.png)
 
-![The Spark Submission dialog box](./media/apache-spark-job-definitions/submit-definition-result.png)
+![The Spark Submission dialog box](./media/apache-spark-job-definitions/submit-definition-result.png)
 
 ### Scenario 2: View Spark job running progress
 
@@ -160,15 +160,13 @@ After creating a Spark job definition, you can submit it to a Synapse Spark pool
 
 ### Scenario 3: Check output file
 
-1. Click **Data**, then select **Storage accounts**. After a successful run, you can go to the ADLS Gen2 storage and check outputs are generated.
+1. Click **Data**, then select **Storage accounts**. After a successful run, you can go to the ADLS Gen2 storage and check outputs are generated.
 
 ![View output file](./media/apache-spark-job-definitions/view-output-file.png)
 
-
-
 ## Next steps
 
 This tutorial demonstrated how to use the Azure Synapse Analytics to create Spark job definitions, and then submit them to a Synapse Spark pool. Next you can use Azure Synapse Analytics to create Power BI datasets and manage Power BI data.
 
 - [Connect to data in Power BI Desktop](https://docs.microsoft.com/power-bi/desktop-quickstart-connect-to-data)
-- [Visualize with Power BI](/sql-data-warehouse/sql-data-warehouse-get-started-visualize-with-power-bi)
+- [Visualize with Power BI](../sql-data-warehouse/sql-data-warehouse-get-started-visualize-with-power-bi.md?toc=/azure/synapse-analytics/toc.json&bc=/azure/synapse-analytics/breadcrumb/toc.json)

articles/synapse-analytics/spark/apache-spark-machine-learning-mllib-notebook.md

Lines changed: 2 additions & 2 deletions
@@ -291,9 +291,9 @@ After you have finished running the application, shut down the notebook to relea
 
 ## Next steps
 
-- [.NET for Apache Spark documentation](https://docs.microsoft.com/dotnet/spark)
+- [.NET for Apache Spark documentation](/dotnet/spark?toc=/azure/synapse-analytics/toc.json&bc=/azure/synapse-analytics/breadcrumb/toc.json)
 - [Azure Synapse Analytics](https://docs.microsoft.com/azure/synapse-analytics)
 - [Apache Spark official documentation](https://spark.apache.org/docs/latest/)
 
 >[!NOTE]
-> Some of the official Apache Spark documentation relies on using the Spark console, which is not available on Azure Synapse Spark. Use the [notebook](../spark/apache-spark-notebook-create-spark-use-sql.md) or [IntelliJ](../spark/intellij-tool-synapse.md) experiences instead.
+> Some of the official Apache Spark documentation relies on using the Spark console, which is not available on Azure Synapse Spark. Use the [notebook](../spark/apache-spark-notebook-create-spark-use-sql.md?toc=/azure/synapse-analytics/toc.json&bc=/azure/synapse-analytics/breadcrumb/toc.json) or [IntelliJ](../spark/intellij-tool-synapse.md?toc=/azure/synapse-analytics/toc.json&bc=/azure/synapse-analytics/breadcrumb/toc.json) experiences instead.

articles/synapse-analytics/spark/apache-spark-notebook-create-spark-use-sql.md

Lines changed: 1 addition & 1 deletion
@@ -128,8 +128,8 @@ To ensure the Spark instance is shut down, end any connected sessions(notebooks)
 
 In this quickstart, you learned how to create a Synapse Analytics Apache Spark pool and run a basic Spark SQL query.
 
-- [.NET for Apache Spark documentation](https://docs.microsoft.com/dotnet/spark)
 - [Azure Synapse Analytics](../overview-what-is.md)
+- [.NET for Apache Spark documentation](/dotnet/spark?toc=/azure/synapse-analytics/toc.json&bc=/azure/synapse-analytics/breadcrumb/toc.json)
 - [Apache Spark official documentation](https://spark.apache.org/docs/latest/)
 
 >[!NOTE]

articles/synapse-analytics/spark/apache-spark-what-is-delta-lake.md

Lines changed: 1 addition & 1 deletion
@@ -36,5 +36,5 @@ For more information, see [Delta Lake Project](https://lfprojects.org).
 
 ## Next steps
 
-- [.NET for Apache Spark documentation](https://docs.microsoft.com/dotnet/spark)
+- [.NET for Apache Spark documentation](/dotnet/spark?toc=/azure/synapse-analytics/toc.json&bc=/azure/synapse-analytics/breadcrumb/toc.json)
 - [Azure Synapse Analytics](https://docs.microsoft.com/azure/synapse-analytics)

articles/synapse-analytics/spark/intellij-tool-synapse.md

Lines changed: 3 additions & 3 deletions
@@ -28,9 +28,9 @@ In this tutorial, you learn how to:
 ## Prerequisites
 
 - [IntelliJ IDEA Community Version](https://www.jetbrains.com/idea/download/download-thanks.html?platform=windows&code=IIC).
-- Azure toolkit plugin 3.27.0-2019.2 – Install from [IntelliJ Plugin repository](https://docs.microsoft.com/java/azure/intellij/azure-toolkit-for-intellij-installation?view=azure-java-stable)
+- Azure toolkit plugin 3.27.0-2019.2 – Install from [IntelliJ Plugin repository](/java/azure/intellij/azure-toolkit-for-intellij-installation?toc=/azure/synapse-analytics/toc.json&bc=/azure/synapse-analytics/breadcrumb/toc.json)
 - [JDK (Version 1.8)](https://www.oracle.com/technetwork/java/javase/downloads/jdk8-downloads-2133151.html).
-- Scala Plugin – Install from [IntelliJ Plugin repository](https://docs.microsoft.com/azure/hdinsight/spark/apache-spark-intellij-tool-plugin#install-scala-plugin-for-intellij-idea).
+- Scala Plugin – Install from [IntelliJ Plugin repository](/hdinsight/spark/apache-spark-intellij-tool-plugin#install-scala-plugin-for-intellij-idea.md?toc=/azure/synapse-analytics/toc.json&bc=/azure/synapse-analytics/breadcrumb/toc.json).
 - This prerequisite is only for Windows users.
 
 While you're running the local Spark Scala application on a Windows computer, you might get an exception, as explained in [SPARK-2356](https://issues.apache.org/jira/browse/SPARK-2356). The exception occurs because WinUtils.exe is missing on Windows.
@@ -138,7 +138,7 @@ After creating a Scala application, you can remotely run it.
 |Main class name|The default value is the main class from the selected file. You can change the class by selecting the ellipsis(**...**) and choosing another class.|
 |Job configurations|You can change the default key and values. For more information, see [Apache Livy REST API](https://livy.incubator.apache.org./docs/latest/rest-api.html).|
 |Command line arguments|You can enter arguments separated by space for the main class if needed.|
-|Referenced Jars and Referenced Files|You can enter the paths for the referenced Jars and files if any. You can also browse files in the Azure virtual file system, which currently only supports ADLS Gen 2 cluster. For more information: [Apache Spark Configuration](https://spark.apache.org/docs/latest/configuration.html#runtime-environment) and [How to upload resources to cluster](https://docs.microsoft.com/azure/storage/blobs/storage-quickstart-blobs-storage-explorer).|
+|Referenced Jars and Referenced Files|You can enter the paths for the referenced Jars and files if any. You can also browse files in the Azure virtual file system, which currently only supports ADLS Gen 2 cluster. For more information: [Apache Spark Configuration](https://spark.apache.org/docs/latest/configuration.html#runtime-environment) and [How to upload resources to cluster](../../storage/blobs/storage-quickstart-blobs-storage-explorer.md?toc=/azure/synapse-analytics/toc.json&bc=/azure/synapse-analytics/breadcrumb/toc.json).|
 |Job Upload Storage|Expand to reveal additional options.|
 |Storage Type|Select **Use Azure Blob to upload** from the drop-down list.|
 |Storage Account|Enter your storage account.|

articles/synapse-analytics/spark/spark-dotnet.md

Lines changed: 6 additions & 13 deletions
@@ -32,18 +32,11 @@ Visit the tutorial to learn how to use Azure Synapse Analytics to [create Apache
 
 **On Linux:**
 
-```bash
-cd mySparkApp
-foo@bar:~/path/to/app$ dotnet publish -c Release -f netcoreapp3.0 -r ubuntu.16.04-x64
-```
-
-2. Do the following tasks to zip your published app files so that you can easily upload them to Azure Synapse.
-
-**On Windows:**
+### .NET for Apache Spark in Azure Synapse Analytics notebooks
 
-Navigate to *mySparkApp/bin/Release/netcoreapp3.0/ubuntu.16.04-x64*. Then, right-click on **Publish** folder and select **Send to > Compressed (zipped) folder**. Name the new folder **publish.zip**.
+When creating a new notebook, you choose a language kernel that you wish to express your business logic. There is kernel support for several languages, including C#.
 
-**On Linux, run the following command:**
+To use .NET for Apache Spark in your Azure Synapse Analytics notebook, select **.NET Spark (C#)** as your kernel and attach the notebook to an existing Spark pool.
 
 ```bash
 zip -r publish.zip
@@ -69,14 +62,14 @@ The following features are available when you use .NET for Apache Spark in the A
 * Simple C# statements (such as assignments, printing to console, throwing exceptions, and so on).
 * Multi-line C# code blocks (such as if statements, foreach loops, class definitions, and so on).
 * Access to the standard C# library (such as System, LINQ, Enumerables, and so on).
-* Support for [C# 8.0 language features](https://docs.microsoft.com/dotnet/csharp/whats-new/csharp-8).
+* Support for [C# 8.0 language features](/dotnet/csharp/whats-new/csharp-8?toc=/azure/synapse-analytics/toc.json&bc=/azure/synapse-analytics/breadcrumb/toc.json).
 * 'spark' as a pre-defined variable to give you access to your Apache Spark session.
 * Support for defining [.NET user-defined functions that can run within Apache Spark](https://github.com/dotnet/spark/blob/master/examples/Microsoft.Spark.CSharp.Examples/Sql).
-* Support for visualizing output from your Spark jobs using different charts (such as line, bar, or histogram) and layouts (such as single, overlaid, and so on) using the `XPlot.Plotly` library.
+* Support for visualizing output from your Spark jobs using different charts (such as line, bar, or histogram) and layouts (such as single, overlaid, and so on) using the `XPlot.Plotly` library.
 * Ability to include NuGet packages into your C# notebook.
 
 ## Next steps
 
-* [.NET for Apache Spark documentation](https://docs.microsoft.com/dotnet/spark)
+* [.NET for Apache Spark documentation](/dotnet/spark?toc=/azure/synapse-analytics/toc.json&bc=/azure/synapse-analytics/breadcrumb/toc.json)
 * [Azure Synapse Analytics](https://docs.microsoft.com/azure/synapse-analytics)
 * [.NET Interactive](https://devblogs.microsoft.com/dotnet/creating-interactive-net-documentation/)
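The removed packaging steps in this file quoted a truncated context line, `zip -r publish.zip`, which as written lacks the files-to-archive argument. A hedged sketch of the complete publish-and-zip flow under the removed text's assumptions (the `mySparkApp` project name, .NET Core 3.0, and the `ubuntu.16.04-x64` runtime; the `.` source argument is an addition here, and the commands require the .NET Core SDK and `zip` to run):

```bash
# Publish the .NET for Apache Spark app for Linux (flags from the removed steps).
cd mySparkApp
dotnet publish -c Release -f netcoreapp3.0 -r ubuntu.16.04-x64

# Zip the publish output for upload to Azure Synapse. Note that `zip -r` needs
# a source argument (here `.`), which the truncated context line omits.
cd bin/Release/netcoreapp3.0/ubuntu.16.04-x64/publish
zip -r publish.zip .
```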

articles/synapse-analytics/sql-data-warehouse/sql-data-warehouse-continuous-integration-and-deployment.md

Lines changed: 1 addition & 1 deletion
@@ -40,7 +40,7 @@ At this point, you have a simple environment where any check-in to your source c
 
 ## Continuous deployment with the Azure SQL Data Warehouse (or Database) deployment task
 
-1. Add a new task using the [Azure SQL Database deployment task](https://docs.microsoft.com/azure/devops/pipelines/targets/azure-sqldb?toc=/azure/synapse-analytics/sql-data-warehouse/toc.json&bc=/azure/synapse-analytics/sql-data-warehouse/breadcrumb/toc.json) and fill in the required fields to connect to your target data warehouse. When this task runs, the DACPAC generated from the previous build process is deployed to the target data warehouse. You can also use the [Azure SQL Data Warehouse deployment task](https://marketplace.visualstudio.com/items?itemName=ms-sql-dw.SQLDWDeployment).
+1. Add a new task using the [Azure SQL Database deployment task](/devops/pipelines/targets/azure-sqldb?toc=/azure/synapse-analytics/sql-data-warehouse/toc.json&bc=/azure/synapse-analytics/sql-data-warehouse/breadcrumb/toc.json) and fill in the required fields to connect to your target data warehouse. When this task runs, the DACPAC generated from the previous build process is deployed to the target data warehouse. You can also use the [Azure SQL Data Warehouse deployment task](https://marketplace.visualstudio.com/items?itemName=ms-sql-dw.SQLDWDeployment).
 
 ![Deployment Task](./media/sql-data-warehouse-continuous-integration-and-deployment/4-deployment-task.png "Deployment Task")
 
