articles/synapse-analytics/spark/spark-dotnet.md
Lines changed: 38 additions & 26 deletions
@@ -5,50 +5,62 @@ author: mamccrea
 services: synapse-analytics
 ms.service: synapse-analytics
 ms.topic: conceptual
-ms.date: 10/21/2019
+ms.date: 04/10/2020
 ms.author: mamccrea
 ms.reviewer: jrasnick
 ---

-<!--# Use .NET for Apache Spark with Azure Synapse Analytics
+# Use .NET for Apache Spark with Azure Synapse Analytics

-Azure Synapse Analytics uses Spark pools (preview) for data processing. Apache Spark is a general-purpose distributed processing engine for analytics over large data sets - typically terabytes or petabytes of data. You can use Apache Spark for several popular big data scenarios, including:
+[.NET for Apache Spark](https://dot.net/spark) is free, open-source, and cross-platform .NET support for Spark. .NET for Apache Spark provides .NET bindings for Spark which allow you to access Spark APIs through C# and F#. With .NET for Apache Spark, you have the ability to write and execute user-defined functions for Spark using .NET. The .NET APIs for Spark enable you to access all aspects of Spark that help you analyze your data, including Spark SQL and Structured Streaming.

-* Batch processing
-* Machine Learning
-* Impromptu querying -->
+You can analyze data with .NET for Apache Spark through Spark batch job definitions or with interactive Azure Synapse Analytics notebooks. In this article, you learn how to use .NET for Apache Spark with Azure Synapse using both techniques.

-#What is .NET for Apache Spark?
+## Submit batch jobs using the Spark job definition

-[.NET for Apache Spark](https://dot.net/spark) provides free, open-source, and cross-platform .NET support for Spark. .NET for Apache Spark provides .NET bindings for Spark that allow you to access Spark APIs through C# and F# and gives you the ability to write and execute user-defined functions for Spark using .NET.
+Visit the tutorial to learn how to use Azure Synapse Analytics to [create Apache Spark job definitions for Synapse Spark pools](apache-spark-job-definitions.md). If you have not packaged your app to submit to Azure Synapse, complete the following steps.

-The .NET APIs for Spark enable you to access all aspects of Spark that help you analyze your data, including Spark SQL and Structured Streaming.
+1. Run the following commands to publish your app. Be sure to replace *mySparkApp* with the path to your app.

-## .NET for Apache Spark in Azure Synapse Analytics
+**On Windows:**

-You can analyze your data using .NET for Apache Spark through either Spark batch job definitions or with interactive Azure Synapse Analytics notebooks.
-When creating a new notebook, you choose a language kernel that you wish to express your business logic. There is kernel support for several languages, including C#.
+2. Zip your published app files so that you can upload them to Azure Synapse.

-To use .NET for Apache Spark in your Azure Synapse Analytics notebook, select **.NET Spark (C#)** as your kernel and attach the notebook to an existing Spark pool.
+**On Windows:**
+
+Navigate to *mySparkApp/bin/Release/netcoreapp3.0/ubuntu.16.04-x64*. Then, right-click the **Publish** folder and select **Send to > Compressed (zipped) folder**. Name the new folder **publish.zip**.
+
+**On Linux, run the following command:**

-The .NET Spark notebook is based on the .NET interactive experiences and provides interactive C# experiences with the ability to use .NET for Spark out of the box (with the Spark session variable `spark` already predefined). For more details on the available notebook capabilities [see below](#sparknet-c-kernel-features).
+```bash
+zip -r publish.zip .
+```

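The publish and zip steps added above can be sketched as one small script. This is a minimal sketch, not the article's own commands (those lines are collapsed in this diff view). The helper names `publish_dir`, `publish_app`, and `zip_publish` are hypothetical, and the `dotnet publish` options are an assumption inferred from the *mySparkApp/bin/Release/netcoreapp3.0/ubuntu.16.04-x64* path in the Windows instructions.

```bash
#!/usr/bin/env bash
# Sketch only: these function names are hypothetical helpers, not part of any CLI.

# Print the publish output directory for an app.
# $1 = path to the app, $2 = target framework, $3 = runtime identifier.
# The defaults mirror the path mentioned in the article.
publish_dir() {
  local app_path="${1%/}"
  local framework="${2:-netcoreapp3.0}"
  local runtime="${3:-ubuntu.16.04-x64}"
  printf '%s/bin/Release/%s/%s/publish\n' "$app_path" "$framework" "$runtime"
}

# Publish the app in Release configuration (assumed options; adjust as needed).
publish_app() {
  local app_path="$1"
  local framework="${2:-netcoreapp3.0}"
  local runtime="${3:-ubuntu.16.04-x64}"
  (cd "$app_path" && dotnet publish -c Release -f "$framework" -r "$runtime")
}

# Zip the contents of the publish directory into publish.zip inside it.
zip_publish() {
  local dir
  dir="$(publish_dir "$@")"
  if [ ! -d "$dir" ]; then
    echo "publish directory not found: $dir" >&2
    return 1
  fi
  (cd "$dir" && zip -r publish.zip .)
}
```

With the functions loaded, `publish_app mySparkApp && zip_publish mySparkApp` would publish the app and then create `publish.zip` in the publish folder, assuming the `dotnet` and `zip` tools are installed.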
-## .NET for Apache Spark scenarios
+## .NET for Apache Spark in Azure Synapse Analytics notebooks

-Notebooks are a great option for prototyping your .NET for Apache Spark pipelines and scenarios. You can start working with, understanding, filtering, displaying, and visualizing your data quickly and efficiently. Data engineers, data scientists, business analysts, and machine learning engineers are all able to collaborate over a shared, highly interactive document. You see immediate results from data exploration, and can visualize your data in the same notebook.
+Notebooks are a great option for prototyping your .NET for Apache Spark pipelines and scenarios. You can start working with, understanding, filtering, displaying, and visualizing your data quickly and efficiently. Data engineers, data scientists, business analysts, and machine learning engineers are all able to collaborate over a shared, interactive document. You see immediate results from data exploration, and can visualize your data in the same notebook.

-Azure Synapse Analytics notebooks provide a smooth tooling experience with minimal setup, and allow for quick prototyping of big data queries in C# as you learn and practice solving your problems with Apache Spark.
+### How to use notebooks
+
+When you create a new notebook, you choose the language kernel in which you want to express your business logic. Kernels are available for several languages, including C#.
+
+To use .NET for Apache Spark in your Azure Synapse Analytics notebook, select **.NET Spark (C#)** as your kernel and attach the notebook to an existing Spark pool.

-You can also develop a complete big data experience, such as reading in data, transforming it, and then exploring it through printed text or visualizing it through a plot or chart.
+The .NET Spark notebook is based on the .NET interactive experiences and provides an interactive C# experience in which you can use .NET for Spark out of the box, with the Spark session variable `spark` already predefined.

-## Spark.NET C# kernel features
+### Spark.NET C# kernel features

 The following features are available when you use .NET for Apache Spark in the Azure Synapse Analytics notebook:

@@ -64,6 +76,6 @@ The following features are available when you use .NET for Apache Spark in the A

 ## Next steps

--[.NET for Apache Spark documentation](https://docs.microsoft.com/dotnet/spark)