articles/synapse-analytics/spark/spark-dotnet.md
Visit the tutorial to learn how to use Azure Synapse Analytics to [create Apache Spark job definitions for Synapse Spark pools](apache-spark-job-definitions.md). If you haven't packaged your app to submit to Azure Synapse, complete the following steps.
1. Configure your `dotnet` application dependencies for compatibility with Synapse Spark.
The required .NET Spark version is noted in the Synapse Studio interface, in your Apache Spark pool configuration under the Manage toolbox.
:::image type="content" source="./media/apache-spark-job-definitions/net-spark-workspace-compatibility.png" alt-text="Screenshot that shows properties, including the .NET Spark version.":::
Create your project as a .NET console application that outputs an Ubuntu x86 executable.
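2. Run the commands to publish your app, replacing *mySparkApp* with the path to your app. The exact commands are not shown in this excerpt; a typical invocation looks like the following, where the target framework and runtime identifiers are assumptions — match them to the .NET Spark version your pool requires.

```shell
cd mySparkApp
# Self-contained publish targeting the Ubuntu x86-64 worker nodes.
# netcoreapp3.1 and ubuntu.16.04-x64 are assumptions; check the
# .NET Spark version required by your Apache Spark pool.
dotnet publish -c Release -f netcoreapp3.1 -r ubuntu.16.04-x64
```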
3. Zip the contents of the publish folder, `publish.zip` for example, that was created by publishing your app. All the assemblies should be in the root of the ZIP file, with no intermediate folder layer. This means that when you unzip `publish.zip`, all assemblies are extracted into your current working directory.
**On Windows:**
Using Windows PowerShell or PowerShell 7, create a .zip from the contents of your publish directory.
```PowerShell
Compress-Archive publish/* publish.zip -Update
```
**On Linux:**
Notebooks are a great option for prototyping your .NET for Apache Spark pipelines and scenarios. You can start working with, understanding, filtering, displaying, and visualizing your data quickly and efficiently.
Data engineers, data scientists, business analysts, and machine learning engineers are all able to collaborate over a shared, interactive document. You see immediate results from data exploration, and can visualize your data in the same notebook.
### How to use .NET for Apache Spark notebooks
The following features are available when you use .NET for Apache Spark in the Azure Synapse Analytics notebook:
* Support for defining [.NET user-defined functions that can run within Apache Spark](/dotnet/spark/how-to-guides/udf-guide). We recommend [Write and call UDFs in .NET for Apache Spark Interactive environments](/dotnet/spark/how-to-guides/dotnet-interactive-udf-issue) for learning how to use UDFs in .NET for Apache Spark Interactive experiences.
* Support for visualizing output from your Spark jobs using different charts (such as line, bar, or histogram) and layouts (such as single, overlaid, and so on) using the `XPlot.Plotly` library.
* Ability to include NuGet packages into your C# notebook.
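As a sketch of the UDF support mentioned above — assuming the `Microsoft.Spark` NuGet package, and that `_1` is the default column name for a single-column DataFrame created from primitives:

```csharp
using System;
using Microsoft.Spark.Sql;
using static Microsoft.Spark.Sql.Functions;

// Minimal sketch: define a UDF that upper-cases a string column
// and apply it to a DataFrame. Assumes an active Spark session.
SparkSession spark = SparkSession.Builder().GetOrCreate();
DataFrame df = spark.CreateDataFrame(new[] { "hello", "world" });

Func<Column, Column> toUpper = Udf<string, string>(s => s.ToUpper());
df.Select(toUpper(df["_1"])).Show();
```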
## Troubleshooting
### `DotNetRunner: null` / `Futures timeout` in Synapse Spark Job Definition Run
Synapse Spark Job Definitions on Spark Pools using Spark 2.4 require `Microsoft.Spark` 1.0.0. Clear your `bin` and `obj` directories, and publish the project using 1.0.0.
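A sketch of pinning the package version in your project file (the surrounding `.csproj` layout is assumed):

```xml
<ItemGroup>
  <!-- Spark 2.4 pools require exactly this version. -->
  <PackageReference Include="Microsoft.Spark" Version="1.0.0" />
</ItemGroup>
```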
### OutOfMemoryError: java heap space at org.apache.spark...
.NET for Apache Spark 1.0.0 uses a different debug architecture than 1.1.1+. You'll have to use 1.0.0 for your published version and 1.1.1+ for local debugging.