articles/machine-learning/apache-spark-azure-ml-concepts.md
Lines changed: 17 additions & 4 deletions
@@ -16,7 +16,7 @@ ms.custom: cliv2, sdkv2
 
 # Apache Spark in Azure Machine Learning (preview)
 The Azure Machine Learning integration with Azure Synapse Analytics (preview) provides easy access to distributed computing, using the Apache Spark framework. This integration offers these Apache Spark computing experiences:
-1. Managed (Automatic) Spark compute
+- Managed (Automatic) Spark compute
 2. Attached Synapse Spark pool
 
 ## Managed (Automatic) Spark compute
@@ -34,9 +34,22 @@ to access the Managed (Automatic) Spark compute in Azure Machine Learning Notebo
 ### Some points to consider
 Managed (Automatic) Spark compute works well for most user scenarios that require quick access to distributed computing using Apache Spark. To make an informed decision, however, users should consider the advantages and disadvantages of this approach.
 
-|Advantages|Disadvantages|
-|----------|-------------|
-|<ul><li>No dependencies on other Azure resources to be created for Apache Spark.</li><li>No permissions required in the subscription to create Synapse-related resources.</li><li>No need for SQL pool quota.</li></ul>|<ul><li>Persistent Hive metastore is missing. Therefore, Managed (Automatic) Spark compute only supports in-memory Spark SQL.<ul><li>No available tables or databases.</li><li>Missing Purview integration.</li></ul><li>Linked Services not available.</li><li>Fewer Data sources/connectors.</li><li>Missing pool-level configuration.</li><li>Missing pool-level library management.</li><li>Partial support for `mssparkutils`.</li></ul>|
+> ### Advantages
+>
+> - No dependencies on other Azure resources to be created for Apache Spark
+> - No permissions required in the subscription to create Synapse-related resources
+> - No need for SQL pool quota
+
+> ### Disadvantages
+>
+> - Persistent Hive metastore is missing. Therefore, Managed (Automatic) Spark compute only supports in-memory Spark SQL
+> - No available tables or databases
+> - Missing Purview integration
+> - Linked Services not available
+> - Fewer Data sources/connectors
+> - Missing pool-level configuration
+> - Missing pool-level library management
+> - Partial support for `mssparkutils`
 
 ### Network configuration
 As of January 2023, the Managed (Automatic) Spark compute doesn't support managed VNet or private endpoint creation to Azure Synapse.
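
The second hunk replaces a two-column table whose cells hold HTML lists with two blockquoted Markdown lists. It also flattens the nested list and drops trailing periods. As a rough illustration of that transformation (not the tooling the commit author used, which this page does not state), the sketch below reproduces it with Python's standard `html.parser`. The names `ListItemCollector`, `cell_to_bullets`, and `table_row_to_blockquotes` are hypothetical, and the code assumes that cell text contains no `|` characters.

```python
from html.parser import HTMLParser


class ListItemCollector(HTMLParser):
    """Collect the text of every <li>, flattening nested lists (hypothetical helper)."""

    def __init__(self):
        super().__init__()
        self.items = []
        self._in_item = False

    def handle_starttag(self, tag, attrs):
        if tag == "li":
            # Every <li>, nested or not, starts a new flat bullet.
            self.items.append("")
            self._in_item = True
        elif tag == "ul":
            # A nested <ul> ends the text of the enclosing item.
            self._in_item = False

    def handle_endtag(self, tag):
        if tag == "li":
            self._in_item = False

    def handle_data(self, data):
        if self._in_item:
            self.items[-1] += data


def cell_to_bullets(cell_html):
    """Return the list items of one table cell, without trailing periods."""
    parser = ListItemCollector()
    parser.feed(cell_html)
    parser.close()
    return [item.strip().rstrip(".") for item in parser.items if item.strip()]


def table_row_to_blockquotes(header_row, body_row):
    """Turn '|H1|H2|' plus one body row into blockquoted '### H' bullet lists.

    Assumption: cells contain no literal '|' characters.
    """
    headers = [h.strip() for h in header_row.strip().strip("|").split("|")]
    cells = [c.strip() for c in body_row.strip().strip("|").split("|")]
    lines = []
    for index, (header, cell) in enumerate(zip(headers, cells)):
        if index:
            lines.append("")  # blank line between the two blockquotes
        lines.append(f"> ### {header}")
        lines.append(">")
        lines.extend(f"> - {bullet}" for bullet in cell_to_bullets(cell))
    return "\n".join(lines)
```

Calling `table_row_to_blockquotes` with the removed header row and body row as strings returns text in the same shape as the added lines of the hunk, which makes it easy to check that no list item was lost in the conversion.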