articles/machine-learning/apache-spark-azure-ml-concepts.md
Lines changed: 18 additions & 18 deletions
@@ -17,7 +17,7 @@ ms.custom: cliv2, sdkv2
 # Apache Spark in Azure Machine Learning (preview)
 The Azure Machine Learning integration with Azure Synapse Analytics (preview) provides easy access to distributed computing, using the Apache Spark framework. This integration offers these Apache Spark computing experiences:
 - Managed (Automatic) Spark compute
-2. Attached Synapse Spark pool
+- Attached Synapse Spark pool
 
 ## Managed (Automatic) Spark compute
 Azure Machine Learning Managed (Automatic) Spark compute is the easiest way to execute distributed computing tasks in the Azure Machine Learning environment, using the Apache Spark framework. Azure Machine Learning users can use a fully managed, serverless, on-demand Apache Spark compute cluster. Those users can avoid the need to create an Azure Synapse Workspace and an Azure Synapse Spark pool. Users can define the resources, including
@@ -34,22 +34,22 @@ to access the Managed (Automatic) Spark compute in Azure Machine Learning Notebo
 ### Some points to consider
 Managed (Automatic) Spark compute works well for most user scenarios that require quick access to distributed computing using Apache Spark. To make an informed decision, however, users should consider the advantages and disadvantages of this approach.
 
-> ### Advantages
->
-> - No dependencies on other Azure resources to be created for Apache Spark
-> - No permissions required in the subscription to create Synapse-related resources
-> - No need for SQL pool quota
-
-> ### Disadvantages
->
-> - Persistent Hive metastore is missing. Therefore, Managed (Automatic) Spark compute only supports in-memory Spark SQL
-> - No available tables or databases
-> - Missing Purview integration
-> - Linked Services not available
-> - Fewer Data sources/connectors
-> - Missing pool-level configuration
-> - Missing pool-level library management
-> - Partial support for `mssparkutils`
+### Advantages
+
+- No dependencies on other Azure resources to be created for Apache Spark
+- No permissions required in the subscription to create Synapse-related resources
+- No need for SQL pool quota
+
+### Disadvantages
+
+- Persistent Hive metastore is missing. Therefore, Managed (Automatic) Spark compute only supports in-memory Spark SQL
+- No available tables or databases
+- Missing Purview integration
+- Linked Services not available
+- Fewer Data sources/connectors
+- Missing pool-level configuration
+- Missing pool-level library management
+- Partial support for `mssparkutils`
 
 ### Network configuration
 As of January 2023, the Managed (Automatic) Spark compute doesn't support managed VNet or private endpoint creation to Azure Synapse.
@@ -101,4 +101,4 @@ This [quickstart guide](./quickstart-spark-jobs.md) describes how to start using
 -[Interactive Data Wrangling with Apache Spark in Azure Machine Learning (preview)](./interactive-data-wrangling-with-apache-spark-azure-ml.md)
 -[Submit Spark jobs in Azure Machine Learning (preview)](./how-to-submit-spark-jobs.md)
 -[Code samples for Spark jobs using Azure Machine Learning CLI](https://github.com/Azure/azureml-examples/tree/main/cli/jobs/spark)
--[Code samples for Spark jobs using Azure Machine Learning Python SDK](https://github.com/Azure/azureml-examples/tree/main/sdk/python/jobs/spark)
+-[Code samples for Spark jobs using Azure Machine Learning Python SDK](https://github.com/Azure/azureml-examples/tree/main/sdk/python/jobs/spark)