- An Azure subscription; if you don't have an Azure subscription, [create a free account](https://azure.microsoft.com/free) before you begin.
@@ -44,12 +38,6 @@ In this quickstart guide, you'll learn how to submit a Spark job using Azure Mac
 -[Configure your development environment](./how-to-configure-environment.md), or [create an Azure Machine Learning compute instance](./concept-compute-instance.md#create).
 -[Install Azure Machine Learning SDK for Python](/python/api/overview/azure/ai-ml-readme).

-> [!TIP]
-> You can submit a Spark job from:
-> - an Azure Machine Learning Notebook connected to an Azure Machine Learning compute instance.
-> -[Visual Studio Code connected to an Azure Machine Learning compute instance](./how-to-set-up-vs-code-remote.md?tabs=studio).
-> - your local computer that has [the Azure Machine Learning SDK for Python](/python/api/overview/azure/ai-ml-readme) installed.
-
 # [Studio UI](#tab/studio-ui)
 - An Azure subscription; if you don't have an Azure subscription, [create a free account](https://azure.microsoft.com/free) before you begin.
 - An Azure Machine Learning workspace. See [Create workspace resources](./quickstart-create-resources.md).
-> This Python code sample uses `pyspark.pandas`, which is only supported by Spark runtime version 3.2.
+> - This Python code sample uses `pyspark.pandas`, which is only supported by Spark runtime version 3.2.
+> - Please ensure that `titanic.py` file is uploaded to a folder named `src`. The `src` folder should be located in the same directory where you have created the Python script/notebook or the YAML specification file defining the standalone Spark job.

 The above script takes two arguments `--titanic_data` and `--wrangled_data`, which pass the path of input data and output folder respectively. The script uses `titanic.csv` file, which can be [found here](https://github.com/Azure/azureml-examples/blob/main/sdk/python/jobs/spark/data/titanic.csv). This file should be uploaded to the Azure Data Lake Storage (ADLS) Gen 2 storage account.
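
For reference, the argument handling described above can be sketched roughly as follows. This is a minimal illustration and not the actual `titanic.py` from this change: only the two flag names `--titanic_data` and `--wrangled_data` come from the text, while the `parse_args` helper and the placeholder values passed to it are assumptions made for the example.

```python
import argparse


def parse_args(argv=None):
    """Parse the two path arguments the script is described as taking.

    `parse_args` is a hypothetical helper name; only the flag names
    come from the documentation text.
    """
    parser = argparse.ArgumentParser(
        description="Sketch of the argument handling for a Spark data-wrangling script."
    )
    # Path of the input data (for example, the uploaded titanic.csv).
    parser.add_argument("--titanic_data", required=True, help="path of the input data")
    # Path of the folder that receives the wrangled output.
    parser.add_argument("--wrangled_data", required=True, help="path of the output folder")
    return parser.parse_args(argv)


# Example call with placeholder values (assumptions, not real paths).
example = parse_args(
    [
        "--titanic_data",
        "abfss://<FILE_SYSTEM_NAME>@<STORAGE_ACCOUNT_NAME>.dfs.core.windows.net/<PATH_TO_DATA>",
        "--wrangled_data",
        "abfss://<FILE_SYSTEM_NAME>@<STORAGE_ACCOUNT_NAME>.dfs.core.windows.net/<PATH_TO_DATA>",
    ]
)
print(example.titanic_data)
print(example.wrangled_data)
```

Passing an explicit list to `parse_args` keeps the sketch runnable in a notebook; when the script is launched as a job, calling it with no argument reads the flags from the command line instead.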
+> -[terminal of an Azure Machine Learning compute instance](./how-to-access-terminal.md#access-a-terminal).
+> - terminal of [Visual Studio Code connected to an Azure Machine Learning compute instance](./how-to-set-up-vs-code-remote.md?tabs=studio).
+> - your local computer that has [the Azure Machine Learning CLI](./how-to-configure-cli.md?tabs=public) installed.
+
 This example YAML specification shows a standalone Spark job. It uses an Azure Machine Learning Managed (Automatic) Spark compute, user identity passthrough, and input/output data URI in format `abfss://<FILE_SYSTEM_NAME>@<STORAGE_ACCOUNT_NAME>.dfs.core.windows.net/<PATH_TO_DATA>`:

 ```yaml
@@ -192,6 +188,13 @@ az ml job create --file <YAML_SPECIFICATION_FILE_NAME>.yaml --subscription <SUBS
+> - an Azure Machine Learning Notebook connected to an Azure Machine Learning compute instance.
+> - [Visual Studio Code connected to an Azure Machine Learning compute instance](./how-to-set-up-vs-code-remote.md?tabs=studio).
+> - your local computer that has [the Azure Machine Learning SDK for Python](/python/api/overview/azure/ai-ml-readme) installed.
+
 This Python code snippet shows the creation of a standalone Spark job, with an Azure Machine Learning Managed (Automatic) Spark compute, user identity passthrough, and input/output data URI in format `abfss://<FILE_SYSTEM_NAME>@<STORAGE_ACCOUNT_NAME>.dfs.core.windows.net/<PATH_TO_DATA>`:

 ```python
@@ -319,4 +322,4 @@ First, upload the parameterized Python code `titanic.py` to the Azure Blob stora
 - [Interactive Data Wrangling with Apache Spark in Azure Machine Learning (preview)](./interactive-data-wrangling-with-apache-spark-azure-ml.md)
 - [Submit Spark jobs in Azure Machine Learning (preview)](./how-to-submit-spark-jobs.md)
 - [Code samples for Spark jobs using Azure Machine Learning CLI](https://github.com/Azure/azureml-examples/tree/main/cli/jobs/spark)
-- [Code samples for Spark jobs using Azure Machine Learning Python SDK](https://github.com/Azure/azureml-examples/tree/main/sdk/python/jobs/spark)
+- [Code samples for Spark jobs using Azure Machine Learning Python SDK](https://github.com/Azure/azureml-examples/tree/main/sdk/python/jobs/spark)