Commit 01fefc4

Update tutorial-automl.md
Remove all content for the Azure Synapse runtime for Apache Spark 2.4. AutoML uses the Spark 2.4 runtime, which has been disabled.
1 parent 68d7238 commit 01fefc4

1 file changed: 0 additions, 136 deletions

@@ -1,137 +1 @@
1-
---
2-
title: 'Tutorial: Train a model by using automated machine learning (deprecated)'
3-
description: Tutorial on how to train a machine learning model without code in Azure Synapse Analytics (deprecated).
4-
ms.service: azure-synapse-analytics
5-
ms.subservice: machine-learning
6-
ms.topic: tutorial
7-
ms.reviewer: whhender, garye
8-
ms.date: 03/06/2024
9-
author: midesa
10-
ms.author: midesa
11-
---
12 1

13-
# Tutorial: Train a machine learning model without code (deprecated)
14-
15-
You can enrich your data in Spark tables with new machine learning models that you train by using [automated machine learning](/azure/machine-learning/concept-automated-ml). In Azure Synapse Analytics, you can select a Spark table in the workspace as a training dataset and build machine learning models from it in a code-free experience.
16-
17-
In this tutorial, you learn how to train machine learning models by using a code-free experience in Synapse Studio. Synapse Studio is a feature of Azure Synapse Analytics.
18-
19-
You'll use automated machine learning in Azure Machine Learning, instead of coding the experience manually. The type of model that you train depends on the problem you're trying to solve. For this tutorial, you'll use a regression model to predict taxi fares from the New York City taxi dataset.
20-
21-
If you don't have an Azure subscription, [create a free account before you begin](https://azure.microsoft.com/free/).
22-
23-
> [!WARNING]
24-
> - Effective September 29, 2023, Azure Synapse discontinued official support for [Spark 2.4 Runtimes](../spark/apache-spark-24-runtime.md). After September 29, 2023, we don't address support tickets related to Spark 2.4, and there is no release pipeline for Spark 2.4 bug or security fixes. If you use Spark 2.4 after the support cutoff date, you do so at your own risk. We strongly discourage its continued use because of potential security and functionality concerns.
25-
> - As part of the deprecation process for Apache Spark 2.4, we would like to notify you that AutoML in Azure Synapse Analytics will also be deprecated. This includes both the low code interface and the APIs used to create AutoML trials through code.
26-
> - Please note that AutoML functionality was exclusively available through the Spark 2.4 runtime.
27-
> - For customers who wish to continue leveraging AutoML capabilities, we recommend saving your data into your Azure Data Lake Storage Gen2 (ADLSg2) account. From there, you can seamlessly access the AutoML experience through Azure Machine Learning (AzureML). Further information regarding this workaround is available [here](../machine-learning/access-data-from-aml.md).
28-
>
29-
30-
## Prerequisites
31-
32-
- An [Azure Synapse Analytics workspace](../get-started-create-workspace.md). Ensure that it has an Azure Data Lake Storage Gen2 storage account configured as the default storage. For the Data Lake Storage Gen2 file system that you work with, ensure that you're the *Storage Blob Data Contributor*.
33-
- An Apache Spark pool (version 2.4) in your Azure Synapse Analytics workspace. For details, see [Quickstart: Create a serverless Apache Spark pool using Synapse Studio](../quickstart-create-apache-spark-pool-studio.md).
34-
- An Azure Machine Learning linked service in your Azure Synapse Analytics workspace. For details, see [Quickstart: Create a new Azure Machine Learning linked service in Azure Synapse Analytics](quickstart-integrate-azure-machine-learning.md).
35-
36-
## Sign in to the Azure portal
37-
38-
Sign in to the [Azure portal](https://portal.azure.com/).
39-
40-
## Create a Spark table for the training dataset
41-
42-
For this tutorial, you need a Spark table. The following notebook creates one:
43-
44-
1. Download the notebook [Create-Spark-Table-NYCTaxi-Data.ipynb](https://github.com/Azure-Samples/Synapse/blob/ec6faf976d580b793548a4e137b71a0c7e0d287a/MachineLearning/Create%20Spark%20Table%20with%20NYC%20Taxi%20Data.ipynb).
45-
46-
1. Import the notebook to Synapse Studio.
47-
48-
![Screenshot of Azure Synapse Analytics, with the Import option highlighted.](media/tutorial-automl-wizard/tutorial-automl-wizard-00a.png)
49-
50-
1. Select the Spark pool that you want to use, and then select **Run all**. This step gets New York taxi data from the open dataset and saves the data to your default Spark database.
51-
52-
![Screenshot of Azure Synapse Analytics, with Run all and Spark database highlighted.](media/tutorial-automl-wizard/tutorial-automl-wizard-00b.png)
53-
54-
1. After the notebook run has completed, you see a new Spark table under the default Spark database. From **Data**, find the table named **nyc_taxi**.
55-
56-
![Screenshot of the Azure Synapse Analytics Data tab, with the new table highlighted.](media/tutorial-automl-wizard/tutorial-automl-wizard-00c.png)
57-
58-
## Open the automated machine learning wizard
59-
60-
To open the wizard, right-click the Spark table that you created in the previous step. Then select **Machine Learning** > **Train a new model**.
61-
62-
![Screenshot of the Spark table, with Machine Learning and Train a new model highlighted.](media/tutorial-automl-wizard/tutorial-automl-wizard-00d.png)
63-
64-
## Choose a model type
65-
66-
Select the machine learning model type for the experiment, based on the question you're trying to answer. Because the value you're trying to predict is numeric (taxi fares), select **Regression** here. Then select **Continue**.
67-
68-
![Screenshot of Train a new model, with Regression highlighted.](media/tutorial-automl-wizard/tutorial-automl-wizard-configure-run-00b.png)
69-
70-
## Configure the experiment
71-
72-
1. Provide configuration details for creating an automated machine learning experiment run in Azure Machine Learning. This run trains multiple models. The best model from a successful run is registered in the Azure Machine Learning model registry.
73-
74-
![Screenshot of configuration specifications for training a machine learning model.](media/tutorial-automl-wizard/tutorial-automl-wizard-configure-run-00a.png)
75-
76-
- **Azure Machine Learning workspace**: An Azure Machine Learning workspace is required for creating an automated machine learning experiment run. You also need to link your Azure Synapse Analytics workspace with the Azure Machine Learning workspace by using a [linked service](quickstart-integrate-azure-machine-learning.md). After you've fulfilled all the prerequisites, you can specify the Azure Machine Learning workspace that you want to use for this automated run.
77-
78-
- **Experiment name**: Specify the experiment name. When you submit an automated machine learning run, you provide an experiment name. Information for the run is stored under that experiment in the Azure Machine Learning workspace. This experience creates a new experiment by default and generates a proposed name, but you can also provide the name of an existing experiment.
79-
80-
- **Best model name**: Specify the name of the best model from the automated run. The best model is given this name and saved in the Azure Machine Learning model registry automatically after this run. An automated machine learning run creates many machine learning models. Based on the primary metric that you select in a later step, those models can be compared and the best model can be selected.
81-
82-
- **Target column**: This is what the model will be trained to predict. Choose the column in the dataset that contains the data you want to predict. For this tutorial, select the numeric column `fareAmount` as the target column.
83-
84-
- **Spark pool**: Specify the Spark pool that you want to use for the automated experiment run. The computations are run on the pool that you specify.
85-
86-
- **Spark configuration details**: In addition to the Spark pool, you have the option to provide session configuration details.
87-
88-
1. Select **Continue**.
89-
90-
## Configure the model
91-
92-
Because you selected **Regression** as your model type in the previous section, the following configurations are available (these are also available for the **Classification** model type):
93-
94-
- **Primary metric**: Enter the metric that measures how well the model is doing. You use this metric to compare different models created in the automated run and determine which model performed best.
95-
96-
- **Training job time (hours)**: Specify the maximum amount of time, in hours, for an experiment to run and train models. Note that you can also provide values less than 1 (for example, **0.5**).
97-
98-
- **Max concurrent iterations**: Choose the maximum number of iterations that run in parallel.
99-
100-
- **ONNX model compatibility**: If you enable this option, the models trained by automated machine learning are converted to the ONNX format. This is particularly relevant if you want to use the model for scoring in Azure Synapse Analytics SQL pools.
101-
102-
These settings all have a default value that you can customize.
103-
104-
![Screenshot of additional configurations for configuring a regression model.](media/tutorial-automl-wizard/tutorial-automl-wizard-configure-run-00c1.png)
105-
106-
## Start a run
107-
108-
After all the required configurations are done, you can start your automated run. To create a run directly, select **Create run**; this starts the run without code. Alternatively, if you prefer code, select **Open in notebook**; this opens a notebook containing the code that creates the run, so you can view the code and start the run yourself.
109-
110-
![Screenshot of 'create run' or 'open in notebook' options.](media/tutorial-automl-wizard/tutorial-automl-wizard-configure-run-00c2.png)
111-
112-
>[!NOTE]
113-
>If you selected **Time series forecasting** as your model type in the previous section, you must make additional configurations. Forecasting also doesn't support ONNX model compatibility.
114-
115-
### Create a run directly
116-
117-
To start your automated machine learning run directly, select **Create Run**. You see a notification that indicates the run is starting. Then you see another notification that indicates success. You can also check the status in Azure Machine Learning by selecting the link in the notification.
118-
119-
![Screenshot of successful notification.](media/tutorial-automl-wizard/tutorial-automl-wizard-configure-run-00d.png)
120-
121-
### Create a run with a notebook
122-
123-
To generate a notebook, select **Open In Notebook**. This gives you an opportunity to add settings or otherwise modify the code for your automated machine learning run. When you're ready to run the code, select **Run all**.
124-
125-
![Screenshot of a notebook, with Run all highlighted.](media/tutorial-automl-wizard/tutorial-automl-wizard-configure-run-00e.png)
126-
127-
## Monitor the run
128-
129-
After you've successfully submitted the run, you see a link to the experiment run in the Azure Machine Learning workspace in the notebook output. Select the link to monitor your automated run in Azure Machine Learning.
130-
131-
![Screenshot of Azure Synapse Analytics with a link highlighted.](media/tutorial-automl-wizard/tutorial-automl-wizard-configure-run-00f.png)
132-
133-
## Next steps
134-
135-
- [Tutorial: Machine learning model scoring wizard (preview) for dedicated SQL pools](tutorial-sql-pool-model-scoring-wizard.md)
136-
- [Quickstart: Create a new Azure Machine Learning linked service in Azure Synapse Analytics](quickstart-integrate-azure-machine-learning.md)
137-
- [Machine learning capabilities in Azure Synapse Analytics](what-is-machine-learning.md)
