|
1 | 1 | ---
|
2 |
| -title: How to do hyperparameter sweep in pipeline |
| 2 | +title: How to do hyperparameter sweep in pipelines |
3 | 3 | titleSuffix: Azure Machine Learning
|
4 |
| -description: How to use sweep to do hyperparameter tuning in Azure Machine Learning pipeline using CLI v2 and Python SDK |
| 4 | +description: Learn how to use sweep to automate hyperparameter tuning in Azure Machine Learning pipelines with CLI v2 and Python SDK v2. |
5 | 5 | services: machine-learning
|
6 | 6 | ms.service: azure-machine-learning
|
7 | 7 | ms.subservice: mlops
|
8 | 8 | ms.topic: how-to
|
9 | 9 | author: lgayhardt
|
10 | 10 | ms.author: lagayhar
|
11 | 11 | ms.reviewer: zhanxia
|
12 |
| -ms.date: 05/26/2022 |
| 12 | +ms.date: 09/19/2024 |
13 | 13 | ms.custom: devx-track-python, sdkv2, cliv2, update-code2
|
14 | 14 | ---
|
15 | 15 |
|
16 |
| -# How to do hyperparameter tuning in pipeline (v2) |
| 16 | +# How to do hyperparameter tuning in pipelines |
17 | 17 |
|
18 | 18 | [!INCLUDE [dev v2](includes/machine-learning-dev-v2.md)]
|
19 | 19 |
|
20 |
| -In this article, you'll learn how to do hyperparameter tuning in Azure Machine Learning pipeline. |
| 20 | +In this article, you learn how to automate hyperparameter tuning in Azure Machine Learning pipelines by using Azure Machine Learning CLI v2 or Azure Machine Learning SDK for Python v2. |
21 | 21 |
|
22 |
| -## Prerequisite |
| 22 | +Hyperparameters are adjustable parameters that let you control the model training process. Hyperparameter tuning is the process of finding the configuration of hyperparameters that results in the best performance. Azure Machine Learning lets you automate hyperparameter tuning and [run experiments in parallel](how-to-use-parallel-job-in-pipeline.md) to efficiently optimize hyperparameters. |
23 | 23 |
|
24 |
| -1. Understand what is [hyperparameter tuning](how-to-tune-hyperparameters.md) and how to do hyperparameter tuning in Azure Machine Learning use SweepJob. |
25 |
| -2. Understand what is a [Azure Machine Learning pipeline](concept-ml-pipelines.md) |
26 |
| -3. Build a command component that takes hyperparameter as input. |
| 24 | +## Prerequisites |
27 | 25 |
|
28 |
| -## How to do hyperparameter tuning in Azure Machine Learning pipeline |
| 26 | +- Have an Azure Machine Learning account and workspace. |
| 27 | +- Understand [Azure Machine Learning pipelines](concept-ml-pipelines.md) and [hyperparameter tuning a model](how-to-tune-hyperparameters.md). |
29 | 28 |
|
30 |
| -This section explains how to do hyperparameter tuning in Azure Machine Learning pipeline using CLI v2 and Python SDK. Both approaches share the same prerequisite: you already have a command component created and the command component takes hyperparameters as inputs. If you don't have a command component yet. Follow below links to create a command component first. |
| 29 | +## Create and run a hyperparameter tuning pipeline |
31 | 30 |
|
32 |
| -- [Azure Machine Learning CLI v2](how-to-create-component-pipelines-cli.md) |
33 |
| -- [Azure Machine Learning Python SDK v2](how-to-create-component-pipeline-python.md) |
| 31 | +# [Azure CLI](#tab/cli) |
34 | 32 |
|
35 |
| -### CLI v2 |
| 33 | +The following examples come from [Run a pipeline job using sweep (hyperdrive) in pipeline](https://github.com/Azure/azureml-examples/tree/main/cli/jobs/pipelines-with-components/pipeline_with_hyperparameter_sweep) in the [Azure Machine Learning examples](https://github.com/Azure/azureml-examples) repository. For more information about creating pipelines with components, see [Create and run machine learning pipelines using components with the Azure Machine Learning CLI](how-to-create-component-pipelines-cli.md). |
36 | 34 |
|
37 |
| -The example used in this article can be found in [azureml-example repo](https://github.com/Azure/azureml-examples). Navigate to *[azureml-examples/cli/jobs/pipelines-with-components/pipeline_with_hyperparameter_sweep* to check the example. |
| 35 | +# [Python SDK](#tab/python) |
38 | 36 |
|
39 |
| -Assume you already have a command component defined in `train.yaml`. A two-step pipeline job (train and predict) YAML file looks like below. |
| 37 | +The following examples come from the [Build pipeline with sweep node](https://github.com/Azure/azureml-examples/blob/main/sdk/python/jobs/pipelines/1c_pipeline_with_hyperparameter_sweep/pipeline_with_hyperparameter_sweep.ipynb) notebook in the [Azure Machine Learning examples](https://github.com/Azure/azureml-examples) repository. For more information about creating pipelines with components, see [Create and run machine learning pipelines using components with the Azure Machine Learning SDK v2](how-to-create-component-pipeline-python.md). |
40 | 38 |
|
41 |
| -:::code language="yaml" source="~/azureml-examples-main/cli/jobs/pipelines-with-components/pipeline_with_hyperparameter_sweep/pipeline.yml" highlight="7-48"::: |
| 39 | +For a related notebook, see [Run hyperparameter sweep on a command job](https://github.com/Azure/azureml-examples/blob/main/sdk/python/jobs/single-step/lightgbm/iris/lightgbm-iris-sweep.ipynb). |
42 | 40 |
|
43 |
| -The `sweep_step` is the step for hyperparameter tuning. Its type needs to be `sweep`. And `trial` refers to the command component defined in `train.yaml`. From the `search space` field we can see three hyparmeters (`c_value`, `kernel`, and `coef`) are added to the search space. After you submit this pipeline job, Azure Machine Learning will run the trial component multiple times to sweep over hyperparameters based on the search space and terminate policy you defined in `sweep_step`. Check [sweep job YAML schema](reference-yaml-job-sweep.md) for full schema of sweep job. |
| 41 | +--- |
| 42 | + |
| 43 | +### Create a command component with hyperparameter inputs |
44 | 44 |
|
45 |
| -Below is the trial component definition (train.yml file). |
| 45 | +The Azure Machine Learning pipeline must have a command component with hyperparameter inputs. The following *train.yml* file from the example projects defines a `trial` component that has the `c_value`, `kernel`, and `coef0` hyperparameter inputs and runs the source code that's located in the *./train-src* folder. |
46 | 46 |
|
47 | 47 | :::code language="yaml" source="~/azureml-examples-main/cli/jobs/pipelines-with-components/pipeline_with_hyperparameter_sweep/train.yml" highlight="11-16,23-25,60":::
|
48 | 48 |
|
49 |
| -The hyperparameters added to search space in pipeline.yml need to be inputs for the trial component. The source code of the trial component is under `./train-src` folder. In this example, it's a single `train.py` file. This is the code that will be executed in every trial of the sweep job. Make sure you've logged the metrics in the trial component source code with exactly the same name as `primary_metric` value in pipeline.yml file. In this example, we use `mlflow.autolog()`, which is the recommended way to track your ML experiments. See more about mlflow [here](./how-to-use-mlflow-cli-runs.md) |
50 |
| - |
51 |
| -Below code snippet is the source code of trial component. |
| 49 | +### Create the trial component source code |
| 50 | + |
| 51 | +The source code for this example is a single *train.py* file. This code executes in every trial of the sweep job. |
52 | 52 |
|
53 | 53 | :::code language="python" source="~/azureml-examples-main/cli/jobs/pipelines-with-components/pipeline_with_hyperparameter_sweep/train-src/train.py" highlight="15":::
|
54 | 54 |
|
55 |
| -### Python SDK |
| 55 | +>[!NOTE] |
| 56 | +>Make sure to log the metrics in the trial component source code with exactly the same name as the `primary_metric` value in the pipeline file. This example uses `mlflow.autolog()`, which is the recommended way to track machine learning experiments. For more information about MLflow, see [Track ML experiments and models with MLflow](./how-to-use-mlflow-cli-runs.md). |
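As a rough illustration of the naming requirement in the preceding note, the following sketch checks two things before a pipeline is submitted: that every search space key is an input of the trial component, and that the `primary_metric` name exactly matches a logged metric name. The `check_sweep_config` helper and the sample metric names are hypothetical and aren't part of the Azure Machine Learning SDK or CLI. In practice, the names would come from your own *train.yml*, pipeline file, and training script.

```python
from typing import Iterable, List, Mapping


def check_sweep_config(
    search_space: Mapping[str, object],
    component_inputs: Iterable[str],
    primary_metric: str,
    logged_metrics: Iterable[str],
) -> List[str]:
    """Return a list of problems; an empty list means no mismatch was found.

    Hypothetical helper for illustration only, not an Azure Machine Learning API.
    """
    problems = []
    inputs = set(component_inputs)
    # Every hyperparameter in the search space must be an input of the trial component.
    for name in search_space:
        if name not in inputs:
            problems.append(f"search_space key '{name}' is not a component input")
    # The primary metric must match a logged metric name exactly (case-sensitive).
    if primary_metric not in set(logged_metrics):
        problems.append(f"primary_metric '{primary_metric}' is never logged")
    return problems


# Illustrative values only; real names come from your component, pipeline, and script.
example_problems = check_sweep_config(
    search_space={"c_value": None, "kernel": None},
    component_inputs=["data", "c_value", "kernel"],
    primary_metric="training_f1_score",
    logged_metrics=["training_f1_score", "training_accuracy_score"],
)
print(example_problems or "no mismatches found")
```

A check like this only catches naming mismatches. It doesn't confirm that the training script actually logs the metric at run time.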
56 | 57 |
|
57 |
| -The Python SDK example can be found in [azureml-example repo](https://github.com/Azure/azureml-examples). Navigate to *azureml-examples/sdk/jobs/pipelines/1c_pipeline_with_hyperparameter_sweep* to check the example. |
| 58 | +### Create a pipeline with a hyperparameter sweep step |
58 | 59 |
|
59 |
| -In Azure Machine Learning Python SDK v2, you can enable hyperparameter tuning for any command component by calling `.sweep()` method. |
| 60 | +# [Azure CLI](#tab/cli) |
60 | 61 |
|
61 |
| -Below code snippet shows how to enable sweep for `train_model`. |
| 62 | +Given the command component defined in *train.yml*, the following code creates a two-step `train` and `predict` pipeline definition file. In the `sweep_step`, the required step type is `sweep`, and the `c_value`, `kernel`, and `coef0` hyperparameter inputs for the `trial` component are added to the `search_space`. |
62 | 63 |
|
63 |
| -[!notebook-python[] (~/azureml-examples-main/sdk/python/jobs/pipelines/1c_pipeline_with_hyperparameter_sweep/pipeline_with_hyperparameter_sweep.ipynb?name=enable-sweep)] |
| 64 | +The following example highlights the hyperparameter tuning `sweep_step`. |
64 | 65 |
|
65 |
| - We first load `train_component_func` defined in `train.yml` file. When creating `train_model`, we add `c_value`, `kernel` and `coef0` into search space(line 15-17). Line 30-35 defines the primary metric, sampling algorithm etc. |
| 66 | +:::code language="yaml" source="~/azureml-examples-main/cli/jobs/pipelines-with-components/pipeline_with_hyperparameter_sweep/pipeline.yml" highlight="8-48"::: |
66 | 67 |
|
67 |
| -## Check pipeline job with sweep step in Studio |
| 68 | +# [Python SDK](#tab/python) |
68 | 69 |
|
69 |
| -After you submit a pipeline job, the SDK or CLI widget will give you a web URL link to Studio UI. The link will guide you to the pipeline graph view by default. |
| 70 | +In the v2 SDK, you can enable hyperparameter tuning for any command component by calling the `.sweep()` method. The following pipeline definition shows how to enable sweep for `train_model`. |
70 | 71 |
|
71 |
| -To check details of the sweep step, double click the sweep step and navigate to the **child job** tab in the panel on the right. |
| 72 | +The example first loads the `train_component_func` defined in the *train.yml* file. To create the `train_model`, the code adds the `c_value`, `kernel`, and `coef0` hyperparameters into the search space. The `sweep_step` defines the `primary_metric`, `sampling_algorithm`, and other parameters. |
72 | 73 |
|
73 |
| -:::image type="content" source="./media/how-to-use-sweep-in-pipeline/pipeline-view.png" alt-text="Screenshot of the pipeline with child job and the train_model node highlighted." lightbox= "./media/how-to-use-sweep-in-pipeline/pipeline-view.png"::: |
| 74 | +[!Notebook-python[] (~/azureml-examples-main/sdk/python/jobs/pipelines/1c_pipeline_with_hyperparameter_sweep/pipeline_with_hyperparameter_sweep.ipynb?name=enable-sweep)] |
| 75 | + |
| 76 | +--- |
74 | 77 |
|
75 |
| -This will link you to the sweep job page as seen in the below screenshot. Navigate to **child job** tab, here you can see the metrics of all child jobs and list of all child jobs. |
| 78 | +For the full sweep job schema, see [CLI (v2) sweep job YAML schema](reference-yaml-job-sweep.md). |
76 | 79 |
|
77 |
| -:::image type="content" source="./media/how-to-use-sweep-in-pipeline/sweep-job.png" alt-text="Screenshot of the job page on the child jobs tab." lightbox= "./media/how-to-use-sweep-in-pipeline/sweep-job.png"::: |
| 80 | +### Submit the hyperparameter tuning pipeline job |
| 81 | + |
| 82 | +After you submit this pipeline job, Azure Machine Learning runs the `trial` component multiple times to sweep over hyperparameters, based on the search space and limits you defined in the `sweep_step`. |
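To make the idea of a sweep more concrete, here's a minimal, self-contained sketch of random sampling over a search space in plain Python. It's a conceptual stand-in, not how Azure Machine Learning implements sweep jobs: `run_random_sweep`, `toy_trial`, and the sample ranges are all hypothetical, and the trials run sequentially in one process rather than as child jobs on a compute cluster.

```python
import random
from typing import Callable, Dict, List, Tuple

Params = Dict[str, object]


def run_random_sweep(
    trial: Callable[[Params], float],
    search_space: Dict[str, Callable[[random.Random], object]],
    max_total_trials: int,
    goal: str = "maximize",
    seed: int = 0,
) -> Tuple[Params, float, List[Tuple[Params, float]]]:
    """Sample hyperparameters at random, run one trial per sample, and return the best."""
    if goal not in ("maximize", "minimize"):
        raise ValueError("goal must be 'maximize' or 'minimize'")
    rng = random.Random(seed)
    history = []
    for _ in range(max_total_trials):
        # Draw one value per hyperparameter, then score that combination.
        params = {name: sample(rng) for name, sample in search_space.items()}
        history.append((params, trial(params)))
    pick = max if goal == "maximize" else min
    best_params, best_metric = pick(history, key=lambda item: item[1])
    return best_params, best_metric, history


def toy_trial(params: Params) -> float:
    """Hypothetical stand-in for a training script: the score peaks at c_value = 0.7."""
    bonus = 0.05 if params["kernel"] == "rbf" else 0.0
    return 1.0 - abs(params["c_value"] - 0.7) + bonus


# Illustrative search space: one continuous and one categorical hyperparameter.
toy_space = {
    "c_value": lambda rng: rng.uniform(0.5, 0.9),
    "kernel": lambda rng: rng.choice(["rbf", "linear", "poly"]),
}

best_params, best_metric, history = run_random_sweep(
    toy_trial, toy_space, max_total_trials=5
)
print(best_params, round(best_metric, 3))
```

In the sketch, the trial budget plays the role of the sweep limits and `goal` plays the role of the objective. A real sweep job also handles concurrency, timeouts, and other sampling algorithms.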
| 83 | + |
| 84 | +## View hyperparameter tuning results in studio |
| 85 | + |
| 86 | +After you submit a pipeline job, the SDK or CLI widget gives you a web URL link to the pipeline graph in the Azure Machine Learning studio UI. |
| 87 | + |
| 88 | +To view hyperparameter tuning results, double-click the sweep step in the pipeline graph, select the **Child jobs** tab in the details panel, and then select the child job. |
| 89 | + |
| 90 | +:::image type="content" source="./media/how-to-use-sweep-in-pipeline/pipeline-view.png" alt-text="Screenshot of the pipeline with child job and the train_model node highlighted." lightbox= "./media/how-to-use-sweep-in-pipeline/pipeline-view.png"::: |
78 | 91 |
|
79 |
| -If a child jobs failed, select the name of that child job to enter detail page of that specific child job (see screenshot below). The useful debug information is under **Outputs + Logs**. |
| 92 | +On the child job page, select the **Trials** tab to see and compare metrics for all the child runs. Select any of the child runs to see the details for that run. |
80 | 93 |
|
81 |
| -:::image type="content" source="./media/how-to-use-sweep-in-pipeline/child-run.png" alt-text="Screenshot of the output + logs tab of a child run." lightbox= "./media/how-to-use-sweep-in-pipeline/child-run.png"::: |
| 94 | +:::image type="content" source="./media/how-to-use-sweep-in-pipeline/sweep-job.png" alt-text="Screenshot of the child job page with the Trials tab." lightbox= "./media/how-to-use-sweep-in-pipeline/sweep-job.png"::: |
82 | 95 |
|
83 |
| -## Sample notebooks |
| 96 | +If a child run failed, you can select the **Outputs + logs** tab on the child run page to see useful debug information. |
84 | 97 |
|
85 |
| -- [Build pipeline with sweep node](https://github.com/Azure/azureml-examples/blob/main/sdk/python/jobs/pipelines/1c_pipeline_with_hyperparameter_sweep/pipeline_with_hyperparameter_sweep.ipynb) |
86 |
| -- [Run hyperparameter sweep on a command job](https://github.com/Azure/azureml-examples/blob/main/sdk/python/jobs/single-step/lightgbm/iris/lightgbm-iris-sweep.ipynb) |
| 98 | +:::image type="content" source="./media/how-to-use-sweep-in-pipeline/child-run.png" alt-text="Screenshot of the output and logs tab of a child run." lightbox= "./media/how-to-use-sweep-in-pipeline/child-run.png"::: |
87 | 99 |
|
88 |
| -## Next steps |
| 100 | +## Related content |
89 | 101 |
|
90 | 102 | - [Track an experiment](how-to-log-view-metrics.md)
|
91 | 103 | - [Deploy a trained model](how-to-deploy-online-endpoints.md)
|