Commit 9b8f68d

update steps in article
1 parent 8f819de commit 9b8f68d

1 file changed

+57
-37
lines changed

articles/machine-learning/how-to-train-scikit-learn.md

Lines changed: 57 additions & 37 deletions
Original file line number | Diff line number | Diff line change
@@ -43,70 +43,90 @@ You can run this code in either an Azure Machine Learning compute instance, or y
4343

4444
## Set up the experiment
4545

46-
This section sets up the training experiment by loading the required Python packages, initializing a workspace, defining the training environment, and preparing the training script.
46+
This section sets up the training experiment by loading the required Python packages, connecting to a workspace, creating a compute resource to run a training job, and creating an environment to run the job.
4747

48-
### Initialize a workspace
48+
### Connect to the workspace
4949

50-
The [Azure Machine Learning workspace](concept-workspace.md) is the top-level resource for the service. It provides you with a centralized place to work with all the artifacts you create.
50+
First, you'll need to connect to your Azure Machine Learning workspace. The [AzureML workspace](concept-workspace.md) is the top-level resource for the service. It provides you with a centralized place to work with all the artifacts you create when you use Azure Machine Learning.
5151

52-
First, you'll need to connect to your Azure ML workspace. The workspace is the top-level resource for Azure Machine Learning, providing a centralized place to work with all the artifacts you create when you use Azure Machine Learning.
52+
We are using `DefaultAzureCredential` to get access to the workspace. `DefaultAzureCredential` should be capable of handling most Azure SDK authentication scenarios.
5353

54-
We are using DefaultAzureCredential to get access to workspace. DefaultAzureCredential should be capable of handling most Azure SDK authentication scenarios.
54+
<!-- M.A: link to "configure credential example" is missing (broken in notebook) -->
55+
If this credential does not work for you, see configure credential example and [`azure-identity reference documentation`](/python/api/azure-identity/azure.identity?view=azure-python) for more available credentials.
5556

56-
Reference for more available credentials if it does not work for you: configure credential example, azure-identity reference doc.
57+
[!notebook-python[](~/azureml-examples-v2samplesreorg/sdk/python/jobs/single-step/scikit-learn/train-hyperparameter-tune-deploy-with-sklearn/train-hyperparameter-tune-with-sklearn.ipynb?name=credential)]
5758

58-
<!-- In the Python SDK, you can access the workspace artifacts by creating a [`workspace`](/python/api/azureml-core/azureml.core.workspace.workspace) object. -->
59+
If you prefer to use a browser to sign in and authenticate, you can use the following code instead:
5960

60-
<!-- Create a workspace object from the `config.json` file created in the [prerequisites section](#prerequisites). -->
61+
```python
62+
# Handle to the workspace
63+
from azure.ai.ml import MLClient
6164

62-
[!notebook-python[](~/azureml-examples-v2samplesreorg/sdk/python/jobs/single-step/scikit-learn/train-hyperparameter-tune-deploy-with-sklearn/train-hyperparameter-tune-with-sklearn.ipynb?name=credential)]
65+
# Authentication package
66+
from azure.identity import InteractiveBrowserCredential
6367

68+
credential = InteractiveBrowserCredential()
69+
```
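
The two credential options above can also be combined. The following is a minimal sketch, not the notebook's own cell: the helper name `get_credential` is hypothetical, and the imports are done inside the function so the snippet can be defined before `azure-identity` is installed.

```python
def get_credential():
    """Return an Azure credential, preferring DefaultAzureCredential.

    Hypothetical helper for illustration; requires the azure-identity package.
    """
    # Imported lazily so defining this helper does not require azure-identity.
    from azure.identity import DefaultAzureCredential, InteractiveBrowserCredential

    try:
        credential = DefaultAzureCredential()
        # Request a token up front so a misconfigured credential fails here
        # rather than later, on the first call to the workspace.
        credential.get_token("https://management.azure.com/.default")
    except Exception:
        # Fall back to signing in through a browser window.
        credential = InteractiveBrowserCredential()
    return credential
```

Calling `credential = get_credential()` then yields an object that can be passed on when you create the workspace handle.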
6470

65-
### Prepare scripts
71+
Next, get a handle to the workspace by providing your Subscription ID, Resource Group name, and Workspace name. To find your Subscription ID and Resource Group:
6672

67-
In this tutorial, the [training script **train_iris.py**](https://github.com/Azure/MachineLearningNotebooks/blob/master/how-to-use-azureml/ml-frameworks/scikit-learn/train-hyperparameter-tune-deploy-with-sklearn/train_iris.py) is already provided for you. In practice, you should be able to take any custom training script as is and run it with Azure ML without having to modify your code.
73+
1. Select your workspace name from the upper-right corner of the Azure Machine Learning Studio toolbar.
74+
2. Copy the value for Resource group and Subscription ID into the code.
6875

69-
Notes:
70-
- The provided training script shows how to log some metrics to your Azure ML run using the `Run` object within the script.
71-
- The provided training script uses example data from the `iris = datasets.load_iris()` function. To use and access your own data, see [how to train with datasets](v1/how-to-train-with-datasets.md) to make data available during training.
76+
[!notebook-python[](~/azureml-examples-v2samplesreorg/sdk/python/jobs/single-step/scikit-learn/train-hyperparameter-tune-deploy-with-sklearn/train-hyperparameter-tune-with-sklearn.ipynb?name=ml_client)]
7277

73-
### Define your environment
78+
The result of this example script is a workspace handle that you'll use to manage other resources and jobs.
7479

75-
To define the Azure ML [Environment](concept-environments.md) that encapsulates your training script's dependencies, you can either define a custom environment or use an Azure ML curated environment.
80+
Note:
7681

77-
#### Use a curated environment
78-
Optionally, Azure ML provides prebuilt, [curated environments](resource-curated-environments.md) if you don't want to define your own environment.
82+
- Creating `MLClient` will not connect the client to the workspace. The client initialization is lazy and will wait for the first time it needs to make a call. In this article, this will happen during compute creation.
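
As a rough illustration of the workspace handle described above, the sketch below wraps the `MLClient` constructor in a hypothetical helper named `get_workspace_handle`. The argument values are placeholders that you would replace with your own subscription ID, resource group, and workspace name; the referenced notebook cell remains the authoritative version.

```python
def get_workspace_handle(credential, subscription_id, resource_group, workspace_name):
    """Build a handle to an AzureML workspace.

    Hypothetical helper for illustration; requires the azure-ai-ml package.
    """
    # Imported lazily so defining this helper does not require azure-ai-ml.
    from azure.ai.ml import MLClient

    # Constructing the client does not contact the service yet; the first
    # real call (for example, creating compute) triggers the connection.
    return MLClient(
        credential=credential,
        subscription_id=subscription_id,
        resource_group_name=resource_group,
        workspace_name=workspace_name,
    )
```

For example, `ml_client = get_workspace_handle(credential, "<SUBSCRIPTION_ID>", "<RESOURCE_GROUP>", "<AML_WORKSPACE_NAME>")`.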
7983

80-
If you want to use a curated environment, you can run the following command instead:
84+
### Create a Compute Resource to run the job
8185

82-
```python
83-
from azureml.core import Environment
86+
AzureML needs a compute resource to run a job. This resource can be single or multi-node machines with Linux or Windows OS, or a specific compute fabric like Spark.
8487

85-
sklearn_env = Environment.get(workspace=ws, name='AzureML-Tutorial')
86-
```
88+
<!-- MA: find proper way to link to the marketing page (second link) -->
89+
In the following example script, we provision a Linux [`compute cluster`](/azure/machine-learning/how-to-create-attach-compute-cluster?tabs=python). You can see the [`Azure Machine Learning pricing`](https://azure.microsoft.com/en-us/pricing/details/machine-learning/) page for the full list of VM sizes and prices. We only need a basic cluster for this example, so let's pick a Standard_DS3_v2 model with 4 vCPU cores and 14 GB RAM to create an AzureML compute.
90+
91+
[!notebook-python[](~/azureml-examples-v2samplesreorg/sdk/python/jobs/single-step/scikit-learn/train-hyperparameter-tune-deploy-with-sklearn/train-hyperparameter-tune-with-sklearn.ipynb?name=cpu_compute_target)]
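
To make the provisioning step more concrete, here is a sketch of a get-or-create pattern for the cluster. It is an assumption-laden example rather than the notebook's cell: the helper name, the cluster name `cpu-cluster`, and the scaling values are all illustrative.

```python
def get_or_create_cpu_cluster(ml_client, name="cpu-cluster"):
    """Return the named compute cluster, creating it if it does not exist.

    Hypothetical helper for illustration; requires the azure-ai-ml package.
    """
    # Imported lazily so defining this helper does not require the Azure SDK.
    from azure.ai.ml.entities import AmlCompute
    from azure.core.exceptions import ResourceNotFoundError

    try:
        # Reuse the cluster if it is already in the workspace.
        return ml_client.compute.get(name)
    except ResourceNotFoundError:
        cluster = AmlCompute(
            name=name,
            size="STANDARD_DS3_V2",           # VM size named in this article
            min_instances=0,                  # illustrative: scale to zero when idle
            max_instances=4,                  # illustrative upper bound
            idle_time_before_scale_down=180,  # seconds; illustrative
            tier="Dedicated",
        )
        # begin_create_or_update returns a poller; result() blocks until done.
        return ml_client.compute.begin_create_or_update(cluster).result()
```

Setting `min_instances=0` lets the cluster release its nodes when no job is running, which is usually the cheaper choice for a tutorial.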
92+
93+
### Create a job environment
94+
95+
To run an AzureML job, you'll need an environment. An AzureML [Environment](concept-environments.md) encapsulates the dependencies (such as software runtime and libraries) needed to run your machine learning training script on your compute resource. This environment is similar to a Python environment on your local machine.
96+
97+
AzureML allows you to use either a curated (or ready-made) environment or define a custom environment using a Docker image or a Conda configuration. This article uses a custom environment.
8798

8899
#### Create a custom environment
89100

90-
You can also create your own custom environment. Define your conda dependencies in a YAML file; in this example the file is named `conda_dependencies.yml`.
101+
To create your custom environment, you'll define your Conda dependencies in a YAML file. First, create a directory for storing the file. In this example, we've named the directory `dependencies_dir.yml`.
91102

92-
```yaml
93-
dependencies:
94-
- python=3.6.2
95-
- scikit-learn
96-
- numpy
97-
- pip:
98-
- azureml-defaults
99-
```
103+
[!notebook-python[](~/azureml-examples-v2samplesreorg/sdk/python/jobs/single-step/scikit-learn/train-hyperparameter-tune-deploy-with-sklearn/train-hyperparameter-tune-with-sklearn.ipynb?name=make_env_folder)]
100104

101-
Create an Azure ML environment from this Conda environment specification. The environment will be packaged into a Docker container at runtime.
102-
```python
103-
from azureml.core import Environment
105+
Then, create the file in the dependencies directory. In this example, we've named the file `conda.yml`.
104106

105-
sklearn_env = Environment.from_conda_specification(name='sklearn-env', file_path='conda_dependencies.yml')
106-
```
107+
[!notebook-python[](~/azureml-examples-v2samplesreorg/sdk/python/jobs/single-step/scikit-learn/train-hyperparameter-tune-deploy-with-sklearn/train-hyperparameter-tune-with-sklearn.ipynb?name=make_conda_file)]
108+
109+
The specification contains some usual packages (such as numpy and pip) that you'll use in your job.
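
For readers who want to see the two steps above outside the notebook, the following sketch creates a directory and writes a Conda specification using only the Python standard library. The directory name `dependencies`, the environment name, and the package list are assumptions made for this example and may differ from the notebook's actual file.

```python
from pathlib import Path

# Hypothetical directory name, chosen only for this sketch.
dependencies_dir = Path("dependencies")
dependencies_dir.mkdir(exist_ok=True)

# Illustrative Conda specification; adjust packages and versions to your job.
CONDA_SPEC = """\
name: sklearn-env
channels:
  - conda-forge
dependencies:
  - python=3.8
  - pip
  - numpy
  - scikit-learn
"""

# Write the specification where the environment definition can find it.
conda_file = dependencies_dir / "conda.yml"
conda_file.write_text(CONDA_SPEC)
```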
110+
111+
Next, use the YAML file to create and register this custom environment in your workspace.
112+
113+
[!notebook-python[](~/azureml-examples-v2samplesreorg/sdk/python/jobs/single-step/scikit-learn/train-hyperparameter-tune-deploy-with-sklearn/train-hyperparameter-tune-with-sklearn.ipynb?name=custom_environment)]
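
Registration might look roughly like the sketch below. The helper name, the environment name `sklearn-env`, the default `conda_file` path, and the base Docker image are assumptions for illustration; check the notebook cell for the values this article actually uses.

```python
def register_sklearn_environment(ml_client, conda_file="dependencies/conda.yml"):
    """Create and register a custom environment from a Conda YAML file.

    Hypothetical helper for illustration; requires the azure-ai-ml package.
    """
    # Imported lazily so defining this helper does not require azure-ai-ml.
    from azure.ai.ml.entities import Environment

    env = Environment(
        name="sklearn-env",  # illustrative name
        description="Custom environment for scikit-learn training (illustrative)",
        conda_file=conda_file,
        # Assumed base image; any compatible AzureML base image would work.
        image="mcr.microsoft.com/azureml/openmpi4.1.0-ubuntu20.04:latest",
    )
    # Registers the environment in the workspace and returns the stored entity,
    # including the version assigned to it.
    return ml_client.environments.create_or_update(env)
```

The returned object exposes `name` and `version`, which you can use later to refer to the environment when configuring a job.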
107114

108115
For more information on creating and using environments, see [Create and use software environments in Azure Machine Learning](how-to-use-environments.md).
109116

117+
### Data for training
118+
119+
<!-- ### Prepare scripts
120+
121+
For this tutorial, we've provided the [training script **train_iris.py**](https://github.com/Azure/MachineLearningNotebooks/blob/master/how-to-use-azureml/ml-frameworks/scikit-learn/train-hyperparameter-tune-deploy-with-sklearn/train_iris.py) for you. In practice, you should be able to take any custom training script as is and run it with AzureML without having to modify your code.
122+
123+
Notes:
124+
125+
The provided training script,
126+
- Shows how to log some metrics to your AzureML run using the `Run` object.
127+
- Uses example data from the `iris = datasets.load_iris()` function. To use and access your own data, see [how to train with datasets](v1/how-to-train-with-datasets.md) to make data available during training. -->
128+
129+
110130
## Configure and submit your training run
111131

112132
### Create a ScriptRunConfig
