This tutorial shows you how to upload and use your own data to train machine learning models in Azure Machine Learning. This tutorial is *part 3 of a three-part tutorial series*.

In [Part 2: Train a model](tutorial-1st-experiment-sdk-train.md), you trained a model in the cloud, using sample data from `PyTorch`. You also downloaded that data through the `torchvision.datasets.CIFAR10` method in the PyTorch API. In this tutorial, you'll use the downloaded data to learn the workflow for working with your own data in Azure Machine Learning.

In this tutorial, you:

> [!div class="checklist"]
>
> * Upload data to Azure.
> * Create a control script.
> * Understand the new Azure Machine Learning concepts (passing parameters, data inputs).
> * Submit and run your training script.
> * View your code output in the cloud.

## Prerequisites
You'll need the data that was downloaded in the previous tutorial. Make sure you have completed these steps:

1. [Create the training script](tutorial-1st-experiment-sdk-train.md#create-training-scripts).
## Adjust the training script

By now you have your training script (get-started/src/train.py) running in Azure Machine Learning.
Our training script is currently set to download the CIFAR10 dataset on each run. The following Python code has been adjusted to read the data from a directory.

> [!NOTE]
> The use of `argparse` parameterizes the script.

1. Open *train.py* and replace it with this code:

    ```python
    import os
    import argparse
    import torch
    import torch.optim as optim
    import torchvision
    import torchvision.transforms as transforms
    from model import Net
    import mlflow

    if __name__ == "__main__":
        parser = argparse.ArgumentParser()
        parser.add_argument(
            # ... (argument definitions, data loading, and the start of the training loop are elided in this excerpt)

                running_loss += loss.item()
                if i % 2000 == 1999:
                    loss = running_loss / 2000
                    mlflow.log_metric('loss', loss)  # log the loss metric via MLflow
                    print(f'epoch={epoch + 1}, batch={i + 1:5}: loss {loss:.2f}')
                    running_loss = 0.0

        print('Finished Training')
    ```

1. **Save** the file. Close the tab if you wish.

### Understanding the code changes
```python
optimizer = optim.SGD(
    # ... (optimizer arguments elided in this excerpt)
)
```
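Much of this section is elided in this excerpt. As a rough sketch of what the explanation covers (not the tutorial's exact code; the transform and the argument defaults below are illustrative), the parsed arguments are consumed along these lines:

```python
import argparse

import torch.optim as optim
import torchvision
import torchvision.transforms as transforms
from model import Net  # model.py from the earlier tutorial part

parser = argparse.ArgumentParser()
parser.add_argument('--data_path', type=str, help='Path to the training data')
parser.add_argument('--learning_rate', type=float, default=0.001, help='Learning rate for SGD')
parser.add_argument('--momentum', type=float, default=0.9, help='Momentum for SGD')
args = parser.parse_args()

# Read CIFAR10 from the supplied directory instead of downloading it on every run.
transform = transforms.Compose([
    transforms.ToTensor(),
    transforms.Normalize((0.5, 0.5, 0.5), (0.5, 0.5, 0.5)),
])
trainset = torchvision.datasets.CIFAR10(
    root=args.data_path,
    train=True,
    download=False,
    transform=transform,
)

# The hyperparameters come from the parsed arguments rather than being hard-coded.
net = Net()
optimizer = optim.SGD(
    net.parameters(),
    lr=args.learning_rate,
    momentum=args.momentum,
)
```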
## <a name="upload"></a> Upload the data to Azure

To run this script in Azure Machine Learning, you need to make your training data available in Azure. Your Azure Machine Learning workspace comes equipped with a _default_ datastore. This is an Azure Blob Storage account where you can store your training data.

> [!NOTE]
> Azure Machine Learning allows you to connect other cloud-based storage services that store your data. For more details, see the [data documentation](./concept-data.md).

There is no additional step needed for uploading data; the control script will define and upload the CIFAR10 training data as part of the job.

## <a name="control-script"></a> Create a control script
As you've done previously, create a new Python control script called *run-pytorch-data.py* in the **get-started** folder:
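
The control script's code isn't shown in this excerpt. A minimal sketch of such a script using the `azure.ai.ml` (SDK v2) job API might look like the following; the compute name, environment, experiment name, and hyperparameter values are assumptions rather than the tutorial's exact choices:

```python
# run-pytorch-data.py (minimal sketch, assuming the azure-ai-ml v2 package is installed)
from azure.ai.ml import Input, MLClient, command
from azure.ai.ml.constants import AssetTypes
from azure.identity import DefaultAzureCredential

if __name__ == "__main__":
    # Connect to the workspace described by the local config.json.
    ml_client = MLClient.from_config(credential=DefaultAzureCredential())

    # Reference the local data folder; it's uploaded with the job and mounted as a URI_FOLDER.
    my_job_inputs = {
        "data_path": Input(type=AssetTypes.URI_FOLDER, path="./data"),
    }

    job = command(
        code="./src",  # local folder containing train.py and model.py
        command="python train.py --data_path ${{inputs.data_path}} --learning_rate 0.003 --momentum 0.92",
        inputs=my_job_inputs,
        environment="AzureML-pytorch-1.9-ubuntu18.04-py37-cuda11-gpu@latest",  # assumed curated environment
        compute="cpu-cluster",  # assumed compute cluster name
        display_name="day1-experiment-data",
        experiment_name="day1-experiment-data",
    )

    returned_job = ml_client.create_or_update(job)
    print(returned_job.studio_url)  # URL for monitoring the job in the studio
```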
An [Input](/python/api/azure-ai-ml/azure.ai.ml.input) is used to reference inputs to your job. An input can be data uploaded as part of the job or a reference to a previously registered data asset. The `URI_FOLDER` type indicates that the reference points to a folder of data. By default, the data is mounted to the job's compute target.

`--data_path` matches the argument defined in the updated training script. `${{inputs.data_path}}` passes in the input defined in the inputs dictionary; the keys must match.
## <a name="submit-to-cloud"></a> Submit the job to Azure Machine Learning
Select **Save and run script in terminal** to run the *run-pytorch-data.py* script. This job will train the model on the compute cluster using the data you uploaded.

This code will print a URL to the experiment in the Azure Machine Learning studio. If you go to that link, you'll be able to see your code running.
### <a name="inspect-log"></a> Inspect the log file
In the studio, go to the experiment job (by selecting the previous URL output) followed by **Outputs + logs**. Select the `std_log.txt` file. Scroll down through the log file until you see the following output:
```txt
Current directory: /mnt/batch/tasks/shared/LS_root/jobs/dsvm-aml/azureml/tutorial-session-3_1600171983_763c5381/mounts/workspaceblobstore/azureml/tutorial-session-3_1600171983_763c5381

===== DATA =====
DATA PATH: /mnt/azureml/cr/j/xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx/cap/data-capability/wd/INPUT_data_path
```
- Azure Machine Learning has mounted Blob Storage to the compute cluster automatically for you and passed the mount point in as `--data_path`. Compared to the previous job, there is no on-the-fly data download.
- The `inputs=my_job_inputs` entry in the control script resolves to the mount point; the sketch after this list shows how the training script can verify it.
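
For instance, the `DATA PATH` line in the log comes from the training script printing the value of its `--data_path` argument. A minimal, self-contained sketch of that check (the exact prints in *train.py* may differ):

```python
import argparse
import os

parser = argparse.ArgumentParser()
parser.add_argument('--data_path', type=str, help='Path to the mounted training data')
args = parser.parse_args()

print("===== DATA =====")
print("DATA PATH:", args.data_path)  # resolves to the mount point provided by the job input
print(os.listdir(args.data_path))    # e.g. ['cifar-10-batches-py'] for the uploaded CIFAR10 folder
```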
## Clean up resources
If you're not going to use it now, stop the compute instance: