In this article, learn how to easily access your data in Azure storage services via Azure Machine Learning datastores. Datastores store connection information, like your subscription ID and token authorization, so you can access your storage without having to hard-code that information in your scripts. You can create datastores from these [Azure storage solutions](#matrix). For unsupported storage solutions, we recommend moving your data to a supported Azure storage solution to save data egress cost during machine learning experiments. [Learn how to move your data](#move).
This how-to shows examples of the following tasks:
* Register datastores
* Get datastores from workspace
* Upload and download data using datastores
* Access data during training
* Move data to an Azure storage service
## Prerequisites
You'll need:
- An Azure subscription. If you don’t have an Azure subscription, create a free account before you begin. Try the [free or paid version of Azure Machine Learning](https://aka.ms/AMLFree) today.
- An Azure storage account with an [Azure Blob Container](https://docs.microsoft.com/azure/storage/blobs/storage-blobs-overview) or [Azure File Share](https://docs.microsoft.com/azure/storage/files/storage-files-introduction).
When you register an Azure storage solution as a datastore, you automatically create and register that datastore to a specific workspace.
All the register methods are on the [`Datastore`](https://docs.microsoft.com/python/api/azureml-core/azureml.core.datastore(class)?view=azure-ml-py) class and have the form `register_azure_*`.
The information you need to populate the `register()` method can be found in the [Azure Machine Learning studio](https://ml.azure.com) by following these steps:
1. Select **Storage Accounts** on the left pane and choose the storage account you want to register.
2. The **Overview** page provides information such as the account name and container or file share name.
3. For authentication information, like account key or SAS token, navigate to **Account Keys** under the **Settings** pane on the left.
> [!IMPORTANT]
66
+
> If your storage account is in a VNET, only Azure blob datastore creation is supported. Set the parameter `grant_workspace_access` to `True` to grant your workspace access to your storage account.
The following examples show you how to register an Azure Blob Container or an Azure File Share as a datastore.
For an **Azure Blob Container Datastore**, use [`register_azure_blob_container()`](https://docs.microsoft.com/python/api/azureml-core/azureml.core.datastore(class)?view=azure-ml-py):

```Python
from azureml.core import Datastore

blob_datastore = Datastore.register_azure_blob_container(
    workspace=ws,
    datastore_name='your datastore name',
    container_name='your azure blob container name',
    account_name='your storage account name',
    account_key='your storage account key',
    create_if_not_exists=True)
```
For an **Azure File Share Datastore**, use [`register_azure_file_share()`](https://docs.microsoft.com/python/api/azureml-core/azureml.core.datastore(class)?view=azure-ml-py#register-azure-file-share-workspace--datastore-name--file-share-name--account-name--sas-token-none--account-key-none--protocol-none--endpoint-none--overwrite-false--create-if-not-exists-false--skip-validation-false-).
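A minimal sketch of the call, assuming the same placeholder values as in the blob example above:

```Python
file_datastore = Datastore.register_azure_file_share(
    workspace=ws,
    datastore_name='your datastore name',
    file_share_name='your file share name',
    account_name='your storage account name',
    account_key='your storage account key',
    create_if_not_exists=True)
```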
Create a new datastore in a few steps in Azure Machine Learning studio.
The information you need to populate the form can be found via the [Azure portal](https://portal.azure.com). Select **Storage Accounts** on the left pane and choose the storage account you want to register. The **Overview** page provides information such as the account name and container or file share name. For authentication items, like account key or SAS token, navigate to **Account Keys** under the **Settings** pane on the left.
The following example demonstrates what the form looks like for Azure blob datastore creation.
```Python
# list the datastores registered with the workspace
datastores = ws.datastores
for name, datastore in datastores.items():
    print(name, datastore.datastore_type)
```
When you create a workspace, an Azure Blob Container and an Azure File Share are automatically registered to the workspace as `workspaceblobstore` and `workspacefilestore`, respectively. These store the connection information of the Blob Container and the File Share that are provisioned in the storage account attached to the workspace. The `workspaceblobstore` is set as the default datastore.
To get the workspace's default datastore:
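A one-line sketch, assuming `ws` is your `Workspace` object:

```Python
datastore = ws.get_default_datastore()
```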
The following table lists the methods that tell the compute target how to use the datastore during runs.
Way|Method|Description|
----|-----|--------
Mount| [`as_mount()`](https://docs.microsoft.com/python/api/azureml-core/azureml.data.azure_storage_datastore.abstractazurestoragedatastore?view=azure-ml-py#as-mount--)| Use to mount the datastore on the compute target. When mounted, all files of your datastore are made accessible to your compute target.
Download|[`as_download()`](https://docs.microsoft.com/python/api/azureml-core/azureml.data.azure_storage_datastore.abstractazurestoragedatastore?view=azure-ml-py#as-download-path-on-compute-none-)|Use to download the contents of your datastore to the location specified by `path_on_compute`. <br><br> This download happens before the run.
Upload|[`as_upload()`](https://docs.microsoft.com/python/api/azureml-core/azureml.data.azure_storage_datastore.abstractazurestoragedatastore?view=azure-ml-py#as-upload-path-on-compute-none-)| Use to upload a file from the location specified by `path_on_compute` to your datastore. <br><br> This upload happens after your run.
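For example, a minimal sketch of a download reference (the folder name is a placeholder):

```Python
# download the datastore contents to ./data on the compute target before the run
data_ref = datastore.as_download(path_on_compute='data')
```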
The following code examples are specific to the [`Estimator`](https://docs.microsoft.com/python/api/azureml-train-core/azureml.train.estimator.estimator?view=azure-ml-py) class for accessing data during training.
`script_params` is a dictionary containing parameters to the `entry_script`. Use it to pass in a datastore and describe how data is made available on the compute target. Learn more from our end-to-end [tutorial](tutorial-train-models-with-aml.md).
```Python
from azureml.train.estimator import Estimator

# notice '/' is in front, this indicates the absolute path on the compute target
script_params = {
    '--data_dir': datastore.as_mount()
}

est = Estimator(source_directory='your code directory',
                entry_script='train.py',
                script_params=script_params,
                compute_target=compute_target)
```
`articles/machine-learning/service/how-to-set-up-training-targets.md`
Learn more about [submitting experiments](#submit) at the end of this article.
## What's an estimator?
To facilitate model training using popular frameworks, the Azure Machine Learning Python SDK provides an alternative higher-level abstraction, the estimator class. We recommend using an estimator for training since the class contains methods that allow you to easily construct and customize run configurations. You can create and use a generic [Estimator](https://docs.microsoft.com/python/api/azureml-train-core/azureml.train.estimator?view=azure-ml-py) to submit training scripts that use any learning framework you choose (such as scikit-learn). If you need to make your data files available to your compute target, see [Train with Azure Machine Learning datasets](how-to-train-with-datasets.md).
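As a sketch, assuming `ws` is your workspace and `compute_target` is an existing compute target (the directory, script, and package names are placeholders):

```Python
from azureml.core import Experiment
from azureml.train.estimator import Estimator

est = Estimator(source_directory='./src',          # folder that contains train.py
                entry_script='train.py',
                compute_target=compute_target,
                conda_packages=['scikit-learn'])   # extra dependency to install

run = Experiment(ws, 'my-experiment').submit(est)
```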
For PyTorch, TensorFlow, and Chainer tasks, Azure Machine Learning also provides respective [PyTorch](https://docs.microsoft.com/python/api/azureml-train-core/azureml.train.dnn.pytorch?view=azure-ml-py), [TensorFlow](https://docs.microsoft.com/python/api/azureml-train-core/azureml.train.dnn.tensorflow?view=azure-ml-py), and [Chainer](https://docs.microsoft.com/python/api/azureml-train-core/azureml.train.dnn.chainer?view=azure-ml-py) estimators to simplify using these frameworks.
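For example, a minimal sketch with the TensorFlow estimator, using the same placeholders as above:

```Python
from azureml.train.dnn import TensorFlow

tf_est = TensorFlow(source_directory='./src',
                    entry_script='train.py',
                    compute_target=compute_target)
```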
## Set up with VS Code
You can access, create, and manage the compute targets that are associated with your workspace using the [VS Code extension](how-to-vscode-tools.md#create-and-manage-compute-targets) for Azure Machine Learning.
## <a id="submit"></a>Submit training run using Azure Machine Learning SDK
`articles/machine-learning/service/resource-known-issues.md`
Binary classification charts (precision-recall, ROC, gain curve etc.) shown in automated ML experiment iterations are not displaying correctly in the user interface.
These are known issues for Azure Machine Learning Datasets.
### TypeError: FileNotFound: No such file or directory
This error occurs if the file path you provide isn't where the file is located. You need to make sure the way you refer to the file is consistent with where you mounted your dataset on your compute target. To ensure a deterministic state, we recommend using the absolute path when mounting a dataset to a compute target. For example, in the following code we mount the dataset under the root of the filesystem of the compute target, `/tmp`.
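A sketch of such a mount, assuming `dataset` is a file dataset and `--data-folder` is an argument your entry script expects (both names are placeholders):

```Python
# mount the dataset at the absolute path /tmp/dataset on the compute target
script_params = {
    '--data-folder': dataset.as_named_input('mydataset').as_mount('/tmp/dataset')
}
```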
If you don't include the leading forward slash, '/', you'll need to prefix the path with the working directory on the compute target, e.g. `/mnt/batch/.../tmp/dataset`, to indicate where you want the dataset to be mounted.
### Fail to read Parquet file from HTTP or ADLS Gen 2
There is a known issue in AzureML DataPrep SDK version 1.1.25 that causes a failure when creating a dataset by reading Parquet files from HTTP or ADLS Gen 2. It fails with the error `Cannot seek once reading started`. To fix this issue, upgrade `azureml-dataprep` to a version higher than 1.1.26, or downgrade to a version lower than 1.1.24.
Updates to Azure Machine Learning components installed in an Azure Kubernetes Service cluster must be manually applied.
> [!WARNING]
> Before performing the following actions, check the version of your Azure Kubernetes Service cluster. If the cluster version is equal to or greater than 1.14, you will not be able to reattach your cluster to the Azure Machine Learning workspace.
You can apply these updates by detaching the cluster from the Azure Machine Learning workspace, and then reattaching the cluster to the workspace. If SSL is enabled in the cluster, you will need to supply the SSL certificate and private key when reattaching the cluster.
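A sketch of that flow with the SDK, assuming an AKS cluster named `my-aks` in resource group `my-rg` (both placeholders):

```Python
from azureml.core.compute import AksCompute, ComputeTarget

# detach the cluster from the workspace
aks_target = AksCompute(ws, 'my-aks')
aks_target.detach()

# reattach it; supply SSL details in the attach configuration if SSL is enabled
attach_config = AksCompute.attach_configuration(resource_group='my-rg',
                                                cluster_name='my-aks')
aks_target = ComputeTarget.attach(ws, 'my-aks', attach_config)
```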
If you are running into ModuleErrors while submitting experiments in Azure ML, it means the training script expects a package that isn't installed in the training environment.
If you are using [Estimators](concept-azure-machine-learning-architecture.md#estimators) to submit experiments, you can specify a package name via the `pip_packages` or `conda_packages` parameter in the estimator, depending on the source from which you want to install the package. You can also specify a yml file with all your dependencies using `conda_dependencies_file`, or list all your pip requirements in a txt file using the `pip_requirements_file` parameter.
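For instance, a sketch with placeholder script and package names:

```Python
from azureml.train.estimator import Estimator

est = Estimator(source_directory='./src',
                entry_script='train.py',
                compute_target=compute_target,
                pip_packages=['pandas'],      # installed with pip
                conda_packages=['numpy'])     # installed with conda
```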
Azure ML also provides framework-specific estimators for Tensorflow, PyTorch, Chainer and SKLearn. Using these estimators will make sure that the framework dependencies are installed on your behalf in the environment used for training. You have the option to specify extra dependencies as described above.
Azure ML maintained docker images and their contents can be seen in [AzureML Containers](https://github.com/Azure/AzureML-Containers).
Framework-specific dependencies are listed in the respective framework documentation - [Chainer](https://docs.microsoft.com/python/api/azureml-train-core/azureml.train.dnn.chainer?view=azure-ml-py#remarks), [PyTorch](https://docs.microsoft.com/python/api/azureml-train-core/azureml.train.dnn.pytorch?view=azure-ml-py#remarks), [TensorFlow](https://docs.microsoft.com/python/api/azureml-train-core/azureml.train.dnn.tensorflow?view=azure-ml-py#remarks), [SKLearn](https://docs.microsoft.com/python/api/azureml-train-core/azureml.train.sklearn.sklearn?view=azure-ml-py#remarks).
> [!NOTE]
> If you think a particular package is common enough to be added to Azure ML maintained images and environments, please raise a GitHub issue in [AzureML Containers](https://github.com/Azure/AzureML-Containers).
### NameError (Name not defined), AttributeError (Object has no attribute)
This exception should come from your training scripts. You can look at the log files from the Azure portal to get more information about the specific name not defined or attribute error. From the SDK, you can use `run.get_details()` to look at the error message; this also lists all the log files generated for your run. Make sure to take a look at your training script and fix the error before retrying.
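A sketch, assuming `run` is your submitted run object:

```Python
details = run.get_details()
print(details.get('error'))      # error message, if the run failed
print(details.get('logFiles'))   # log files generated for the run
```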
### Horovod is shut down
In most cases, this exception means there was an underlying exception in one of the processes that caused Horovod to shut down. Each rank in the MPI job gets its own dedicated log file in Azure ML. These logs are named `70_driver_logs`. In case of distributed training, the log names are suffixed with `_rank` to make it easier to differentiate the logs. To find the exact error that caused the Horovod shutdown, go through all the log files and look for `Traceback` at the end of the driver_log files. One of these files will give you the actual underlying exception.
## Labeling projects issues
Manually refresh the page. Initialization should proceed at roughly 20 datapoints per second.
To load all labeled images, choose the **First** button. The **First** button will take you back to the front of the list, but loads all labeled data.
### Pressing the Esc key while labeling for object detection creates a zero-size label in the top-left corner. Submitting labels in this state fails.
Delete the label by clicking on the cross mark next to it.