articles/machine-learning/service/how-to-set-up-training-targets.md
Lines changed: 2 additions & 2 deletions
@@ -48,7 +48,7 @@ Learn more about [submitting experiments](#submit) at the end of this article.
## What's an estimator?
-
To facilitate model training using popular frameworks, the Azure Machine Learning Python SDK provides an alternative higher-level abstraction, the estimator class. We recommend using an esimator for training since the class contains methods that allow you to easily construct and customize run configurations. You can create and use a generic [Estimator](https://docs.microsoft.com/python/api/azureml-train-core/azureml.train.estimator?view=azure-ml-py) to submit training scripts that use any learning framework you choose (such as scikit-learn). If you need to make your data files available to your compute target, see [Train with Azure Machine Learning datasets](how-to-train-with-datasets.md).
+
To facilitate model training using popular frameworks, the Azure Machine Learning Python SDK provides an alternative higher-level abstraction, the estimator class. We recommend using an estimator for training since the class contains methods that allow you to easily construct and customize run configurations. You can create and use a generic [Estimator](https://docs.microsoft.com/python/api/azureml-train-core/azureml.train.estimator?view=azure-ml-py) to submit training scripts that use any learning framework you choose (such as scikit-learn). If you need to make your data files available to your compute target, see [Train with Azure Machine Learning datasets](how-to-train-with-datasets.md).
For PyTorch, TensorFlow, and Chainer tasks, Azure Machine Learning also provides respective [PyTorch](https://docs.microsoft.com/python/api/azureml-train-core/azureml.train.dnn.pytorch?view=azure-ml-py), [TensorFlow](https://docs.microsoft.com/python/api/azureml-train-core/azureml.train.dnn.tensorflow?view=azure-ml-py), and [Chainer](https://docs.microsoft.com/python/api/azureml-train-core/azureml.train.dnn.chainer?view=azure-ml-py) estimators to simplify using these frameworks.
@@ -359,7 +359,7 @@ For more information, see [Resource management](reference-azure-machine-learning
## Set up with VS Code
-
You can access, create and manage the compute targets that are associated with your workspace using the [VS Code extension](how-to-vscode-tools.md#create-and-manage-compute-targets) for Azure Machine Learning.
+
You can access, create, and manage the compute targets that are associated with your workspace using the [VS Code extension](how-to-vscode-tools.md#create-and-manage-compute-targets) for Azure Machine Learning.
## <a id="submit"></a>Submit training run using Azure Machine Learning SDK
articles/machine-learning/service/resource-known-issues.md
Lines changed: 9 additions & 9 deletions
@@ -88,9 +88,9 @@ Binary classification charts (precision-recall, ROC, gain curve etc.) shown in a
These are known issues for Azure Machine Learning Datasets.
-
### TypeError: File not found
+
### TypeError: FileNotFound: No such file or directory
-
This error occurs if you attempt to use the relative path instead of the absolute path of the file(s) in your datastore or dataset that you want to mount to your compute target. When you use `as_mount()` or `mount()` include a leading forward slash, `/`, to ensure you are mounting your dataset relative to your compute target, instead of your working directory.
+
This error occurs if the file path you attempt to mount to your compute target can't be found. This happens if you use the relative path instead of the absolute path of the file(s) in your datastore or dataset. When you use `as_mount()` or `mount()` include a leading forward slash, `/`, to ensure you are mounting your dataset relative to your compute target, instead of your working directory.
```python
# Note the leading / in '/tmp/dataset'
@@ -221,9 +221,9 @@ az aks get-credentials -g <rg> -n <aks cluster name>
Updates to Azure Machine Learning components installed in an Azure Kubernetes Service cluster must be manually applied.
> [!WARNING]
-
> Before performing the following actions, check the version of your Azure Kubernetes Service cluster. If the cluster version is equal to or greater than 1.14, you will not be able to re-attach your cluster to the Azure Machine Learning workspace.
+
> Before performing the following actions, check the version of your Azure Kubernetes Service cluster. If the cluster version is equal to or greater than 1.14, you will not be able to reattach your cluster to the Azure Machine Learning workspace.
-
You can apply these updates by detaching the cluster from the Azure Machine Learning workspace, and then re-attaching the cluster to the workspace. If SSL is enabled in the cluster, you will need to supply the SSL certificate and private key when re-attaching the cluster.
+
You can apply these updates by detaching the cluster from the Azure Machine Learning workspace, and then reattaching the cluster to the workspace. If SSL is enabled in the cluster, you will need to supply the SSL certificate and private key when reattaching the cluster.
@@ -261,19 +261,19 @@ If you are running into ModuleErrors while submitting experiments in Azure ML, i
If you are using [Estimators](concept-azure-machine-learning-architecture.md#estimators) to submit experiments, you can specify a package name via `pip_packages` or `conda_packages` parameter in the estimator based on from which source you want to install the package. You can also specify a yml file with all your dependencies using `conda_dependencies_file`or list all your pip requirements in a txt file using `pip_requirements_file` parameter.
-
Azure ML also provides frameworkspecific estimators for Tensorflow, PyTorch, Chainer and SKLearn. Using these estimators will make sure that the framework dependencies are installed on your behalf in the environment used for training. You have the option to specify extra dependencies as described above.
+
Azure ML also provides framework-specific estimators for Tensorflow, PyTorch, Chainer and SKLearn. Using these estimators will make sure that the framework dependencies are installed on your behalf in the environment used for training. You have the option to specify extra dependencies as described above.
Azure ML maintained docker images and their contents can be seen in [AzureML Containers](https://github.com/Azure/AzureML-Containers).
-
Frameworkspecific dependencies are listed in the respective framework documentation - [Chainer](https://docs.microsoft.com/python/api/azureml-train-core/azureml.train.dnn.chainer?view=azure-ml-py#remarks), [PyTorch](https://docs.microsoft.com/python/api/azureml-train-core/azureml.train.dnn.pytorch?view=azure-ml-py#remarks), [TensorFlow](https://docs.microsoft.com/python/api/azureml-train-core/azureml.train.dnn.tensorflow?view=azure-ml-py#remarks), [SKLearn](https://docs.microsoft.com/python/api/azureml-train-core/azureml.train.sklearn.sklearn?view=azure-ml-py#remarks).
+
Framework-specific dependencies are listed in the respective framework documentation - [Chainer](https://docs.microsoft.com/python/api/azureml-train-core/azureml.train.dnn.chainer?view=azure-ml-py#remarks), [PyTorch](https://docs.microsoft.com/python/api/azureml-train-core/azureml.train.dnn.pytorch?view=azure-ml-py#remarks), [TensorFlow](https://docs.microsoft.com/python/api/azureml-train-core/azureml.train.dnn.tensorflow?view=azure-ml-py#remarks), [SKLearn](https://docs.microsoft.com/python/api/azureml-train-core/azureml.train.sklearn.sklearn?view=azure-ml-py#remarks).
>[Note!]
> If you think a particular package is common enough to be added in Azure ML maintained images and environments please raise a GitHub issue in [AzureML Containers](https://github.com/Azure/AzureML-Containers).
### NameError (Name not defined), AttributeError (Object has no attribute)
This exception should come from your training scripts. You can look at the log files from Azure portal to get more information about the specific name not defined or attribute error. From the SDK, you can use `run.get_details()` to look at the error message. This will also list all the log files generated for your run. Please make sure to take a look at your training script, fix the error before retrying.
-
### Horovod is shutdown
-
In most cases, this exception means there was an underlying exception in one of the processes that caused horovod to shutdown. Each rank in the MPI job gets it own dedicated log file in Azure ML. These logs are named `70_driver_logs`. In case of distributed training, the log names are suffixed with `_rank` to make it easy to differentiate the logs. To find the exact error that caused horovod shutdown, go through all the log files and look for `Traceback` at the end of the driver_log files. One of these files will give you the actual underlying exception.
+
### Horovod is shut down
+
In most cases, this exception means there was an underlying exception in one of the processes that caused horovod to shut down. Each rank in the MPI job gets it own dedicated log file in Azure ML. These logs are named `70_driver_logs`. In case of distributed training, the log names are suffixed with `_rank` to make it easy to differentiate the logs. To find the exact error that caused horovod shutdown, go through all the log files and look for `Traceback` at the end of the driver_log files. One of these files will give you the actual underlying exception.
## Labeling projects issues
@@ -291,6 +291,6 @@ Manually refresh the page. Initialization should proceed at roughly 20 datapoint
To load all labeled images, choose the **First** button. The **First** button will take you back to the front of the list, but loads all labeled data.
-
### Pressing Esc key while labeling for object detection creates a zero size label on the topleft corner. Submitting labels in this state fails.
+
### Pressing Esc key while labeling for object detection creates a zero size label on the top-left corner. Submitting labels in this state fails.
Delete the label by clicking on the cross mark next to it.