Commit 44aacc2 (parent 2e32583)

Acrolinx updates . . .

1 file changed: 4 additions, 4 deletions


articles/machine-learning/v1/how-to-train-with-datasets.md

@@ -235,7 +235,7 @@ When you **mount** a dataset, you attach the files referenced by the dataset to
 When you **download** a dataset, all the files referenced by the dataset are downloaded to the compute target. Downloading is supported for all compute types. If your script processes all files referenced by the dataset, and your compute disk can fit your full dataset, downloading is recommended to avoid the overhead of streaming data from storage services. For multi-node downloads, see [how to avoid throttling](#troubleshooting).
 
 > [!NOTE]
-> The download path name should not be longer than 255 alpha-numeric characters for Windows OS. For Linux OS, the download path name should not be longer than 4,096 alpha-numeric characters. Also, for Linux OS the file name (which is the last segment of the download path `/path/to/file/(unknown)`) should not be longer than 255 alpha-numeric characters.
+> The download path name shouldn't be longer than 255 alpha-numeric characters for Windows OS. For Linux OS, the download path name shouldn't be longer than 4,096 alpha-numeric characters. Also, for Linux OS the file name (which is the last segment of the download path `/path/to/file/(unknown)`) shouldn't be longer than 255 alpha-numeric characters.
 
 The following code mounts `dataset` to the temp directory at `mounted_path`
 
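The path-length limits in the note above can be checked before a download starts. A minimal sketch in plain Python (not Azure Machine Learning SDK code; the helper name `is_valid_download_path` is ours, and the limits are the ones quoted in the note: 255 characters for a Windows path, 4,096 for a Linux path, and 255 for the final Linux path segment):

```python
import os

# Limits quoted in the note above.
WINDOWS_MAX_PATH = 255   # whole download path on Windows
LINUX_MAX_PATH = 4096    # whole download path on Linux
LINUX_MAX_NAME = 255     # last segment (file name) on Linux

def is_valid_download_path(path: str, os_name: str) -> bool:
    """Return True if `path` fits the documented download-path limits."""
    if os_name == "windows":
        return len(path) <= WINDOWS_MAX_PATH
    if os_name == "linux":
        filename = os.path.basename(path)
        return len(path) <= LINUX_MAX_PATH and len(filename) <= LINUX_MAX_NAME
    raise ValueError(f"unknown OS: {os_name}")

print(is_valid_download_path("/tmp/data/titanic.csv", "linux"))  # True
print(is_valid_download_path("/tmp/" + "a" * 300, "linux"))      # False: file name > 255 chars
```

Checking the target path up front gives a clearer error than a failed download partway through a run.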
@@ -255,7 +255,7 @@ print (mounted_path)
 
 ## Get datasets in machine learning scripts
 
-Registered datasets are accessible both locally and remotely on compute clusters like the Azure Machine Learning compute. To access your registered dataset across experiments, use the following code to access your workspace and get the dataset that was used in your previously submitted run. By default, the [`get_by_name()`](/python/api/azureml-core/azureml.core.dataset.dataset#get-by-name-workspace--name--version--latest--) method on the `Dataset` class returns the latest version of the dataset that's registered with the workspace.
+Registered datasets are accessible both locally and remotely on compute clusters like the Azure Machine Learning compute. To access your registered dataset across experiments, use the following code to access your workspace and get the dataset that was used in your previously submitted run. By default, the [`get_by_name()`](/python/api/azureml-core/azureml.core.dataset.dataset#get-by-name-workspace--name--version--latest--) method on the `Dataset` class returns the latest version of the dataset registered with the workspace.
 
 ```Python
 %%writefile $script_folder/train.py
@@ -276,7 +276,7 @@ df = titanic_ds.to_pandas_dataframe()
 
 ## Access source code during training
 
-Azure Blob storage has higher throughput speeds than an Azure file share, and will scale to large numbers of jobs started in parallel. For this reason, we recommend configuring your runs to use Blob storage for transferring source code files.
+Azure Blob storage has higher throughput speeds than an Azure file share, and scales to large numbers of jobs started in parallel. For this reason, we recommend configuring your runs to use Blob storage for transferring source code files.
 
 The following code example specifies in the run configuration which blob datastore to use for source code transfers.
 
@@ -313,7 +313,7 @@ myenv.environment_variables = {"AZUREML_DOWNLOAD_CONCURRENCY":64}
 
 **Unable to upload project files to working directory in AzureFile because the storage is overloaded**:
 
-* If you use file share for other workloads, such as data transfer, the recommendation is to use blobs so that file share is free to be used for submitting runs.
+* If you use file share for other workloads, such as data transfer, we recommend that you use blobs so that file share is free to be used for submitting runs.
 
 * You can also split the workload between two different workspaces.
 
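The `AZUREML_DOWNLOAD_CONCURRENCY` setting shown in the hunk header above caps how many files are fetched at once, which is how throttling during multi-node downloads is avoided. As an illustration of the general idea only (plain Python with a hypothetical `download_one` stand-in, not Azure Machine Learning SDK code), a semaphore bounds the number of transfers in flight:

```python
import threading
import time

MAX_CONCURRENCY = 4                         # analogous to AZUREML_DOWNLOAD_CONCURRENCY
slots = threading.Semaphore(MAX_CONCURRENCY)
lock = threading.Lock()
active = 0
peak = 0

def download_one(name):
    """Hypothetical per-file download; tracks how many run at once."""
    global active, peak
    with slots:                             # blocks once MAX_CONCURRENCY are running
        with lock:
            active += 1
            peak = max(peak, active)
        time.sleep(0.005)                   # stand-in for the actual transfer
        with lock:
            active -= 1

threads = [threading.Thread(target=download_one, args=(f"part_{i}.csv",))
           for i in range(16)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(f"peak concurrency: {peak} (cap {MAX_CONCURRENCY})")
```

The semaphore guarantees that `peak` never exceeds the cap, which is why lowering the cap eases pressure on the storage account at the cost of slower overall downloads.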