Commit ffb2c8f

Merge pull request #108092 from lobrien/1677829-Conda-pinning
Added notes and crosslinks on environment caching
2 parents e5396a9 + dcac92b

3 files changed (+11, -5 lines)

articles/machine-learning/concept-environments.md

Lines changed: 5 additions & 2 deletions
@@ -8,7 +8,7 @@ ms.subservice: core
 ms.topic: conceptual
 ms.author: trbye
 author: trevorbye
-ms.date: 01/06/2020
+ms.date: 03/18/2020
 ---
 
 # What are Azure Machine Learning environments?
@@ -87,7 +87,10 @@ See the following diagram that shows three environment definitions. Two of them
 
 ![Diagram of environment caching as Docker images](./media/concept-environments/environment-caching.png)
 
-If you create an environment with unpinned package dependency, for example ```numpy```, that environment will keep using the package version installed at the time of environment creation. Also, any future environment with matching definition will keep using the old version. To update the package, specify a version number to force image rebuild, for example ```numpy==1.18.1```. Note that new dependencies, including nested ones will be installed that might break a previously working scenario
+>[!IMPORTANT]
+> If you create an environment with an unpinned package dependency, for example ```numpy```, that environment keeps using the package version installed _at the time of environment creation_. Also, any future environment with a matching definition keeps using the old version.
+
+To update the package, specify a version number to force an image rebuild, for example ```numpy==1.18.1```. Note that new dependencies, including nested ones, will be installed, which might break a previously working scenario.
 
 > [!WARNING]
 > The [Environment.build](https://docs.microsoft.com/python/api/azureml-core/azureml.core.environment.environment?view=azure-ml-py#build-workspace--image-build-compute-none-) method rebuilds the cached image, with the possible side effect of updating unpinned packages and breaking reproducibility for all environment definitions that correspond to that cached image.
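Because an unpinned dependency silently freezes at its creation-time version, it can help to audit a dependency list for unpinned entries before creating an environment. A minimal sketch, where `is_pinned` is a hypothetical helper and not part of the Azure ML SDK:

```python
import re

def is_pinned(requirement: str) -> bool:
    """Return True if a pip/conda-style requirement pins an exact version
    (e.g. 'numpy==1.18.1'). Hypothetical helper, not an Azure ML SDK API."""
    return re.search(r"==\s*[\w.]+", requirement) is not None

deps = ["numpy", "scikit-learn==0.21.3", "numpy==1.18.1"]
unpinned = [d for d in deps if not is_pinned(d)]
print(unpinned)  # ['numpy'] -- this one will freeze at its creation-time version
```

Flagging `numpy` here before environment creation avoids the stale-cache surprise the note above describes.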

articles/machine-learning/how-to-debug-pipelines.md

Lines changed: 2 additions & 2 deletions
@@ -8,7 +8,7 @@ ms.subservice: core
 ms.topic: conceptual
 author: likebupt
 ms.author: keli19
-ms.date: 12/12/2019
+ms.date: 03/18/2020
 ---
 
 # Debug and troubleshoot machine learning pipelines
@@ -76,7 +76,7 @@ The following table contains common problems during pipeline development, with p
 | Problem | Possible solution |
 |--|--|
 | Unable to pass data to `PipelineData` directory | Ensure you have created a directory in the script that corresponds to where your pipeline expects the step output data. In most cases, an input argument will define the output directory, and then you create the directory explicitly. Use `os.makedirs(args.output_dir, exist_ok=True)` to create the output directory. See the [tutorial](tutorial-pipeline-batch-scoring-classification.md#write-a-scoring-script) for a scoring script example that shows this design pattern. |
-| Dependency bugs | If you have developed and tested scripts locally but find dependency issues when running on a remote compute in the pipeline, ensure your compute environment dependencies and versions match your test environment. |
+| Dependency bugs | If you have developed and tested scripts locally but find dependency issues when running on a remote compute in the pipeline, ensure your compute environment dependencies and versions match your test environment. (See [Environment building, caching, and reuse](https://docs.microsoft.com/azure/machine-learning/concept-environments#environment-building-caching-and-reuse).) |
 | Ambiguous errors with compute targets | Deleting and re-creating compute targets can solve certain issues with compute targets. |
 | Pipeline not reusing steps | Step reuse is enabled by default, but ensure you haven't disabled it in a pipeline step. If reuse is disabled, the `allow_reuse` parameter in the step will be set to `False`. |
 | Pipeline is rerunning unnecessarily | To ensure that steps only rerun when their underlying data or scripts change, decouple your directories for each step. If you use the same source directory for multiple steps, you may experience unnecessary reruns. Use the `source_directory` parameter on a pipeline step object to point to your isolated directory for that step, and ensure you aren't using the same `source_directory` path for multiple steps. |
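The `PipelineData` row in the table above describes a common pattern: the step script receives its output location as an argument and must create the directory explicitly before writing. A minimal sketch of that pattern (the `--output_dir` argument name and demo path are illustrative; in a real run the pipeline passes the `PipelineData` path):

```python
import argparse
import os
import tempfile

# Demo output path; in a real pipeline run this value comes from PipelineData.
out_dir = os.path.join(tempfile.gettempdir(), "pipeline_step_demo", "scores")

# Typical step script: the output location arrives as a command-line argument.
parser = argparse.ArgumentParser()
parser.add_argument("--output_dir", required=True)
args, _ = parser.parse_known_args(["--output_dir", out_dir])

# Create the directory explicitly before writing step output;
# exist_ok=True makes this safe if the directory already exists.
os.makedirs(args.output_dir, exist_ok=True)
with open(os.path.join(args.output_dir, "results.txt"), "w") as f:
    f.write("step output\n")
```

Forgetting the `os.makedirs` call is the usual cause of the "unable to pass data to `PipelineData` directory" failure.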

articles/machine-learning/how-to-use-environments.md

Lines changed: 4 additions & 1 deletion
@@ -9,7 +9,7 @@ ms.reviewer: nibaccam
 ms.service: machine-learning
 ms.subservice: core
 ms.topic: conceptual
-ms.date: 02/27/2020
+ms.date: 03/18/2020
 
 ## As a developer, I need to configure my experiment context with the necessary software packages so my machine learning models can be trained and deployed on different compute targets.
 

@@ -155,6 +155,9 @@ conda_dep.add_conda_package("scikit-learn==0.21.3")
 myenv.python.conda_dependencies=conda_dep
 ```
 
+>[!IMPORTANT]
+> If you use the same environment definition for another run, the Azure Machine Learning service reuses the cached image of your environment. If you create an environment with an unpinned package dependency, for example ```numpy```, that environment keeps using the package version installed _at the time of environment creation_. Also, any future environment with a matching definition keeps using the old version. For more information, see [Environment building, caching, and reuse](https://docs.microsoft.com/azure/machine-learning/concept-environments#environment-building-caching-and-reuse).
+
 ### Private wheel files
 
 You can use private pip wheel files by first uploading them to your workspace storage. You upload them by using a static [`add_private_pip_wheel()`](https://docs.microsoft.com/python/api/azureml-core/azureml.core.environment.environment?view=azure-ml-py#add-private-pip-wheel-workspace--file-path--exist-ok-false-) method. Then you capture the storage URL and pass the URL to the `add_pip_package()` method.
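The caching behavior in the note above can be pictured as keying a cache on the environment definition itself. The sketch below is an illustrative analogy only, not Azure ML's actual hashing scheme: identical definitions map to one cache key (so the old image is reused), while pinning a version changes the key and forces a rebuild:

```python
import hashlib
import json

def definition_hash(packages):
    """Hash a normalized package list; identical definitions share one
    cache key. Illustrative analogy only -- not Azure ML's real mechanism."""
    normalized = json.dumps(sorted(packages))
    return hashlib.sha256(normalized.encode()).hexdigest()[:12]

# An unpinned spec hashes the same even after the package releases a new
# version, so the cached image (with the old version inside) is reused:
assert definition_hash(["numpy"]) == definition_hash(["numpy"])

# Pinning a version changes the definition, producing a new cache key
# and therefore a rebuild:
assert definition_hash(["numpy==1.18.1"]) != definition_hash(["numpy"])
```

This is why the docs recommend pinning (for example `numpy==1.18.1`) when you need the image to be rebuilt with an updated package.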
