Commit fe41c38

Merge pull request #107802 from gfitzgerald42/patch-2
Provide additional information on vulnerabilities
2 parents 92c92fd + d5f2db1 commit fe41c38

1 file changed

+59
-17
lines changed

articles/machine-learning/how-to-troubleshoot-environments.md

Lines changed: 59 additions & 17 deletions
Original file line number | Diff line number | Diff line change
@@ -13,9 +13,9 @@ ms.topic: troubleshooting
1313
ms.custom: devx-track-python, event-tier1-build-2022, ignite-2022
1414
---
1515

16-
# Troubleshooting environment image builds using troubleshooting log error messages
16+
# Troubleshooting environment issues
1717

18-
In this article, learn how to troubleshoot common problems you may encounter with environment image builds.
18+
In this article, learn how to troubleshoot common problems you may encounter with environment image builds and learn about AzureML environment vulnerabilities.
1919

2020
We are actively seeking your feedback! If you navigated to this page via your Environment Definition or Build Failure Analysis logs, we'd like to know if the feature was helpful to you, or if you'd like to report a failure scenario that isn't yet covered by our analysis. You can also leave feedback on this documentation. Leave your thoughts [here](https://aka.ms/azureml/environment/log-analysis-feedback).
2121

@@ -54,9 +54,7 @@ Multiple environments with the same definition may result in the same cached ima
5454

5555
Running a training script remotely requires the creation of a Docker image.
5656

57-
### Reproducibility and vulnerabilities
58-
59-
#### *Vulnerabilities*
57+
## Vulnerabilities in AzureML Environments
6058

6159
You can address vulnerabilities by upgrading to a newer version of a dependency (base image, Python package, etc.) or by migrating to a different dependency that satisfies security
6260
requirements. Mitigating vulnerabilities is time consuming and costly since it can require refactoring of code and infrastructure. With the prevalence
@@ -68,38 +66,82 @@ There are some ways to decrease the impact of vulnerabilities:
6866
- Compartmentalize your environment so you can scope and fix issues in one place.
6967
- Understand flagged vulnerabilities and their relevance to your scenario.
7068

71-
#### *Vulnerabilities vs Reproducibility*
69+
### Scan for Vulnerabilities
70+
71+
You can monitor and maintain environment hygiene with [Microsoft Defender for Container Registry](../defender-for-cloud/defender-for-containers-vulnerability-assessment-azure.md) to help scan images for vulnerabilities.
72+
73+
To automate this process based on triggers from Microsoft Defender, see [Automate responses to Microsoft Defender for Cloud triggers](../defender-for-cloud/workflow-automation.md).
74+
75+
### Vulnerabilities vs Reproducibility
7276

7377
Reproducibility is one of the foundations of software development. When you're developing production code, a repeated operation must guarantee the same
7478
result. Mitigating vulnerabilities can disrupt reproducibility by changing dependencies.
7579

7680
Azure Machine Learning's primary focus is to guarantee reproducibility. Environments fall under three categories: curated,
7781
user-managed, and system-managed.
7882

79-
**Curated environments** are pre-created environments that Azure Machine Learning manages and are available by default in every Azure Machine Learning workspace provisioned.
83+
### *Curated Environments*
8084

81-
They contain collections of Python packages and settings to help you get started with various machine learning frameworks. You're meant to use them as is.
82-
These pre-created environments also allow for faster deployment time.
85+
Curated environments are pre-created environments that Azure Machine Learning manages and are available by default in every Azure Machine Learning workspace provisioned. New versions are released by Azure Machine Learning to address vulnerabilities. Whether you use the latest image may be a tradeoff between reproducibility and vulnerability management.
86+
87+
Curated environments contain collections of Python packages and settings to help you get started with various machine learning frameworks. You're meant to use them as is. These pre-created environments also allow for faster deployment time.
88+
89+
### *User-managed Environments*
8390

84-
In **user-managed environments**, you're responsible for setting up your environment and installing every package that your training script needs on the
91+
In user-managed environments, you're responsible for setting up your environment and installing every package that your training script needs on the
8592
compute target and for model deployment. These types of environments have two subtypes:
8693

8794
- BYOC (bring your own container): the user provides a Docker image to Azure Machine Learning
8895
- Docker build context: Azure Machine Learning materializes the image from the user-provided content
8996

90-
Once you install more dependencies on top of a Microsoft-provided image, or bring your own base image, vulnerability
91-
management becomes your responsibility.
97+
Once you install more dependencies on top of a Microsoft-provided image, or bring your own base image, vulnerability management becomes your responsibility.
9298

93-
You use **system-managed environments** when you want conda to manage the Python environment for you. Azure Machine Learning creates a new isolated conda environment by materializing your conda specification on top of a base Docker image. While Azure Machine Learning patches base images with each release, whether you use the
99+
### *System-managed Environments*
100+
101+
You use system-managed environments when you want conda to manage the Python environment for you. Azure Machine Learning creates a new isolated conda environment by materializing your conda specification on top of a base Docker image. While Azure Machine Learning patches base images with each release, whether you use the
94102
latest image may be a tradeoff between reproducibility and vulnerability management. So, it's your responsibility to choose the environment version used
95103
for your jobs or model deployments while using system-managed environments.
96104

105+
### Vulnerabilities: Common Issues
106+
107+
### *Vulnerabilities in Base Docker Images*
108+
109+
System vulnerabilities in an environment are usually introduced from the base image. For example, vulnerabilities marked as "Ubuntu" or "Debian" come from the system level of the environment, that is, the base Docker image. If the base image is from a third-party issuer, check whether the latest version fixes the flagged vulnerabilities. The most common sources of base images in Azure Machine Learning are:
110+
111+
- Microsoft Artifact Registry (MAR), also known as Microsoft Container Registry (mcr.microsoft.com).
112+
- Images can be listed from the MAR homepage, by calling the `_catalog` API, or by calling [/tags/list](https://mcr.microsoft.com/v2/azureml/openmpi4.1.0-ubuntu20.04/tags/list)
113+
- Source and release notes for training base images from AzureML can be found in [Azure/AzureML-Containers](https://github.com/Azure/AzureML-Containers)
114+
- NVIDIA (nvcr.io, or [NVIDIA's profile](https://hub.docker.com/u/nvidia/#!))
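
As a rough illustration of the `/tags/list` call mentioned in the list above, the following sketch fetches and parses the tags for one repository. It assumes the endpoint returns the standard container registry response shape (a JSON object with a `tags` array). The helper names `parse_tags` and `list_tags` are hypothetical and not part of any Azure SDK.

```python
import json
import urllib.request

# Registry host and repository are taken from the /tags/list link above.
REGISTRY = "https://mcr.microsoft.com"
REPOSITORY = "azureml/openmpi4.1.0-ubuntu20.04"


def parse_tags(payload: str) -> list[str]:
    """Extract tag names from a /tags/list JSON response body.

    Assumes the standard registry shape: {"name": "...", "tags": [...]}.
    """
    document = json.loads(payload)
    # A repository without tags may report null instead of a list.
    return list(document.get("tags") or [])


def list_tags(repository: str = REPOSITORY, timeout: float = 10.0) -> list[str]:
    """Fetch and parse the tag list for one repository (needs network access)."""
    url = f"{REGISTRY}/v2/{repository}/tags/list"
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return parse_tags(response.read().decode("utf-8"))
```

Calling `list_tags()` returns the available tags as a list of strings, which you can compare against the tag your environment currently references.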
115+
116+
If the latest version of your base image doesn't resolve the flagged vulnerabilities, you can address them by installing the package versions recommended by a vulnerability scan:
117+
118+
```
119+
apt-get install -y library_name
120+
```
121+
122+
### *Vulnerabilities in Python Packages*
123+
124+
Vulnerabilities can also come from Python packages installed on top of the system-managed base image. Resolve these Python-related vulnerabilities by updating your Python dependencies. Python (pip) vulnerabilities in the image usually come from user-defined dependencies.
125+
126+
To search for known Python vulnerabilities and their fixes, see the [GitHub Advisory Database](https://github.com/advisories). To address a Python vulnerability, update the package to a version that fixes the flagged issue:
127+
128+
```
129+
pip install --upgrade my_package=={good.version}
130+
```
131+
132+
If you're using a conda environment, update the reference in the conda dependencies file.
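
After updating a dependency, you might want to confirm that the version actually installed in the environment is at or above the fixed version from the advisory. The following is a minimal sketch using only the standard library. The helpers `version_key`, `is_fixed`, and `check_package` are hypothetical names, and the comparison only handles plain numeric versions such as `1.26.5`. For pre-releases or other complex version strings, the `packaging` library's `Version` class is a more robust choice.

```python
import re
from importlib import metadata


def version_key(version: str) -> tuple[int, ...]:
    """Turn '1.26.5' into (1, 26, 5), reading only leading digits per segment."""
    key = []
    for segment in version.split("."):
        match = re.match(r"\d+", segment)
        if match is None:
            break  # stop at the first non-numeric segment, such as 'dev0'
        key.append(int(match.group()))
    return tuple(key)


def is_fixed(installed: str, minimum_fixed: str) -> bool:
    """Return True when `installed` is at or above `minimum_fixed`."""
    a, b = version_key(installed), version_key(minimum_fixed)
    width = max(len(a), len(b))
    # Pad with zeros so that '2.0' and '2.0.0' compare as equal.
    a += (0,) * (width - len(a))
    b += (0,) * (width - len(b))
    return a >= b


def check_package(name: str, minimum_fixed: str) -> bool:
    """Compare the installed version of `name` against the fixed version."""
    try:
        installed = metadata.version(name)
    except metadata.PackageNotFoundError:
        return True  # package isn't installed, so it isn't affected
    return is_fixed(installed, minimum_fixed)
```

For example, `check_package("my_package", "1.2.3")` returns `False` if an older version of the placeholder package `my_package` is still installed.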
133+
134+
In some cases, conda automatically installs Python packages while setting up your environment on top of a base Docker image. Mitigation steps for those packages are the same as for user-introduced packages. Conda installs the necessary dependencies for every environment it materializes. Packages like cryptography, setuptools, and wheel are installed automatically from conda's default channels. There's a known issue with the default anaconda channel missing the latest package versions, so we recommend prioritizing the community-maintained conda-forge channel. Otherwise, explicitly specify packages and versions, even if you don't reference them in the code you plan to run in that environment.
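
To make the two recommendations above concrete, here's a small sketch that inspects a conda specification, represented as a Python dictionary with the same structure as a conda dependencies file. It flags a channel list that doesn't start with conda-forge (conda gives channels listed first the highest priority) and any package without a version. The function name `find_issues` and the sample specification are hypothetical and are only meant to illustrate the checks.

```python
def find_issues(spec: dict) -> list[str]:
    """Report channel-priority and unpinned-package issues in a conda spec."""
    issues = []

    # Channels are searched in order, so the first entry has top priority.
    channels = spec.get("channels", [])
    if not channels or channels[0] != "conda-forge":
        issues.append("conda-forge is not the highest-priority channel")

    for dep in spec.get("dependencies", []):
        if isinstance(dep, dict):
            # Nested pip section: {"pip": ["package==1.0", ...]}
            for pip_dep in dep.get("pip", []):
                if pip_dep.startswith("-"):
                    continue  # skip pip options such as '-r requirements.txt'
                if "==" not in pip_dep:
                    issues.append(f"pip package is not pinned: {pip_dep}")
        elif not any(char in dep for char in "=<>"):
            issues.append(f"conda package has no version: {dep}")
    return issues


# Hypothetical specification used only to demonstrate the checks.
example_spec = {
    "channels": ["defaults", "conda-forge"],
    "dependencies": [
        "python=3.8",
        "cryptography",
        {"pip": ["my_package==1.0.0", "requests"]},
    ],
}

for issue in find_issues(example_spec):
    print(issue)
```

With the sample specification, the script reports the channel order, the unversioned `cryptography` entry, and the unpinned `requests` entry.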
135+
136+
### *Cache issues*
137+
97138
Associated with your Azure Machine Learning workspace is an Azure Container Registry instance that serves as a cache for container images. Any image
98139
materialized is pushed to the container registry and used if you trigger experimentation or deployment for the corresponding environment. Azure
99-
Machine Learning doesn't delete images from your container registry, and it's your responsibility to evaluate which images you need to maintain over time. You
100-
can monitor and maintain environment hygiene with [Microsoft Defender for Container Registry](../defender-for-cloud/defender-for-containers-vulnerability-assessment-azure.md)
101-
to help scan images for vulnerabilities. To
102-
automate this process based on triggers from Microsoft Defender, see [Automate responses to Microsoft Defender for Cloud triggers](../defender-for-cloud/workflow-automation.md).
140+
Machine Learning doesn't delete images from your container registry, and it's your responsibility to evaluate which images you need to maintain over time.
141+
142+
## Troubleshooting environment image builds
143+
144+
Learn how to troubleshoot issues with environment image builds and package installations.
103145

104146
## **Environment definition problems**
105147
