-
Notifications
You must be signed in to change notification settings - Fork 16
Add vulnerability management recommendations for maintainers #156
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Changes from all commits
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
| Original file line number | Diff line number | Diff line change | ||||||||
|---|---|---|---|---|---|---|---|---|---|---|
|
|
@@ -94,3 +94,70 @@ updates: | |||||||||
| And [here's | ||||||||||
| one](https://github.com/open-telemetry/opentelemetry-operator/pull/2990) from | ||||||||||
| Dependabot. | ||||||||||
|
|
||||||||||
| ## Vulnerability Management | ||||||||||
|
|
||||||||||
| There are several aspects of vulnerability management that the Security SIG | ||||||||||
| recommends to the OpenTelemetry project maintainers: | ||||||||||
|
|
||||||||||
| - The maintainers of an OpenTelemetry project should establish a clear | ||||||||||
| accountability for security issues, including identifying the direct | ||||||||||
| responsible individual for security issues at a certain time, for | ||||||||||
| example, via a duty rotation. This should be | ||||||||||
| documented in the main README.md file of the project. | ||||||||||
| - The direct responsible individual should monitor the [repository security | ||||||||||
| advisories](https://docs.github.com/code-security/security-advisories/working-with-repository-security-advisories/about-repository-security-advisories), | ||||||||||
| make sure security advisories are triaged in a timely manner, and there is | ||||||||||
| active engagement and communication on the issue. Refer to the [incident | ||||||||||
| response guideline](../security-response.md#incident-response) for more | ||||||||||
| details. | ||||||||||
| - Regularly scan your code and dependencies for **deprecations** and | ||||||||||
| **vulnerabilities** using tools. This should include but not be limited to: | ||||||||||
| - CI/CD tooling - for example, some GitHub Actions might be deprecated or no | ||||||||||
| longer maintained, certain GitHub Actions might have known vulnerabilities, | ||||||||||
| a compiler might have a known vulnerability, etc. | ||||||||||
| - CI/CD environment - for example, the CI/CD job might be running on a | ||||||||||
| deprecated or vulnerable version of the operating system. | ||||||||||
| - container image dependencies - for example, the base image used in | ||||||||||
| Dockerfiles or image referenced by Helm charts might have known | ||||||||||
| vulnerabilities, or the image might be using a deprecated version of a | ||||||||||
| library. | ||||||||||
| - repository configurations - for example, a hotfix branch might not have the | ||||||||||
| proper branch protection rules, or the repository might not have the proper | ||||||||||
| security settings enabled. | ||||||||||
|
Comment on lines
+125
to
+127
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. This should be mostly covered by org-wide config-as-code tooling by now but we don't cover custom branches beyond main there usually, right? There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Right. |
||||||||||
| - package dependencies - for example, a package might have a known | ||||||||||
| vulnerability, or a package might be using a deprecated version of a | ||||||||||
| library. | ||||||||||
|
Comment on lines
+128
to
+130
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. We might want to put that one to the top of the list as it's the most frequent concern that's brought up. |
||||||||||
| - All security vulnerabilities that are found - whether from the user reported | ||||||||||
| repository security advisories or through automated scanning - should be | ||||||||||
| handled in a timely manner based on the severity level. In case of a real | ||||||||||
| vulnerability that doesn't have a fix available, the maintainers should | ||||||||||
| evaluate the impact and likelihood of exploitation and take appropriate | ||||||||||
| action, such as applying workarounds or communicating with affected users. In | ||||||||||
| the worst case, this can be a potential end-of-life announcement of the | ||||||||||
| affected component or project. | ||||||||||
|
Comment on lines
+137
to
+138
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. "or project" made me think catastrophic end of OpenTelemetry
Suggested change
|
||||||||||
|
|
||||||||||
| Here is a check list for the maintainers: | ||||||||||
|
|
||||||||||
| - [ ] Identify the direct responsible individual for security issues and | ||||||||||
| document it in the main README.md file of the project. | ||||||||||
| - [ ] Monitor the GitHub repository security advisories and triage security | ||||||||||
| issues in a timely manner. | ||||||||||
| - [ ] Daily scan CI/CD tooling for deprecations and vulnerabilities. | ||||||||||
| - [ ] Daily scan CI/CD environment for deprecations and vulnerabilities. | ||||||||||
| - [ ] Daily scan container image dependencies for deprecations and | ||||||||||
| vulnerabilities. | ||||||||||
| - [ ] Daily scan repository configurations for deprecations and vulnerabilities. | ||||||||||
|
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. is this something maintainers can do? |
||||||||||
| - [ ] Daily scan package dependencies for deprecations and vulnerabilities. | ||||||||||
| - [ ] False positives are documented (e.g., by commenting on the security | ||||||||||
| advisory, by providing the dismissal reason to a code scanning alert) and | ||||||||||
| dismissed. | ||||||||||
| - [ ] Critical and high severity vulnerabilities are patched within 2 weeks of | ||||||||||
|
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. not sure about wording
Suggested change
|
||||||||||
| discovery. | ||||||||||
| - [ ] Medium and low severity vulnerabilities are patched within 4 weeks of | ||||||||||
|
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. not sure about wording
Suggested change
|
||||||||||
| discovery. | ||||||||||
|
Comment on lines
+155
to
+158
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. While it's good to document an SLA here, I think it should be defined more clearly rather than just in the checklist that's summarizing the above. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. We have the following wording in the Collector SIG in case it helps:
Comment on lines
+155
to
+158
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I think these are a bit too aggressive. You can see e.g. these ones from Gitlab: https://handbook.gitlab.com/handbook/security/product-security/vulnerability-management/sla/ There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. The OpenSSF questionnaire says
I would go with that maybe at least for the core repos There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more.
I feel this is a low bar and we should put a higher one. Here is my thinking: OpenTelemetry components are widely used by other OSS components, if our bar is to get the medium or higher CVEs patched in 60 days, we are not giving sufficient time for our users (including other OSS software that depend on us) to meet their 60 days time. That's the reason why I put 4 weeks. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I think a higher bar makes sense to some but not all OpenTelemetry components necessarily (e.g. I don't think a security vulnerability in, say, Weaver, is as problematic as one in opentelemetry-python: they have different use cases with different security implications and their usage is different). I am in favor of a stronger requirement in specific components, but since this is an universal standard I think we should put a lower bar here There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Makes sense to me. Do you feel this is a good balance? Core components: 30 days There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Sounds reasonable to me |
||||||||||
| - [ ] For vulnerabilities that cannot be patched in a timely manner (for | ||||||||||
| example, the component is depending on an outdated library, and there is no | ||||||||||
| replacement), the maintainers should evaluate the impact and likelihood of | ||||||||||
| exploitation and take appropriate action, such as applying workarounds or | ||||||||||
| communicating with affected users. | ||||||||||
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this feels a bit heavyweight for most of the repos to establish a DRI rotation where there are 2-3 maintainers
as long as the maintainers are meeting the bar we are setting for triage / fixing
if we want to keep this, maybe "... or some other process that ensures vulnerabilities are being triaged in a timely manner"