137 changes: 137 additions & 0 deletions incident-response-plan.md
@@ -0,0 +1,137 @@
# Incident Response Plan

## Purpose
This document outlines the process for handling security-impacting incidents affecting one or more projects in the Foundation’s ecosystem. Incidents may include platform changes with security implications, account compromise, or other security events that require coordinated response.

The Foundation’s role is to:
1. **Receive and triage reports**
2. **Connect reporters and affected maintainers with the right experts**
3. **Facilitate coordinated response** across multiple projects when needed
4. **Communicate clearly and act as the contact point** while respecting confidentiality and responsible disclosure principles

---

## Scope

This plan covers incidents such as:

Contributor

I think this should be in order of most often occurring to least likely to occur

Member Author

Agree with the idea, but not sure what is the best order... feel free to open a suggestion 👍

- **Platform changes or provider outages** (e.g., a GitHub UI change) that create security or operational risk.
- **Account or registry access issues** (e.g., npm lockdown, compromised maintainer account).
- **Supply chain attacks** (e.g., XZ, phishing campaigns).
- **Third-party service compromises** affecting Foundation projects (e.g., data leaks in external services).
- **Legal-related operational threats**, including:
  - License disputes (e.g., GPL/MIT compliance challenges)
  - Patent-related threats impacting project distribution
  - DMCA takedown requests
  - Trademark misuse or brand impersonation


Incidents that are not in scope:
- **Code-level security vulnerabilities** in projects maintained within the Foundation (handled by the project or the OpenJS CNA Team)
- **Non-Foundation projects** — see the [list of supported projects](https://openjsf.org/projects)

---

### Incident Categories

- 🍿 @Discussion: We can probably think of more scenarios together.


| Category | Examples | Primary Response Role |
|----------|----------|-----------------------|
| **Vulnerability Report** | Code exploit, CVE disputes, escalations... | Redirect to the project or delegate to the CNA Team |
| **Platform Change Risk** | GitHub UI update causing accidental info exposure | Triage → Escalate to platform contacts → Provide mitigations |
| **Account Access Issue** | npm account lockout, GitHub MFA issues | Triage → Help restore access via platform → Provide temporary mitigation |
| **Supply Chain Attack** | Malicious dependency version | Coordinate with affected projects → Security advisories |
| **External Incident Impact** | Cloud provider compromise, service outage | Facilitate communication between impacted maintainers and providers |

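As a rough illustration, the category table above could be encoded as a lookup so that triage always returns the same steps for a given category. This is a minimal Python sketch, not part of the plan: the names `Category`, `PRIMARY_RESPONSE`, `primary_response`, and `redirected_elsewhere` are hypothetical, and the step strings are copied from the "Primary Response Role" column.

```python
from __future__ import annotations

from enum import Enum


class Category(Enum):
    """Incident categories, named after the rows of the table (hypothetical identifiers)."""

    VULNERABILITY_REPORT = "Vulnerability Report"
    PLATFORM_CHANGE_RISK = "Platform Change Risk"
    ACCOUNT_ACCESS_ISSUE = "Account Access Issue"
    SUPPLY_CHAIN_ATTACK = "Supply Chain Attack"
    EXTERNAL_INCIDENT_IMPACT = "External Incident Impact"


# Ordered steps per category, taken from the "Primary Response Role" column.
PRIMARY_RESPONSE: dict[Category, list[str]] = {
    Category.VULNERABILITY_REPORT: [
        "Redirect to the project or delegate to the CNA Team",
    ],
    Category.PLATFORM_CHANGE_RISK: [
        "Triage",
        "Escalate to platform contacts",
        "Provide mitigations",
    ],
    Category.ACCOUNT_ACCESS_ISSUE: [
        "Triage",
        "Help restore access via platform",
        "Provide temporary mitigation",
    ],
    Category.SUPPLY_CHAIN_ATTACK: [
        "Coordinate with affected projects",
        "Security advisories",
    ],
    Category.EXTERNAL_INCIDENT_IMPACT: [
        "Facilitate communication between impacted maintainers and providers",
    ],
}


def primary_response(category: Category) -> list[str]:
    """Return a copy of the ordered response steps for a category."""
    return list(PRIMARY_RESPONSE[category])


def redirected_elsewhere(category: Category) -> bool:
    """Vulnerability reports are redirected to the project or the CNA Team."""
    return category is Category.VULNERABILITY_REPORT
```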
---


## Action plan

We may not directly solve incidents, but we help **unblock situations** and **support projects at risk**.

### Roles & Responsibilities

- 🍿 @Discussion: who should be in the team?
- 🍿 @Discussion: what is the right name for the team ("THE TEAM")?
- 🍿 @Discussion: what is the right name for the report ("THE REPORT")?
- 🍿 @Discussion: Should we publish the learnings/findings publicly, when possible, to help the community?

#### Reporter

This person submits A REPORT to THE TEAM and provides detailed information about the incident.

**Responsibilities**

- Submit A REPORT to THE TEAM.

**Expectations**

- Provide detailed information about the suspected incident.

I believe we have to be specific on the details that are provided in order to have a "quality" report that avoids a lot of clarifications.

Member Author

I assume that a big part of it can be "shaped" in the web form (required fields, validation...)

- Follow responsible disclosure guidelines (adapted to this context).
- Cooperate with THE TEAM by providing additional details when needed.
- Respect security timelines and avoid premature public disclosure.
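
The expectation of detailed information could be checked at intake by flagging reports with empty required fields, which would reduce back-and-forth clarifications. The following is a minimal Python sketch under stated assumptions: the plan does not yet define the report format, so `IncidentReport`, its field names, and `missing_fields` are all hypothetical.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class IncidentReport:
    """Hypothetical shape of a REPORT; none of these fields are defined by the plan."""

    reporter_contact: str
    summary: str
    details: str = ""
    affected_projects: list[str] = field(default_factory=list)


# Text fields assumed to be mandatory for a usable report.
REQUIRED_TEXT_FIELDS = ("reporter_contact", "summary", "details")


def missing_fields(report: IncidentReport) -> list[str]:
    """List the required fields that are empty or whitespace-only."""
    missing = [
        name for name in REQUIRED_TEXT_FIELDS if not getattr(report, name).strip()
    ]
    if not report.affected_projects:
        missing.append("affected_projects")
    return missing
```

An empty list from `missing_fields` means the report can be acknowledged; otherwise the returned names tell the reporter what to add.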


In today's meeting folks were discussing the reality of having a coordinator assigned to an issue in volunteer and multi timezone based responses, with the fear that it reads as an On Call assignment.

I want to clarify that in my experience, the goal of having a coordinator (or Directly Responsible Individual) is to ensure that there is at most one person executing the duties of the role at a time. Any number of folks can help said coordinator, but responsibilities and communication should flow through the coordinator so efforts aren’t duplicated and others can stay focused.

That role can (and should) be handed off as needed, so long as handoff happens explicitly.

If 2 folks are responding to a new incident, the first step in formalizing the response would be to explicitly identify who among them will take on the coordinator role for the time being.

Member

In practice, though, timezone-of-residence doesn't determine availability. Everyone's sleep and work schedule is different, unrelated to timezone :-)



#### Coordinator (SRC)

This person acts as the focal point for a specific REPORT and ensures that it follows the process and all responsible disclosure guidelines. If the situation is confirmed, the SRC coordinates the remediation process and ensures that the necessary actions are taken. The SRC is not necessarily responsible for performing detailed analysis or remediation.

**Responsibilities**

- Acknowledge receipt of REPORTS within the required timeframe.
- Orchestrate the embargo and identify the minimum set of individuals involved.
- Remind everyone involved that they must not notify/involve any other individuals. If someone else needs to be involved, that must go through the Coordinator.
- Assign one or multiple SMEs.
- Ensure communication with the reporter and the affected projects throughout the process.
- Track all REPORTS for visibility and reporting.
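
The coordinator duties above suggest two invariants: a REPORT has one coordinator at a time, and nobody joins the embargo except through that coordinator. The Python sketch below is one possible way to model this, including an explicit handoff; the `Incident` class and its method names are hypothetical and not defined by the plan.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Incident:
    """Hypothetical record of who coordinates a REPORT and who is under embargo."""

    report_id: str
    coordinator: Optional[str] = None
    embargo_list: set[str] = field(default_factory=set)
    handoffs: list[tuple[str, str]] = field(default_factory=list)

    def assign_coordinator(self, person: str) -> None:
        """Set the first coordinator; later changes must use hand_off()."""
        if self.coordinator is not None:
            raise ValueError("coordinator already assigned; use hand_off()")
        self.coordinator = person
        self.embargo_list.add(person)

    def hand_off(self, current: str, successor: str) -> None:
        """Explicitly transfer the role and record the transfer."""
        if current != self.coordinator:
            raise PermissionError("only the current coordinator can hand off")
        self.handoffs.append((current, successor))
        self.coordinator = successor
        self.embargo_list.add(successor)

    def involve(self, requested_by: str, person: str) -> None:
        """Add someone to the embargo; only the coordinator may do this."""
        if requested_by != self.coordinator:
            raise PermissionError("new participants must go through the coordinator")
        self.embargo_list.add(person)
```

In this sketch the previous coordinator stays on the embargo list after a handoff but can no longer add participants.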

#### Subject Matter Expert (SME)
Experts brought in for technical insight, platform liaison work, or domain-specific advice.

**Responsibilities**

- Provide expert input to help assess impact and options.
- Advise on mitigation strategies.
- Help unblock situations when feasible.

### Reporting methods

- 🍿 @Discussion: what is the best option?
  - Dedicated email alias?
  - Secure web form?


## Runbook

- 🍿 @Discussion: What is the best approach? Some ideas:
1. **REPORT Received**

Member

I think we should have a more straightforward workflow in the event of a severe report.

For instance, if the report has a Low to High vulnerability score, we follow the current runbook. However, we should have a direct line if the vulnerability is potentially Critical/Severe.

2. **Assign Coordinator** and consolidate report details
3. **Review** severity and affected projects
4. **Identify SMEs** and brief them
5. **Coordinate** with projects, platforms, or third parties
6. **Document** findings and lessons learned
7. **Publish** partial or full summary if appropriate
8. **Social Media Team** prepares and posts where needed
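
The runbook steps above could be tracked as an ordered sequence so that each REPORT shows which steps remain. This is a minimal Python sketch, assuming the steps run strictly in the listed order and that steps 7 and 8 are optional (they are qualified with "if appropriate" and "where needed"); `Step`, `OPTIONAL_STEPS`, `next_step`, and `remaining_steps` are hypothetical names.

```python
from __future__ import annotations

from enum import IntEnum
from typing import Optional


class Step(IntEnum):
    """Runbook steps, numbered as in the list above."""

    REPORT_RECEIVED = 1
    ASSIGN_COORDINATOR = 2
    REVIEW = 3
    IDENTIFY_SMES = 4
    COORDINATE = 5
    DOCUMENT = 6
    PUBLISH = 7
    SOCIAL_MEDIA = 8


# Assumption: publishing and social media posts can be skipped.
OPTIONAL_STEPS = {Step.PUBLISH, Step.SOCIAL_MEDIA}


def next_step(current: Step) -> Optional[Step]:
    """Return the step that follows `current`, or None after the last one."""
    if current is Step.SOCIAL_MEDIA:
        return None
    return Step(current + 1)


def remaining_steps(current: Step, skip_optional: bool = False) -> list[Step]:
    """List the steps still to do after `current`, optionally dropping optional ones."""
    return [
        step
        for step in Step
        if step > current and not (skip_optional and step in OPTIONAL_STEPS)
    ]
```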

## General Response Workflow

- 🍿 @Discussion: early-stage idea, based on the Runbook:

Contributor

I think we should decide how communication and work are coordinated here. For example, a common practice is that, when an incident occurs, the incident commander creates a dedicated Slack channel to facilitate communication.

```mermaid
flowchart TD
A[REPORT Received] --> B[Assign Coordinator]
B --> C{Is valid, qualified and can be verified?}
C -- No --> D[Request Clarification from Reporter]
D --> C
C -- Yes --> E[Assess Impact and Severity]
E --> F{Single Project or Multi-Project?}
F -- Single --> G[Engage Project Maintainers]
F -- Multi --> H[Engage Multiple Maintainers + Foundation Network]
G --> I[Coordinate Response: Bring SMEs...]
H --> I
I --> J[Update Reporter and Stakeholders]
J --> K[Document and Close Incident]