A methodology for identifying and ranking open source software projects in terms of criticality.
Using the OpenSSF Criticality Scores (hereafter, OpenSSF CS) as a foundation, develop a scoring system that ranks open source software projects in terms of "criticality" or "importance". The system ought to
- Identify a set of "top projects" and provide relative criticality ranks for this set
- Address some limitations identified by the OpenSSF CS
- Reduce the prevalence of "false positives"
- Use a set of signals that can be consistently measured across projects
- Allow fine-tuning of the scoring function and signal weights
- Be easily deployed
- Be easily iterated upon to improve performance
Definition: An open source project is "critical" if it is relied on directly by many users or indirectly through other widely used projects that depend on it.
Our approach consists of three main steps: project discovery and filtering, signal gathering, and scoring and ranking.
The universe of open source projects is too broad, so we need to pare this down a bit before even beginning to rank. Let's define the unit of analysis for an open source project as the project's canonical source code repository. Our approach to project discovery and filtering is as follows:
- Use git-based projects with public source repositories on GitHub
- Use the GitHub GraphQL API to return a list of the top 10,000 most-starred public repositories older than 6 months (a query sketch follows this list). We additionally include the OpenSSF Securing Critical Projects list of "most critical projects" in this set.
- Filter this set to exclude forks, mirrors, templates, and archived repositories.
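A minimal sketch of this discovery step is below, assuming Python with the `requests` library and a personal access token in a `GITHUB_TOKEN` environment variable; the star floor, page count, and helper name `fetch_candidate_repos` are illustrative choices rather than part of the methodology itself.

```python
# Sketch: discover candidate repositories via the GitHub GraphQL API.
# Requires a personal access token in the GITHUB_TOKEN environment variable.
import os
from datetime import date, timedelta

import requests

GRAPHQL_URL = "https://api.github.com/graphql"

QUERY = """
query($searchQuery: String!, $after: String) {
  search(query: $searchQuery, type: REPOSITORY, first: 100, after: $after) {
    pageInfo { hasNextPage endCursor }
    nodes {
      ... on Repository {
        nameWithOwner
        stargazerCount
        createdAt
        isFork
        isMirror
        isArchived
        isTemplate
      }
    }
  }
}
"""

def fetch_candidate_repos(search_query: str, max_pages: int = 10) -> list[dict]:
    """Page through search results, dropping forks, mirrors, templates, and archives."""
    headers = {"Authorization": f"bearer {os.environ['GITHUB_TOKEN']}"}
    repos, cursor = [], None
    for _ in range(max_pages):
        resp = requests.post(
            GRAPHQL_URL,
            json={"query": QUERY,
                  "variables": {"searchQuery": search_query, "after": cursor}},
            headers=headers,
            timeout=30,
        )
        resp.raise_for_status()
        search = resp.json()["data"]["search"]
        repos += [
            node for node in search["nodes"]
            if not (node["isFork"] or node["isMirror"]
                    or node["isArchived"] or node["isTemplate"])
        ]
        if not search["pageInfo"]["hasNextPage"]:
            break
        cursor = search["pageInfo"]["endCursor"]
    return repos

# Highly starred repositories created more than ~6 months ago (the star floor is arbitrary).
cutoff = (date.today() - timedelta(days=183)).isoformat()
candidates = fetch_candidate_repos(f"stars:>5000 created:<{cutoff} sort:stars-desc")
```

Note that GitHub's search API returns at most 1,000 results per query, so assembling a top-10,000 list in practice means slicing the search into star-count ranges and merging the results.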
For each project in this set, we next gather a number of quantitative signals derived from project characteristics. Signals should measure something about the project that reasonably correlates with a notion of criticality.
Examples:
- Project age
- Development and release frequency
- Number of distinct individual contributors
- Number of distinct contribution organizations
- Level of project discourse (contributor communication, responsiveness to issues)
- Merge/pull requests, forks, artifact downloads
- Downstream dependents
- Scope of use (who and how is the project being used?)
Some of these signals are easier to measure than others. For example, project age can be easily measured by looking at the date of the first commit in the project's version control system (VCS). On the other hand, scope of use is much harder to measure, as it requires knowledge of who is using the project and how they are using it.
If signals are measured inconsistently across projects, including the signal in a criticality score can bias rankings. For example, "number of stars" would be a problematic signal because not all projects are hosted on GitHub, and even among those that are, star counts reflect visibility and community engagement more than actual usage. A concern with the OpenSSF CS data is that some signals, such as dependency information and issue activity, are not consistently measured across projects.
We focus on signals that can be consistently measured across all projects in our set. The signals we have chosen so far are based purely on the version control log of the open source project. Our hope is to expand beyond this constraint in future iterations.
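As a rough sketch of this kind of VCS-log signal extraction, the snippet below derives contributor, organization, age, recency, and commit-frequency figures from `git log` for a locally cloned repository. The helper names, the 30.44-day month approximation, and the email-domain heuristic for organizations are assumptions for illustration; the source-lines-of-code signal would come from a separate line-counting step.

```python
# Sketch: derive the VCS-log signals for one locally cloned repository.
# Assumes `git` is on PATH and `repo_path` points at a full (non-shallow) clone.
import subprocess
from datetime import datetime, timezone

def _months(delta) -> float:
    return delta.days / 30.44  # average month length; a rough approximation

def git_log_entries(repo_path: str) -> list[tuple[datetime, str]]:
    """Return (author date, author email) pairs for every commit in the history."""
    out = subprocess.run(
        ["git", "-C", repo_path, "log", "--format=%aI|%ae"],
        capture_output=True, text=True, check=True,
    ).stdout
    entries = []
    for line in out.splitlines():
        stamp, email = line.split("|", 1)
        entries.append((datetime.fromisoformat(stamp), email.lower()))
    return entries

def vcs_signals(repo_path: str) -> dict:
    entries = git_log_entries(repo_path)
    now = datetime.now(timezone.utc)
    dates = [d for d, _ in entries]
    emails = [e for _, e in entries]
    return {
        "distinct_contributors": len(set(emails)),
        # Crude heuristic: treat each distinct author email domain as one organization.
        "distinct_organizations": len({e.split("@")[-1] for e in emails}),
        "project_age_months": _months(now - min(dates)),
        "months_since_last_commit": _months(now - max(dates)),
        "commits_last_year": sum(1 for d in dates if (now - d).days <= 365),
    }
```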
Project criticality rankings can be determined by a composite score derived from the project signals. The ideal score should be a near-perfect correlate of criticality: the higher the score, the more "critical" the project is. For each project $p$, we compute a composite score

$$S(p) = \sum_{i} \alpha_i \, f\big(s_i(p)\big)$$

Note that the function $f$ maps raw signal values onto a common scale; here we take $f$ to be the normalized rank of the project on each signal, where

Term | Description |
---|---|
$s_i(p)$ | Value of signal $i$ for project $p$ |
$f$ | Transformation applied to raw signal values (here, the normalized rank of the project on signal $i$) |
$\alpha_i$ | Relative weights for signal $i$ |
Under these assumptions, the score is a weighted composite of signal ranks.
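As a toy worked example (the two signals, three projects, and weights here are made up, not taken from the table below), suppose projects $A$, $B$, and $C$ are ranked on two signals with weights $\alpha_1 = 0.6$ and $\alpha_2 = 0.4$, with ranks normalized to $\{0, 0.5, 1\}$. If $A$ ranks first on signal 1 and second on signal 2, $B$ the reverse, and $C$ last on both, then

$$S(A) = 0.6(1.0) + 0.4(0.5) = 0.80, \qquad S(B) = 0.6(0.5) + 0.4(1.0) = 0.70, \qquad S(C) = 0,$$

so the heavily weighted signal dominates the ordering.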
The signals and weights we currently use are:

Weight | Signal | Description |
---|---|---|
40% | Distinct contributors | Number of distinct committers |
20% | Distinct organizations | Number of distinct organizations contributing to the project, as determined by email domain |
10% | Project size | Count of source lines of code in the project |
10% | Last updated | Months since the last commit to the project |
10% | Project age | Months since the first commit to the project |
10% | Commit frequency | Count of commits within the last year |
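A minimal sketch of the scoring step follows, assuming the signal dictionaries from the earlier extraction sketch plus a `project_size_sloc` field; the $[0, 1]$ rank normalization, arbitrary tie-breaking, and the decision to invert the staleness signal (fewer months since the last commit ranks higher) are illustrative assumptions rather than requirements of the table above.

```python
# Sketch: weighted composite of normalized signal ranks, using the weights above.
# Assumes `signals` maps project name -> signal dict (as sketched earlier) and that
# each dict also carries a `project_size_sloc` value from a separate line counter.
WEIGHTS = {
    "distinct_contributors": 0.40,
    "distinct_organizations": 0.20,
    "project_size_sloc": 0.10,
    "months_since_last_commit": 0.10,  # inverted below: staleness should lower the score
    "project_age_months": 0.10,
    "commits_last_year": 0.10,
}

def normalized_ranks(values: dict[str, float], higher_is_better: bool = True) -> dict[str, float]:
    """Map each project's raw value to a rank in [0, 1]; 1 = most critical. Ties break arbitrarily."""
    ordered = sorted(values, key=values.get, reverse=not higher_is_better)
    n = max(len(ordered) - 1, 1)
    return {name: i / n for i, name in enumerate(ordered)}

def criticality_scores(signals: dict[str, dict[str, float]]) -> dict[str, float]:
    scores = {name: 0.0 for name in signals}
    for signal, weight in WEIGHTS.items():
        ranks = normalized_ranks(
            {name: s[signal] for name, s in signals.items()},
            higher_is_better=(signal != "months_since_last_commit"),
        )
        for name, rank in ranks.items():
            scores[name] += weight * rank
    return scores  # sort descending to obtain the relative criticality ranking
```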
Some remarks on the choice of signals and parameters:
- By placing heavy weight on distinct contributors, we are prioritizing projects that have a large number of individual contributors, which is a strong indicator of community engagement and project health.
- We also weight distinct organizations, which helps to identify projects that have backing from multiple entities, indicating a broader support base.
- Source lines of code is a measure of project size, which can be an indicator of the complexity and legacy of the codebase. Smaller projects that can be easily replaced should not be considered critical.
- We intentionally avoid signals like issue activity as we find that they can be quite noisy, inconsistently measured across projects, and not necessarily indicative of criticality.
- VCS- and platform-specific signals (git and GitHub) - Coming up with a set of signals that can be consistently measured across different version control systems and hosting platforms is a challenge. For example, platform-specific signals like "number of forks" or "number of stars" cannot be collected for projects that are not hosted on GitHub. To accommodate this, we use a set of more general signals that can be measured across platforms; the signals we've chosen so far are based purely on the version control log of the open source project.
- Dependency information is not included - A related observability issue is that dependency information (which projects rely on others) can be hard to gather consistently. While SBOMs or packaging manifests do a decent job of recording runtime dependencies and other external artifacts a project relies upon, more nuanced and potentially more important dependency relationships are imperfectly observed. For example, container technologies like Docker and Kubernetes rely heavily on the Linux kernel, but neither directly "declares" the kernel as a dependency in its packaging manifest.
- Improved filtering (repository classification, NLP, etc.)
- Use a wider set of signals: move beyond VCS logs in some scalable way
- Extension: implement a user feedback loop
- Extension: incorporate dependency information and recursive scoring
- Extension: define alternative parameter sets for different use cases (e.g. security, widespread use, social impact, criticality for a specific user or organization, etc.)
- https://openssf.org/projects/criticality-score/
- https://github.com/ossf/criticality_score
- https://github.com/ossf/wg-securing-critical-projects
- https://openssf.org/blog/2023/07/28/understanding-and-applying-the-openssf-criticality-score-in-open-source-projects/
- https://openssf.org/blog/2022/12/08/apples-and-apples-comparing-approaches-to-measuring-criticality-and-risk-at-the-openssf/
- https://todogroup.org/resources/guides/measuring-your-open-source-programs-success/#what-to-track
- https://chaoss.community/kbtopic/all-metrics-models/