Commit 1098dfa

author
Eric Sorenson
committed
Update with two graphql queries for metrics
Removed references to us adding new metrics (except contributing) and adds sample graphql queries w/explanations for how to use them. Relates to #1842
1 parent 30f3929 commit 1098dfa

1 file changed

+108
-17
lines changed

docs/open-source-health-metrics.md

Lines changed: 108 additions & 17 deletions
Original file line numberDiff line numberDiff line change
@@ -2,9 +2,9 @@
22

33
![Octocat taking parts out of a box](/images/octocat-opening-box.jpeg)
44

5-
Large organizations are increasingly “working in the open”: releasing internal tools and projects as open source. But once they’re out in the world, maintainers often find it difficult to track the health of these projects, especially when there are dozens or hundreds of them. GitHub’s Open Source Program Office (OSPO), in our mission to help other OSPOs and maintainers of independent OSS projects, has added a number of key metrics to our public API that can power dashboards and visualizations which help make sense of the data.
5+
Large organizations are increasingly “working in the open”: releasing internal tools and projects as open source. But once they’re out in the world, maintainers often find it difficult to track the health of these projects, especially when there are dozens or hundreds of them. As part of our mission to help other Open Source Program Offices and maintainers of independent OSS projects, GitHub’s OSPO has improved and documented parts of the GitHub API that can power dashboards and visualizations which help make sense of the data.
66

7-
This document describes how to combine the new metrics with existing information to build a complete picture of your repository health. We'll explore both setting up a popular end-to-end solution and a DIY approach for those who want to build their own.
7+
This document describes how to combine metrics and information from the API to build a complete picture of your repository health. We'll explore both setting up a popular end-to-end solution and a more DIY approach for those who want to build their own.
88

99
## Introducing Cauldron.io
1010

@@ -30,7 +30,7 @@ Here are some suggestions for patterns you might look for as you examine the dat
3030

3131
## Working with the GitHub API
3232

33-
If you're interested in building your own dashboards, you can use the GitHub API to pull the data you need. The [GraphQL API](https://docs.github.com/en/graphql) provides a flexible way to query for data, and is the recommended approach for building dashboards. The [GitHub API Explorer](https://docs.github.com/en/rest/overview/explorer) is a great way to explore the data available from the API and test out queries. Specific to community health, we in the GitHub OSPO have been working on improving the GraphQL API to add metrics which were either not available at all or required some complicated work to retrieve.
33+
If you're interested in building your own dashboards, you can use the GitHub API to pull the data you need. The [GraphQL API](https://docs.github.com/en/graphql) provides a flexible way to query for data, and is the recommended approach for building dashboards. The [GitHub API Explorer](https://docs.github.com/en/rest/overview/explorer) is a great way to explore the data available from the API and test out queries. Specific to community health, we in the GitHub OSPO have pulled together data from disparate sources to make dashboarding easier.
3434

3535
### Community standards
3636

@@ -41,32 +41,123 @@ You can retrieve information about the [community standards documents](https://d
4141
- [CodeOfConduct](https://docs.github.com/en/graphql/reference/objects#codeofconduct)
4242
- [README.md](https://docs.github.com/en/graphql/reference/objects#readme)
4343

44-
### Repository metrics
44+
### Querying GraphQL metrics
4545

46-
The Repository object in the GraphQL API is the primary location for metrics which relate to project health. These metrics are available in the `Repository` object, and while there are lots of interesting fields available, we've recently coalesced the most useful ones under the `metrics` field.
46+
GitHub's GraphQL API has an absolute treasure trove of information about what's going on with your repositories. In fact, the amount and variety of data available can be a bit overwhelming, so we've selected a few key metrics that can provide insights into project and community health. The following GraphQL queries will provide point-in-time data which time-series visualization software like Grafana (see below) can turn into charts that can help identify trends over time.
4747

48-
> **Note**
49-
> As of October 2023, the following metrics are in public beta, behind a feature flag. In order to access them, you will need to add a special header to your requests: `GraphQL-Features: ospo_metrics_api` . While they are in beta, the metrics may change and the official API documentation will not have their descriptions. We will update this document as the metrics are finalized and the feature flag is removed.
48+
- Open and closed issue counts - Track these to look for a growing backlog of unanswered issues.
49+
- Open and closed pull requests, including those closed without merge - Similarly, a growing backlog of open PRs can indicate maintainer overload. Additionally, a high ratio of PRs closed without merge could indicate spam or low-quality contributions which need additional guidance.
50+
- Date of most recent activity in the repo, including discussions, PRs, issues, commits, and releases - Archiving inactive repos can reduce maintainer burden and allow you to focus on projects which need more attention.
5051

51-
- **LastContributionDate** - The most recent date there was _any of_ the following activity: a commit to a repository’s default branch, opening an issue or discussion, answering a discussion, proposing a pull request, or submitting a pull request review. This is a good single-number metric to find projects that may be unmaintained or in need of archiving.
52-
- **CommitCount** - A monotonically increasing count of the total number of commits pushed to the default branch of the repository. Tracking the change in this over time will give a sense of the overall activity in the repository.
52+
Try these queries in the GraphQL explorer:
5353

54-
### More useful metrics
54+
```graphql
55+
# Raw numbers related to repository activity
56+
query RepositoryMetrics {
57+
repository(owner: "github", name: "docs") {
58+
totalIssueCount: issues {
59+
totalCount
60+
}
61+
openIssueCount: issues(states: [OPEN]) {
62+
totalCount
63+
}
64+
closedIssueCount: issues(states: [CLOSED]) {
65+
totalCount
66+
}
67+
openPullRequestCount: pullRequests(states: [OPEN]) {
68+
totalCount
69+
}
70+
closedPullRequestCount: pullRequests(states: [CLOSED]) {
71+
totalCount
72+
}
73+
mergedPullRequestCount: pullRequests(states: [MERGED]) {
74+
totalCount
75+
}
76+
}
77+
}
78+
```
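Outside the explorer, a query like the one above can be sent as an ordinary HTTP POST. The following is a minimal Python sketch, not an official client: it assumes the public endpoint `https://api.github.com/graphql` and a personal access token that you supply, and the helper names `build_graphql_request` and `run_query` are hypothetical.

```python
import json
import urllib.request

# Public GraphQL endpoint for github.com (GitHub Enterprise Server uses a different URL).
GITHUB_GRAPHQL_URL = "https://api.github.com/graphql"


def build_graphql_request(query: str, token: str) -> urllib.request.Request:
    """Build (but do not send) a POST request carrying a GraphQL query.

    `token` is assumed to be a personal access token with read access
    to the repository being queried.
    """
    body = json.dumps({"query": query}).encode("utf-8")
    return urllib.request.Request(
        GITHUB_GRAPHQL_URL,
        data=body,
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def run_query(query: str, token: str) -> dict:
    """Send the query and return the decoded JSON response as a dict."""
    with urllib.request.urlopen(build_graphql_request(query, token)) as response:
        return json.load(response)
```

Calling `run_query(query_text, token)` with the text of `RepositoryMetrics` should return a dictionary shaped like the sample response. Reading the token from an environment variable rather than hard-coding it keeps it out of source control.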
5579

56-
In addition to the new metrics, there's a lot of useful information tucked away in the existing GraphQL API. Many tools, like the cauldron.io example above, make use of these under the hood, and you might find them helpful in your own dashboards.
80+
The response will look something like:
5781

58-
- `repository(owner:"monalisa",name:"octocat") { issues { totalCount } }` - returns the number of total issues in the repository
59-
- `repository(owner:"monalisa",name:"octocat") { forkcount }` - the total number of forks of this repository in the fork network (i.e. including forks of forks)
82+
```json
83+
{
84+
"data": {
85+
"repository": {
86+
"totalIssueCount": { "totalCount": 2991 },
87+
"openIssueCount": { "totalCount": 43 },
88+
"closedIssueCount": { "totalCount": 2948 },
89+
"openPullRequestCount": { "totalCount": 24 },
90+
"closedPullRequestCount": { "totalCount": 5448 },
91+
"mergedPullRequestCount": { "totalCount": 10404 }
92+
}
93+
}
94+
}
95+
```
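As a rough illustration of how these raw counts might feed a dashboard, the sketch below derives two ratios from the sample response: the share of issues still open, and the share of finished pull requests that were closed without merge. The function name `summarize_counts` and the output keys are hypothetical, and which ratios are meaningful for your project is a judgment call.

```python
import json

# Sample response copied from the example above.
SAMPLE_RESPONSE = """
{
  "data": {
    "repository": {
      "totalIssueCount": { "totalCount": 2991 },
      "openIssueCount": { "totalCount": 43 },
      "closedIssueCount": { "totalCount": 2948 },
      "openPullRequestCount": { "totalCount": 24 },
      "closedPullRequestCount": { "totalCount": 5448 },
      "mergedPullRequestCount": { "totalCount": 10404 }
    }
  }
}
"""


def summarize_counts(response: dict) -> dict:
    """Turn the aliased totalCount fields into a few simple ratios."""
    repo = response["data"]["repository"]
    counts = {alias: field["totalCount"] for alias, field in repo.items()}

    total_issues = counts["totalIssueCount"]
    # Pull requests that are no longer open: closed without merge, or merged.
    finished_prs = counts["closedPullRequestCount"] + counts["mergedPullRequestCount"]

    return {
        "open_issue_share": counts["openIssueCount"] / total_issues if total_issues else 0.0,
        "closed_without_merge_share": (
            counts["closedPullRequestCount"] / finished_prs if finished_prs else 0.0
        ),
    }


summary = summarize_counts(json.loads(SAMPLE_RESPONSE))
```

Because each query result is a point-in-time snapshot, storing `summary` alongside a timestamp on every run is what lets a time-series tool chart the trend.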
6096

61-
More complex GraphQL queries are possible as well. For example, this query:
97+
The last activity query may be better suited to periodic audits for inactive repositories than to continuous time-series graphing. It looks like this:
6298

6399
```graphql
64-
repository(owner:"voxpupuli",name:"puppetboard") {
65-
pullRequests(states:OPEN) { totalCount }
100+
query LastActivity {
101+
repository(owner: "github", name: "docs") {
102+
updatedAt
103+
lastestCreatedDiscussion: discussions(
104+
last: 1
105+
orderBy: { field: CREATED_AT, direction: ASC }
106+
) {
107+
nodes {
108+
createdAt
109+
}
110+
}
111+
latestAnsweredDiscussion: discussions(
112+
last: 1
113+
orderBy: { field: UPDATED_AT, direction: ASC }
114+
answered: true
115+
) {
116+
nodes {
117+
updatedAt
118+
}
119+
}
120+
lastestPullRequest: pullRequests(
121+
last: 1
122+
orderBy: { field: CREATED_AT, direction: ASC }
123+
) {
124+
nodes {
125+
createdAt
126+
}
127+
}
128+
lastestIssue: issues(
129+
last: 1
130+
orderBy: { field: CREATED_AT, direction: ASC }
131+
) {
132+
nodes {
133+
createdAt
134+
}
135+
}
136+
lastestCommit: pushedAt
137+
latestRelease {
138+
createdAt
66139
}
140+
}
141+
}
67142
```
68143

69-
Returns the number of open pull requests. Other possible states are `CLOSED` and `MERGED`. Tracking these over time is a key indicator of activity in the repository.
144+
It returns a structure like the following, with output lines joined here for brevity:
145+
146+
```json
147+
{
148+
"data": {
149+
"repository": {
150+
"updatedAt": "2023-12-05T23:44:24Z",
151+
"lastestCreatedDiscussion": { "nodes": [ { "createdAt": "2023-11-24T12:18:22Z" } ] },
152+
"latestAnsweredDiscussion": { "nodes": [ { "updatedAt": "2023-11-16T09:13:07Z" } ] },
153+
"lastestPullRequest": { "nodes": [ { "createdAt": "2023-12-05T23:35:24Z" } ] },
154+
"lastestIssue": { "nodes": [ { "createdAt": "2023-12-05T04:20:22Z" } ] },
155+
"lastestCommit": "2023-12-05T23:35:24Z",
156+
"latestRelease": { "createdAt": "2023-02-14T14:35:19Z" }
157+
}
158+
}
159+
}
160+
```
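For an audit, the individual timestamps usually need to be reduced to a single "most recent activity" date. The Python sketch below is one way to do that, under stated assumptions: it treats every ISO 8601 string in the response as an activity timestamp, and the 365-day staleness threshold is purely illustrative. The helper names are hypothetical.

```python
from datetime import datetime, timezone

# Sample response copied from the example above (aliases kept as returned).
SAMPLE_RESPONSE = {
    "data": {
        "repository": {
            "updatedAt": "2023-12-05T23:44:24Z",
            "lastestCreatedDiscussion": {"nodes": [{"createdAt": "2023-11-24T12:18:22Z"}]},
            "latestAnsweredDiscussion": {"nodes": [{"updatedAt": "2023-11-16T09:13:07Z"}]},
            "lastestPullRequest": {"nodes": [{"createdAt": "2023-12-05T23:35:24Z"}]},
            "lastestIssue": {"nodes": [{"createdAt": "2023-12-05T04:20:22Z"}]},
            "lastestCommit": "2023-12-05T23:35:24Z",
            "latestRelease": {"createdAt": "2023-02-14T14:35:19Z"},
        }
    }
}


def iter_timestamps(node):
    """Yield every parseable ISO 8601 timestamp found anywhere in the response."""
    if isinstance(node, dict):
        for value in node.values():
            yield from iter_timestamps(value)
    elif isinstance(node, list):
        for item in node:
            yield from iter_timestamps(item)
    elif isinstance(node, str):
        try:
            # Replace the trailing "Z" so older Python versions can parse it.
            yield datetime.fromisoformat(node.replace("Z", "+00:00"))
        except ValueError:
            pass  # Not a timestamp; ignore it.
    # None (for example, a repository with no releases) falls through silently.


def last_activity(response: dict):
    """Return the most recent timestamp in the response, or None if there are none."""
    return max(iter_timestamps(response["data"]["repository"]), default=None)


def is_stale(response: dict, now: datetime, max_idle_days: int = 365) -> bool:
    """Flag a repository whose latest activity is older than max_idle_days."""
    latest = last_activity(response)
    return latest is None or (now - latest).days > max_idle_days
```

Passing `now` explicitly, for example `datetime.now(timezone.utc)`, keeps the check easy to test and makes the audit date visible in the calling code.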
70161

71162
## Other Graphing Options
72163
