Commit e7eb068

Clean up repo analysis, refactor, and add more tools (#6963)
Update the script to include more repo analysis scripts and organize them under a subdirectory, since more scripts will be added over time. These tools were used for a one-off investigation of org repos, but they are checked in in case they turn out to be more widely useful.
1 parent 140dc0d commit e7eb068

File tree

9 files changed

+2576
-182
lines changed

tools/analytics/org/.gitignore

Lines changed: 8 additions & 0 deletions
@@ -0,0 +1,8 @@
+# Stores cached data for GitHub API responses
+cache/
+
+# Gets temporarily created by the script
+scale-config.yml
+
+# Stores the output of the analysis
+reports/

tools/analytics/org/README.md

Lines changed: 38 additions & 0 deletions
@@ -0,0 +1,38 @@
+# Organization Analytics Tools
+
+This directory contains a collection of scripts designed to analyze GitHub Actions runner usage and other organizational metrics across a GitHub organization's repositories.
+
+## Overview
+
+The tools in this directory help us understand how GitHub Actions runners are being utilized across our repositories.
+
+## Scripts
+
+### `analyze_runner_usage.py`
+
+**Purpose**: Analyzes GitHub Actions runner label usage across all repositories in a specified GitHub organization.
+
+**Key Features**:
+- Fetches all non-archived repositories in a GitHub organization
+- Extracts runner labels used in workflow jobs from recent workflow runs
+- Aggregates runner usage statistics across repositories
+- Compares runner labels against those defined in `scale-config.yml` and standard GitHub-hosted runners
+- Identifies unused or undefined runners
+- Generates comprehensive usage reports
+
+**Output**: Creates `runner_labels_summary.yml` with detailed analytics including:
+- Runner usage by repository
+- Repository usage by runner type
+- Repositories with zero workflow runs
+- Runners not defined in scale-config or standard GitHub runners
+- Usage patterns and trends
+
+### `cache_manager.py`
+
+**Purpose**: Helper script. Provides efficient caching functionality for GitHub API responses to optimize performance and avoid rate limiting.
+
+**Features**:
+- URL-based cache key generation
+- Intelligent cache invalidation
+- Rate limit optimization
+- Reduces redundant API calls during analysis
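The aggregation that the README attributes to `analyze_runner_usage.py` might look roughly like the sketch below. It is not taken from the commit: the function name `summarize_runner_labels`, the input shape (repository name mapped to the runner labels seen in its recent workflow jobs), the output keys, and the sample labels are all assumptions made for illustration. It shows the two views listed under **Output** (runners by repository and repositories by runner), plus repositories with no runs and labels that are undefined or unused relative to a known set.

```python
from collections import defaultdict


def summarize_runner_labels(labels_by_repo, known_labels):
    """Aggregate runner labels per repository (hypothetical helper).

    labels_by_repo: dict mapping repo name -> list of runner labels seen
        in that repo's recent workflow jobs (empty list = no runs found).
    known_labels: set of labels treated as defined, e.g. those from
        scale-config.yml plus standard GitHub-hosted runner labels.
    """
    repos_by_runner = defaultdict(set)
    runners_by_repo = {}
    repos_without_runs = []

    for repo, labels in labels_by_repo.items():
        if not labels:
            repos_without_runs.append(repo)
            continue
        unique = sorted(set(labels))  # drop duplicates within one repo
        runners_by_repo[repo] = unique
        for label in unique:
            repos_by_runner[label].add(repo)

    used = set(repos_by_runner)
    return {
        "runners_by_repo": runners_by_repo,
        "repos_by_runner": {k: sorted(v) for k, v in sorted(repos_by_runner.items())},
        "repos_without_runs": sorted(repos_without_runs),
        # Labels seen in workflows but missing from the known set.
        "undefined_runners": sorted(used - known_labels),
        # Labels in the known set that no repository used.
        "unused_runners": sorted(known_labels - used),
    }


if __name__ == "__main__":
    # Sample data is made up; the label names are placeholders.
    example = {
        "org/repo-a": ["ubuntu-latest", "linux.2xlarge", "ubuntu-latest"],
        "org/repo-b": ["my-custom-runner"],
        "org/repo-c": [],
    }
    known = {"ubuntu-latest", "linux.2xlarge", "linux.4xlarge"}
    summary = summarize_runner_labels(example, known)
    print(summary["undefined_runners"])
    print(summary["unused_runners"])
```

Keeping the aggregation as a pure function over already-fetched data separates it from the GitHub API calls, which makes it easy to test without network access.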

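The "URL-based cache key generation" feature listed for `cache_manager.py` can be illustrated with a minimal sketch. This is an assumption about one reasonable approach, not the commit's implementation: the helper names (`cache_key_for_url`, `put_cached`, `get_cached`), the SHA-256 key scheme, the JSON file layout, and the age-based expiry are all hypothetical. The README does not say how cache invalidation works, so a simple maximum age stands in for it here.

```python
import hashlib
import json
import tempfile
import time
from pathlib import Path


def cache_key_for_url(url):
    """Derive a filesystem-safe cache key from a request URL (assumed scheme)."""
    return hashlib.sha256(url.encode("utf-8")).hexdigest()


def put_cached(cache_dir, url, data):
    """Store a JSON-serializable response body under the URL's key."""
    cache_dir = Path(cache_dir)
    cache_dir.mkdir(parents=True, exist_ok=True)
    entry = {"url": url, "stored_at": time.time(), "data": data}
    (cache_dir / f"{cache_key_for_url(url)}.json").write_text(json.dumps(entry))


def get_cached(cache_dir, url, max_age_seconds=3600):
    """Return the cached body, or None if missing or older than max_age_seconds."""
    path = Path(cache_dir) / f"{cache_key_for_url(url)}.json"
    if not path.exists():
        return None
    entry = json.loads(path.read_text())
    if time.time() - entry["stored_at"] > max_age_seconds:
        return None  # stale entry: the caller should refetch
    return entry["data"]


if __name__ == "__main__":
    # Uses a temporary directory in place of the cache/ folder; no network calls.
    demo_dir = tempfile.mkdtemp()
    demo_url = "https://api.github.com/orgs/example/repos?page=1"
    put_cached(demo_dir, demo_url, [{"name": "demo-repo"}])
    print(get_cached(demo_dir, demo_url))
```

Hashing the full URL, query string included, means each page of a paginated API response gets its own cache entry.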