Commit 2b53c9b

Merge pull request #34 from kalverra/docs
Allow Collecting Currently Running Workflows
2 parents 4fcc159 + 513eff1 commit 2b53c9b

21 files changed: +505 -119 lines changed

AGENTS.md

Lines changed: 10 additions & 7 deletions
@@ -4,12 +4,12 @@
 
 Octometrics is a Go CLI that profiles GitHub Actions workflows. Read `design.md` for architecture diagrams and key design decisions. The main commands are:
 
-| Command | Purpose |
-|---------|---------|
-| `monitor` | Collects system metrics (CPU, memory, disk, I/O) during a GHA job, writes JSONL |
-| `gather` | Fetches workflow/job/step data from the GitHub REST & GraphQL APIs, stores as JSON |
-| `observe` | Renders gathered data as interactive HTML (Mermaid Gantt charts, Plotly metric charts) |
-| `report` | Analyzes monitor JSONL and posts Mermaid-based summaries to GHA step summaries and PR comments |
+| Command | Purpose |
+| --------- | ---------------------------------------------------------------------------------------------- |
+| `monitor` | Collects system metrics (CPU, memory, disk, I/O) during a GHA job, writes JSONL |
+| `gather` | Fetches workflow/job/step data from the GitHub REST & GraphQL APIs, stores as JSON |
+| `observe` | Renders gathered data as interactive HTML (Mermaid Gantt charts, Plotly metric charts) |
+| `report` | Analyzes monitor JSONL and posts Mermaid-based summaries to GHA step summaries and PR comments |
 
 Key packages: `cmd/` (Cobra CLI), `monitor/` (system metrics), `gather/` (GitHub API), `observe/` (HTML visualization), `report/` (in-action reporting), `internal/config/` (Viper config), `logging/` (zerolog setup).
 
@@ -29,6 +29,7 @@ Analyze the outputs and fix issues you introduced. **Do not change a test unless
 - Tests use `github.com/stretchr/testify` (`require` for fatal checks, `assert` for non-fatal).
 - Use the `internal/testhelpers.Setup(t)` helper to create a temp directory and logger for tests. It auto-cleans on success and preserves on failure.
 - Test data goes in `<package>/testdata/` directories.
+- You can run `pre-commit` using the `.pre-commit-config.yaml` file for extensive checks.
 
 ## Coding Conventions
 
@@ -49,11 +50,13 @@ Provide a **Risk Rating** at the top of the review summary:
 - **LOW:** Documentation, styling, minor bug fixes in non-critical paths, or boilerplate.
 
 ### 2. Targeted Review Areas
-Identify specific code blocks that require **scrupulous human review**. Focus on:
+Identify and call out specific code blocks that require **scrupulous human review**. Focus on:
 - Complex conditional logic or concurrency-prone areas.
 - Potential breaking changes in internal or external APIs.
 - Logic that lacks sufficient unit test coverage within the PR.
 
+If you find any, list them and give a brief description of why they deserve extra attention.
+
 ### 3. Reviewer Recommendations
 Analyze the git history (recent editors) to suggest the most qualified reviewers.
 - Prioritize individuals who have made significant recent contributions to the specific files modified.

README.md

Lines changed: 17 additions & 8 deletions
@@ -2,24 +2,33 @@
 
 A simple CLI tool to visualize and profile your GitHub Actions workflows. See all the processes that run as part of a PR, workflow, or job in a simple, interactive chart. It can also run [directly in your GitHub Actions flow](https://github.com/kalverra/octometrics-action), useful for debugging changes and performance issues.
 
-![Example PR run](example.png)
+![Example PR run](./pr-example.png)
 
 ## Run
 
+Before running, make sure to provide GitHub API token, either through the `GITHUB_TOKEN` env var, or the `-t` flag.
+
 ```sh
+# Install
+go install github.com/kalverra/octometrics@latest
+
 # Show help menu
-go run . -h
-```
+octometrics -h
 
-## Monitor
+# To see all workflows run on all commits a part of this PR (including merge queue runs): https://github.com/kalverra/octometrics/pull/33
+octometrics gather -o kalverra -r octometrics -p 33
 
-This will launch a background process to monitor stats like CPU and memory usage. This can be run on GHA runners so that when you later `gather` and `observe` the data, you will also have detailed profiling info.
+# To see all workflows run on a specific commit: https://github.com/kalverra/octometrics/pull/33/changes/94ad3f7e2f45852a99791326847ea12c94b964dc
+octometrics gather -o kalverra -r octometrics -c 94ad3f7e2f45852a99791326847ea12c94b964dc
 
-```sh
-go run . monitor
+# To see a specific workflow run: https://github.com/kalverra/octometrics/actions/runs/22918636165
+octometrics gather -o kalverra -r octometrics -w 22918636165
+
+# Use '-u' to force update local data if it already exists
+octometrics gather -o kalverra -r octometrics -p 33 -u
 ```
 
-### GitHub Action
+## GitHub Action
 
 Run `monitor` directly in your GitHub action and it will post performance data as a comment and summary to the action run. [See the octometrics-action](https://github.com/kalverra/octometrics-action).
 

cmd/gather.go

Lines changed: 20 additions & 0 deletions
@@ -19,6 +19,26 @@ var (
 var gatherCmd = &cobra.Command{
 	Use: "gather",
 	Short: "Gather metrics from GitHub",
+	Long: `Gather metrics from GitHub.
+
+Read workflow runtime data from GitHub to display in the browser.
+
+It can be used to gather data for a specific workflow run, pull request, or commit.
+In-progress workflows are supported and will be displayed with an active status indicator.
+`,
+	Example: `
+# To see all workflows run on all commits a part of this PR (including merge queue runs): https://github.com/kalverra/octometrics/pull/33
+octometrics gather -o kalverra -r octometrics -p 33
+
+# To see all workflows run on a specific commit: https://github.com/kalverra/octometrics/pull/33/changes/94ad3f7e2f45852a99791326847ea12c94b964dc
+octometrics gather -o kalverra -r octometrics -c 94ad3f7e2f45852a99791326847ea12c94b964dc
+
+# To see a specific workflow run: https://github.com/kalverra/octometrics/actions/runs/22918636165
+octometrics gather -o kalverra -r octometrics -w 22918636165
+
+# Use '-u' to force update local data if it already exists
+octometrics gather -o kalverra -r octometrics -p 33 -u
+`,
 	PreRunE: func(_ *cobra.Command, _ []string) error {
 		if err := cfg.ValidateGather(); err != nil {
 			return err

cmd/monitor.go

Lines changed: 10 additions & 1 deletion
@@ -24,7 +24,16 @@ var (
 var monitorCmd = &cobra.Command{
 	Use: "monitor",
 	Short: "Monitor system resources",
-	Long: "Monitor system resources and save the data to a file for later analysis.",
+	Long: `Monitor system resources for later analysis.
+
+This command will monitor system resources like CPU, memory, disk, and I/O during a GHA job.
+
+It will write the data to a file for later analysis. Primarily used in the octometrics-action to monitor system resources during a GHA job.`,
+	Example: `
+octometrics monitor # Monitor system resources until interrupted
+octometrics monitor --duration=1h # Monitor system resources for 1 hour
+octometrics monitor --interval=5s # Monitor system resources every 5 seconds
+`,
 	RunE: func(_ *cobra.Command, _ []string) error {
 		var (
 			ctx context.Context

cmd/observe.go

Lines changed: 4 additions & 0 deletions
@@ -13,6 +13,10 @@ import (
 var observeCmd = &cobra.Command{
 	Use: "observe",
 	Short: "Observe metrics from GitHub",
+	Long: `Observe metrics from GitHub.
+
+Display the gathered Workflow/Job/Step data in your browser.`,
+	Example: `octometrics observe # Display all of your gathered Workflow/Job/Step data in your browser`,
 	PreRunE: func(_ *cobra.Command, _ []string) error {
 		var err error
 		githubClient, err = gather.NewGitHubClient(logger, cfg.GitHubToken, nil)

example.png

-137 KB
Binary file not shown.

gather/commit.go

Lines changed: 23 additions & 20 deletions
@@ -290,26 +290,25 @@ func setWorkflowRunsForCommit(
 	)
 
 	for _, checkRun := range checkRuns {
-		if checkRun.GetStatus() == "completed" {
-			match := workflowRunIDRe.FindStringSubmatch(checkRun.GetHTMLURL())
-			if len(match) == 0 {
-				log.Warn().
-					Str("owner", owner).
-					Str("repo", repo).
-					Str("SHA", commitData.GetSHA()).
-					Str("check_run", checkRun.GetName()).
-					Str("URL", checkRun.GetHTMLURL()).
-					Msg("Failed to parse workflow run ID from check run URL")
-				continue
-			}
-			workflowRunID, err := strconv.ParseInt(match[1], 10, 64)
-			if err != nil {
-				return fmt.Errorf("failed to parse workflow run ID from check run URL: %w", err)
-			}
-			workflowRunIDsSet[workflowRunID] = struct{}{}
-		} else {
-			log.Warn().Str("Check Run", checkRun.GetName()).Msg("Check run is not yet completed, skipping")
+		if checkRun.GetStatus() != "completed" {
+			log.Warn().Str("check_run", checkRun.GetName()).Msg("Check run is not yet completed")
+		}
+		match := workflowRunIDRe.FindStringSubmatch(checkRun.GetHTMLURL())
+		if len(match) == 0 {
+			log.Warn().
+				Str("owner", owner).
+				Str("repo", repo).
+				Str("SHA", commitData.GetSHA()).
+				Str("check_run", checkRun.GetName()).
+				Str("URL", checkRun.GetHTMLURL()).
+				Msg("Failed to parse workflow run ID from check run URL")
+			continue
 		}
+		workflowRunID, err := strconv.ParseInt(match[1], 10, 64)
+		if err != nil {
+			return fmt.Errorf("failed to parse workflow run ID from check run URL: %w", err)
+		}
+		workflowRunIDsSet[workflowRunID] = struct{}{}
 	}
 
 	// Pass commit data down to the workflow run
@@ -323,7 +322,11 @@ func setWorkflowRunsForCommit(
 			}
 			commitData.comparisonMutex.Lock()
 			defer commitData.comparisonMutex.Unlock()
-			commitData.Conclusion = establishPRChecksConclusion(commitData.Conclusion, workflowRun.GetConclusion())
+			conclusion := workflowRun.GetConclusion()
+			if conclusion == "" {
+				conclusion = workflowRun.GetStatus()
+			}
+			commitData.Conclusion = establishPRChecksConclusion(commitData.Conclusion, conclusion)
 			commitData.Cost += workflowRun.GetCost()
 			if workflowRun.GetRunStartedAt().Before(commitData.StartActionsTime) ||
 				commitData.StartActionsTime.IsZero() {
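
The second hunk above falls back to the run's status when its conclusion is empty, which is the case for a run that has not finished yet. A minimal standalone sketch of that fallback follows. The helper name `effectiveConclusion` is hypothetical and does not appear in the commit, and plain strings stand in for the values returned by `GetConclusion()` and `GetStatus()`.

```go
package main

import "fmt"

// effectiveConclusion is a hypothetical helper (not part of the commit).
// It returns the conclusion when one is set, and otherwise falls back to
// the status, mirroring the inline fallback added in this hunk.
func effectiveConclusion(conclusion, status string) string {
	if conclusion == "" {
		return status
	}
	return conclusion
}

func main() {
	// A finished run keeps its conclusion.
	fmt.Println(effectiveConclusion("success", "completed"))
	// An unfinished run has no conclusion yet, so its status is used.
	fmt.Println(effectiveConclusion("", "in_progress"))
}
```

With this fallback, the value passed to `establishPRChecksConclusion` is never empty for an in-progress run, as long as the run reports a status.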

gather/workflow_run.go

Lines changed: 47 additions & 32 deletions
@@ -125,7 +125,9 @@ func (w *WorkflowRunData) GetUsage() *github.WorkflowRunUsage {
 	return w.Usage
 }
 
-// WorkflowRun gathers all metrics for a completed workflow run
+// WorkflowRun gathers all metrics for a workflow run.
+// In-progress runs are supported: billing and monitoring data are skipped,
+// and the result is not cached locally so fresh data is always fetched.
 func WorkflowRun(
 	log zerolog.Logger,
 	client *GitHubClient,
@@ -207,8 +209,11 @@ func WorkflowRun(
 	if workflowRun == nil {
 		return nil, "", fmt.Errorf("workflow run '%d' not found on GitHub", workflowRunID)
 	}
-	if workflowRun.GetStatus() != "completed" {
-		return nil, "", fmt.Errorf("workflow run '%d' is still in progress", workflowRunID)
+	completed := workflowRun.GetStatus() == "completed"
+	if !completed {
+		log.Warn().
+			Str("status", workflowRun.GetStatus()).
+			Msg("Workflow run is not yet completed; billing and monitoring data will be unavailable")
 	}
 
 	workflowRunData.WorkflowRun = workflowRun
@@ -220,24 +225,26 @@ func WorkflowRun(
 		analyses []*monitor.Analysis
 	)
 
-	eg.Go(func() error {
-		var analysisErr error
-		analyses, analysisErr = monitoringData(log, client, owner, repo, workflowRunID, targetDir)
-		return analysisErr
-	})
+	if completed {
+		eg.Go(func() error {
+			var analysisErr error
+			analyses, analysisErr = monitoringData(log, client, owner, repo, workflowRunID, targetDir)
+			return analysisErr
+		})
+
+		eg.Go(func() error {
+			var billingErr error
+			workflowBillingData, billingErr = billingData(client, owner, repo, workflowRunID)
+			return billingErr
+		})
+	}
 
 	eg.Go(func() error {
 		var jobsErr error
 		workflowRunJobs, jobsErr = jobsData(client, owner, repo, workflowRunID)
 		return jobsErr
 	})
 
-	eg.Go(func() error {
-		var billingErr error
-		workflowBillingData, billingErr = billingData(client, owner, repo, workflowRunID)
-		return billingErr
-	})
-
 	if err := eg.Wait(); err != nil {
 		return nil, "", fmt.Errorf(
 			"failed to collect job, billing, and/or monitoring data for workflow run '%d': %w",
@@ -247,19 +254,26 @@ func WorkflowRun(
 	}
 	workflowRunData.Usage = workflowBillingData
 
-	// Calculate job cost data and add to workflow run data
 	for _, job := range workflowRunJobs {
-		// Calculate completed at for the workflow. GitHub API only gives "UpdatedAt" for workflows
-		// which can be misleading.
-		if workflowRunData.RunCompletedAt.IsZero() {
-			workflowRunData.RunCompletedAt = job.GetCompletedAt().Time
-		} else if job.GetCompletedAt().After(workflowRunData.RunCompletedAt) {
-			workflowRunData.RunCompletedAt = job.GetCompletedAt().Time
+		completedAt := job.GetCompletedAt().Time
+		if !completedAt.IsZero() {
+			if workflowRunData.RunCompletedAt.IsZero() {
+				workflowRunData.RunCompletedAt = completedAt
+			} else if completedAt.After(workflowRunData.RunCompletedAt) {
+				workflowRunData.RunCompletedAt = completedAt
+			}
 		}
 
-		runner, cost, err := calculateJobRunBilling(job.GetID(), workflowBillingData)
-		if err != nil {
-			return nil, "", fmt.Errorf("failed to calculate cost for job '%d': %w", job.GetID(), err)
+		var (
+			runner string
+			cost int64
+		)
+		if completed {
+			var billingErr error
+			runner, cost, billingErr = calculateJobRunBilling(job.GetID(), workflowBillingData)
+			if billingErr != nil {
+				return nil, "", fmt.Errorf("failed to calculate cost for job '%d': %w", job.GetID(), billingErr)
+			}
 		}
 		workflowRunData.Cost += cost
 		workflowRunData.Jobs = append(workflowRunData.Jobs, &JobData{
@@ -269,16 +283,17 @@ func WorkflowRun(
 		})
 	}
 
-	// Match monitoring data to jobs
-nextAnalysisLoop:
-	for _, analysis := range analyses {
-		for _, job := range workflowRunData.Jobs {
-			if analysis.JobName == job.GetName() {
-				job.Analysis = analysis
-				continue nextAnalysisLoop
+	if completed {
+	nextAnalysisLoop:
+		for _, analysis := range analyses {
+			for _, job := range workflowRunData.Jobs {
+				if analysis.JobName == job.GetName() {
+					job.Analysis = analysis
+					continue nextAnalysisLoop
+				}
 			}
+			log.Warn().Str("monitoring_data_job_name", analysis.JobName).Msg("Found monitoring data for job but found no job name matches")
 		}
-		log.Warn().Str("monitoring_data_job_name", analysis.JobName).Msg("Found monitoring data for job but found no job name matches")
 	}
 
 	data, err := json.Marshal(workflowRunData)
