
chore(6993): Add Historical Comparison to UI Startup Metrics#40237

Open
DDDDDanica wants to merge 6 commits into main from chore/6993-historical-comparison


@DDDDDanica DDDDDanica commented Feb 19, 2026

Description

The Performance Benchmarks section of the MM benchmark PR comment currently has no historical comparison and shows raw numbers with no baseline. The Dapp Page Load benchmark already has a working pattern for this: results are pushed to MetaMask/extension_benchmark_stats on every merge, and the PR comment fetches and displays a comparison against the historical mean.

This PR applies the same historical data pattern to all Performance Benchmarks (Startup, Interaction, and User Journey).

Data collection pipeline:

  • Extends benchmark-stats-commit.sh with a ui-startup mode that assembles benchmark results from all platform matrix jobs into a structured JSON payload and commits it to extension_benchmark_stats
  • Stores data per-branch (stats/release-12.x/, stats/main/) so PRs compare against the correct release branch baseline
  • Startup presets (startupStandardHome, startupPowerUserHome) are stored for all browser/buildType combinations under a pageLoad group; interaction (interactionUserActions) and user journey (userJourney*) presets store chrome-browserify only
  • Consolidates all benchmark stats commits (Dapp page load + Performance Benchmarks) into a single store-benchmark-stats job in main.yml, fixing a pre-existing issue where EXTENSION_BENCHMARK_STATS_TOKEN was never wired to the jobs that needed it
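The storage rule in the list above (startup presets kept for every browser/buildType combination, all other presets kept only for chrome-browserify) could be sketched roughly as follows. This is an illustration only: the real assembly happens in benchmark-stats-commit.sh, and the `BenchmarkArtifact` shape, the `assemblePayload` name, and the output layout are assumptions, not the PR's actual types.

```typescript
// Hypothetical artifact shape; the PR's real types live in shared/constants/benchmarks.
type BenchmarkArtifact = {
  browser: string; // e.g. "chrome" or "firefox"
  buildType: string; // e.g. "browserify" or "webpack"
  preset: string; // e.g. "startupStandardHome"
  results: Record<string, number>; // metric name -> value
};

// platform -> preset -> metric -> value (assumed layout)
type Payload = Record<string, Record<string, Record<string, number>>>;

// Preset names taken from the PR description.
const STARTUP_PRESETS = ['startupStandardHome', 'startupPowerUserHome'];
const CANONICAL_PLATFORM = 'chrome-browserify';

function assemblePayload(artifacts: BenchmarkArtifact[]): Payload {
  const payload: Payload = {};
  for (const artifact of artifacts) {
    const platform = `${artifact.browser}-${artifact.buildType}`;
    const isStartup = STARTUP_PRESETS.indexOf(artifact.preset) !== -1;
    // Non-startup presets are only stored for the canonical platform.
    if (!isStartup && platform !== CANONICAL_PLATFORM) {
      continue;
    }
    if (payload[platform] === undefined) {
      payload[platform] = {};
    }
    payload[platform][artifact.preset] = artifact.results;
  }
  return payload;
}
```

With this rule, a firefox-webpack run of interactionUserActions is dropped, while a firefox-webpack run of startupStandardHome is kept.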

Historical fetch module (historical-comparison.ts):

  • Fetches ui_startup_data.json for the PR's target branch (GITHUB_BASE_REF)
  • When a branch has no data yet (e.g. newly cut release branch), automatically falls back to the most recently populated release branch by semver order
  • Produces a HistoricalMeanReference map (benchmark → metric → mean) ready to be consumed by the traffic light / regression detection layer in a follow-up PR
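A minimal sketch of the two steps described above: branch fallback, then mean aggregation. It is not the code in historical-comparison.ts. The function names, the `CommitEntry` shape, and the assumed release branch naming (`release/X.Y.Z` or `release-X.Y.x`) are illustrative assumptions; only the `HistoricalMeanReference` name and its benchmark → metric → mean shape come from the description.

```typescript
// benchmark -> metric -> mean (shape taken from the PR description)
type HistoricalMeanReference = Record<string, Record<string, number>>;
// Assumed per-commit shape: benchmark -> metric -> value
type CommitEntry = Record<string, Record<string, number>>;

// Assumption: release branches look like "release/13.0.0" or "release-12.x".
function parseReleaseVersion(branch: string): number[] | null {
  const match = /^release[\/-](\d+)\.(\d+|x)(?:\.(\d+|x))?$/.exec(branch);
  if (!match) {
    return null;
  }
  const toNumber = (part: string | undefined): number =>
    part === undefined || part === 'x' ? 0 : Number(part);
  return [Number(match[1]), toNumber(match[2]), toNumber(match[3])];
}

function compareVersions(a: number[], b: number[]): number {
  for (let i = 0; i < 3; i += 1) {
    if (a[i] !== b[i]) {
      return a[i] - b[i];
    }
  }
  return 0;
}

// Use the target branch if it has data; otherwise the highest-versioned
// release branch that does. Returns null when nothing usable exists.
function pickBranch(target: string, populated: string[]): string | null {
  if (populated.indexOf(target) !== -1) {
    return target;
  }
  let best: { name: string; version: number[] } | null = null;
  for (const name of populated) {
    const version = parseReleaseVersion(name);
    if (version && (best === null || compareVersions(version, best.version) > 0)) {
      best = { name, version };
    }
  }
  return best === null ? null : best.name;
}

// Average every metric across all commits in the history file.
function toMeanReference(history: Record<string, CommitEntry>): HistoricalMeanReference {
  const sums: Record<string, Record<string, { total: number; count: number }>> = {};
  for (const commit of Object.keys(history)) {
    const entry = history[commit];
    for (const benchmark of Object.keys(entry)) {
      const bucket = sums[benchmark] || (sums[benchmark] = {});
      for (const metric of Object.keys(entry[benchmark])) {
        const slot = bucket[metric] || (bucket[metric] = { total: 0, count: 0 });
        slot.total += entry[benchmark][metric];
        slot.count += 1;
      }
    }
  }
  const means: HistoricalMeanReference = {};
  for (const benchmark of Object.keys(sums)) {
    means[benchmark] = {};
    for (const metric of Object.keys(sums[benchmark])) {
      const { total, count } = sums[benchmark][metric];
      means[benchmark][metric] = total / count;
    }
  }
  return means;
}
```

Under these assumptions, a PR targeting a freshly cut release/13.1.0 with no data would be compared against release/13.0.0 if that is the highest populated release branch.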

Changelog

CHANGELOG entry: null

Related issues

Fixes: https://github.com/MetaMask/MetaMask-planning/issues/6993

Manual testing steps

  1. Go to this page...

Screenshots/Recordings

Before

After

Pre-merge author checklist

Pre-merge reviewer checklist

  • I've manually tested the PR (e.g. pull and build branch, run the app, test code being changed).
  • I confirm that this PR addresses all acceptance criteria described in the ticket it closes and includes the necessary testing evidence, such as recordings and/or screenshots.

Note

Medium Risk
Modifies CI benchmark workflows and the script that commits results to an external stats repo, so misconfiguration could stop benchmarks from being recorded or break PR benchmark comments, but it does not affect production runtime behavior.

Overview
Adds a per-branch benchmark stats publishing pipeline to MetaMask/extension_benchmark_stats. The updated benchmark-stats-commit.sh now supports dapp-page-load and performance modes, writes to stats/<sanitized-branch>/page_load_data.json and performance_data.json, and aggregates multiple benchmark JSON artifacts (keeping page-load presets across all browser/build combos while storing other presets only for canonical chrome-browserify).

Updates GitHub Actions to rename the dapp page-load workflow/artifact/output file, upload all benchmark JSONs as short-lived artifacts, and introduce a store-benchmark-stats job on main/release/* pushes that downloads artifacts and runs the commit script twice (performance + dapp page-load). Also adds a new TS module + tests for fetching/aggregating historical performance data with release-branch fallback logic, and centralizes benchmark result types in shared/constants/benchmarks (updating imports accordingly).
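For illustration, the per-branch file layout described in this overview could be derived as sketched below. The sanitization rule (replace every character outside `[A-Za-z0-9._-]` with `-`) and the mode-to-file mapping are assumptions; the overview only states the `stats/<sanitized-branch>/` pattern, the two file names, and the two modes, and the actual logic lives in benchmark-stats-commit.sh.

```typescript
type StatsMode = 'dapp-page-load' | 'performance';

// Assumption: any character outside [A-Za-z0-9._-] becomes "-",
// so "release/13.0.0" maps to "release-13.0.0".
function sanitizeBranch(branch: string): string {
  return branch.replace(/[^A-Za-z0-9._-]/g, '-');
}

// Assumed mapping: "performance" mode -> performance_data.json,
// "dapp-page-load" mode -> page_load_data.json.
function statsFilePath(branch: string, mode: StatsMode): string {
  const file = mode === 'performance' ? 'performance_data.json' : 'page_load_data.json';
  return `stats/${sanitizeBranch(branch)}/${file}`;
}
```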

Written by Cursor Bugbot for commit 89c199c. This will update automatically on new commits.

@DDDDDanica DDDDDanica self-assigned this Feb 19, 2026
@DDDDDanica DDDDDanica added the team-extension-platform Extension Platform team label Feb 19, 2026
@github-actions

CLA Signature Action: All authors have signed the CLA. You may need to manually re-run the blocking PR check if it doesn't pass in a few minutes.

@@ -17,9 +17,6 @@ on:
required: true
type: string
description: The run ID to get builds from
secrets:
EXTENSION_BENCHMARK_STATS_TOKEN:
Contributor Author

page-load-benchmark.yml no longer touches the stats repo at all. The token is now used directly in the store-benchmark-stats job inside main.yml, which is a regular job (not a reusable workflow) and has direct access to secrets.* without any declaration needed.

@DDDDDanica DDDDDanica force-pushed the chore/6993-historical-comparison branch from c463bb8 to ac1a803 Compare February 19, 2026 14:26
@DDDDDanica DDDDDanica force-pushed the chore/6993-historical-comparison branch from ac1a803 to 1ef3159 Compare February 20, 2026 13:19
@DDDDDanica DDDDDanica changed the base branch from chore/6993-announcement to main February 20, 2026 13:33
@metamaskbot metamaskbot added the INVALID-PR-TEMPLATE PR's body doesn't match template label Feb 20, 2026
@github-actions github-actions bot added size-XL and removed size-L labels Feb 24, 2026
@DDDDDanica DDDDDanica force-pushed the chore/6993-historical-comparison branch from bfa8232 to fc7cbbb Compare February 24, 2026 15:28
@DDDDDanica DDDDDanica force-pushed the chore/6993-historical-comparison branch from fc7cbbb to e27615d Compare February 24, 2026 15:32
@DDDDDanica DDDDDanica changed the base branch from main to fix/6993-announcement-comments February 24, 2026 15:34
@DDDDDanica DDDDDanica force-pushed the chore/6993-historical-comparison branch from e27615d to 760d8fe Compare February 24, 2026 15:45
socket-security bot commented Feb 24, 2026

👍 No dependency changes detected in pull request

@DDDDDanica DDDDDanica changed the base branch from fix/6993-announcement-comments to main February 24, 2026 16:05
@DDDDDanica DDDDDanica changed the base branch from main to fix/6993-announcement-comments February 25, 2026 02:14
@DDDDDanica DDDDDanica changed the base branch from fix/6993-announcement-comments to main February 25, 2026 18:33
@DDDDDanica DDDDDanica force-pushed the chore/6993-historical-comparison branch from 4c7eb1d to 03a1e2b Compare February 27, 2026 05:20
metamaskbotv2 bot commented Feb 27, 2026

✨ Files requiring CODEOWNER review ✨

🧪 @MetaMask/qa (1 files, +0 -0)
  • 📁 test/
    • 📁 e2e/
      • 📁 page-objects/
        • 📁 benchmark/
          • 📄 dapp-page-load-benchmark.ts

👨‍🔧 @MetaMask/wallet-integrations (3 files, +3 -3)
  • 📁 test/
    • 📁 e2e/
      • 📁 page-objects/
        • 📁 benchmark/
          • 📄 dapp-page-load-benchmark.ts
      • 📁 playwright/
        • 📁 benchmark/
          • 📄 dapp-page-load-benchmark.spec.ts +2 -2
          • 📄 README.md +1 -1

@DDDDDanica DDDDDanica marked this pull request as ready for review February 27, 2026 05:21
@DDDDDanica DDDDDanica requested a review from a team as a code owner February 27, 2026 05:21
@github-actions github-actions bot added size-XL and removed size-L labels Feb 27, 2026
@DDDDDanica DDDDDanica added needs-qa Label will automate into QA workspace and removed INVALID-PR-TEMPLATE PR's body doesn't match template labels Feb 27, 2026
@metamaskbot metamaskbot added the INVALID-PR-TEMPLATE PR's body doesn't match template label Feb 27, 2026
console.log(
`[DEBUG] Fetched commits: ${Object.keys(currentData).join(', ')}`,
);
console.log(`[DEBUG] Raw data: ${JSON.stringify(currentData, null, 2)}`);

Debug logging dumps entire historical data files

Low Severity

Several [DEBUG]-prefixed console.log statements dump the entire performance_data.json contents via JSON.stringify(currentData, null, 2) and log every individual metric collected. As the stats repo accumulates commits, these dumps will produce extremely large CI log output, making logs harder to read and potentially slowing down the PR comment generation step. The [DEBUG] prefix indicates these were temporary aids, not production logging.
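One possible way to address this, sketched under assumptions: log a bounded summary instead of the full file, and only build the message when debugging is explicitly enabled. The helper names, the `maxCommits` default, and the opt-in flag are hypothetical and not part of the PR.

```typescript
// Hypothetical helper: describe the history file without dumping its contents.
function summarizeHistory(data: Record<string, unknown>, maxCommits = 5): string {
  const commits = Object.keys(data);
  const shown = commits.slice(0, maxCommits).join(', ');
  const hidden = commits.length - maxCommits;
  const rest = hidden > 0 ? ` (+${hidden} more)` : '';
  return `${commits.length} commits: ${shown}${rest}`;
}

// Hypothetical gate: the message is only built when debugging is enabled,
// so large payloads are never stringified during normal CI runs.
function debugLog(
  enabled: boolean,
  buildMessage: () => string,
  sink: (line: string) => void,
): void {
  if (enabled) {
    sink(`[DEBUG] ${buildMessage()}`);
  }
}

// Example wiring (the BENCHMARK_DEBUG variable name is an assumption):
// debugLog(process.env.BENCHMARK_DEBUG === 'true',
//   () => summarizeHistory(currentData), console.log);
```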

Additional Locations (2)

@DDDDDanica DDDDDanica removed the INVALID-PR-TEMPLATE PR's body doesn't match template label Feb 27, 2026
@metamaskbot metamaskbot added the INVALID-PR-TEMPLATE PR's body doesn't match template label Feb 27, 2026
@cursor cursor bot left a comment

Cursor Bugbot has reviewed your changes and found 1 potential issue.

Bugbot Autofix is OFF. To automatically fix reported issues with cloud agents, have a team admin enable autofix in the Cursor dashboard.

if [[ ${file_count} -eq 0 ]]; then
echo "No benchmark files found in ${results_dir}, skipping." >&2
exit 0
fi

Subshell exit 0 doesn't stop main script execution

High Severity

The exit 0 inside assemble_performance_data (line 131) only exits the command substitution subshell at line 160 (COMMIT_DATA=$(assemble_performance_data)), not the main script. When no benchmark files are found, COMMIT_DATA is set to an empty string, the script continues past the clone/checkout steps, and then jq --argjson data "" at line 212 fails because an empty string is not valid JSON. This causes an unclean error after git operations instead of the intended graceful skip, and the clone directory is never cleaned up.

Additional Locations (2)


Labels

  • INVALID-PR-TEMPLATE: PR's body doesn't match template
  • needs-qa: Label will automate into QA workspace
  • size-XL
  • team-extension-platform: Extension Platform team

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants