chore(6993): Add Historical Comparison to UI Startup Metrics#40237
chore(6993): Add Historical Comparison to UI Startup Metrics#40237DDDDDanica wants to merge 6 commits intomainfrom
Conversation
|
CLA Signature Action: All authors have signed the CLA. You may need to manually re-run the blocking PR check if it doesn't pass in a few minutes. |
| @@ -17,9 +17,6 @@ on: | |||
| required: true | |||
| type: string | |||
| description: The run ID to get builds from | |||
| secrets: | |||
| EXTENSION_BENCHMARK_STATS_TOKEN: | |||
There was a problem hiding this comment.
page-load-benchmark.yml no longer touches the stats repo at all. The token is now used directly in the store-benchmark-stats job inside main.yml, which is a regular job (not a reusable workflow) and has direct access to secrets.* without any declaration needed.
c463bb8 to
ac1a803
Compare
ac1a803 to
1ef3159
Compare
bfa8232 to
fc7cbbb
Compare
fc7cbbb to
e27615d
Compare
e27615d to
760d8fe
Compare
|
No dependency changes detected. Learn more about Socket for GitHub. 👍 No dependency changes detected in pull request |
…sections Made-with: Cursor
c62b03c to
f787f91
Compare
4c7eb1d to
03a1e2b
Compare
✨ Files requiring CODEOWNER review ✨🧪 @MetaMask/qa (1 files, +0 -0)
👨🔧 @MetaMask/wallet-integrations (3 files, +3 -3)
|
| console.log( | ||
| `[DEBUG] Fetched commits: ${Object.keys(currentData).join(', ')}`, | ||
| ); | ||
| console.log(`[DEBUG] Raw data: ${JSON.stringify(currentData, null, 2)}`); |
There was a problem hiding this comment.
Debug logging dumps entire historical data files
Low Severity
Several [DEBUG]-prefixed console.log statements dump the entire performance_data.json contents via JSON.stringify(currentData, null, 2) and log every individual metric collected. As the stats repo accumulates commits, these dumps will produce extremely large CI log output, making logs harder to read and potentially slowing down the PR comment generation step. The [DEBUG] prefix indicates these were temporary aids, not production logging.
Additional Locations (2)
There was a problem hiding this comment.
Cursor Bugbot has reviewed your changes and found 1 potential issue.
Bugbot Autofix is OFF. To automatically fix reported issues with cloud agents, have a team admin enable autofix in the Cursor dashboard.
| if [[ ${file_count} -eq 0 ]]; then | ||
| echo "No benchmark files found in ${results_dir}, skipping." >&2 | ||
| exit 0 | ||
| fi |
There was a problem hiding this comment.
Subshell exit 0 doesn't stop main script execution
High Severity
The exit 0 inside assemble_performance_data (line 131) only exits the command substitution subshell at line 160 (COMMIT_DATA=$(assemble_performance_data)), not the main script. When no benchmark files are found, COMMIT_DATA is set to an empty string, the script continues past the clone/checkout steps, and then jq --argjson data "" at line 212 fails because an empty string is not valid JSON. This causes an unclean error after git operations instead of the intended graceful skip, and the clone directory is never cleaned up.


Description
The Performance Benchmarks section of the MM benchmark PR comment currently has no historical comparison and shows raw numbers with no baseline. The Dapp Page Load benchmark already has a working pattern for this: results are pushed to
MetaMask/extension_benchmark_statson every merge, and the PR comment fetches and displays a comparison against the historical mean.This PR applies the same historical data pattern to all Performance Benchmarks (Startup, Interaction, and User Journey).
Data collection pipeline:
benchmark-stats-commit.shwith aui-startupmode that assembles benchmark results from all platform matrix jobs into a structured JSON payload and commits it toextension_benchmark_statsstats/release-12.x/,stats/main/) so PRs compare against the correct release branch baselinestartupStandardHome,startupPowerUserHome) are stored for all browser/buildType combinations under apageLoadgroup; interaction (interactionUserActions) and user journey (userJourney*) presets store chrome-browserify onlystore-benchmark-statsjob inmain.yml, fixing a pre-existing issue whereEXTENSION_BENCHMARK_STATS_TOKENwas never wired to the jobs that needed itHistorical fetch module (
historical-comparison.ts):ui_startup_data.jsonfor the PR's target branch (GITHUB_BASE_REF)HistoricalMeanReferencemap (benchmark → metric → mean) ready to be consumed by the traffic light / regression detection layer in a follow-up PRChangelog
CHANGELOG entry: null
Related issues
Fixes: https://github.com/MetaMask/MetaMask-planning/issues/6993
Manual testing steps
Screenshots/Recordings
Before
After
Pre-merge author checklist
Pre-merge reviewer checklist
Note
Medium Risk
Modifies CI benchmark workflows and the script that commits results to an external stats repo, so misconfiguration could stop benchmarks from being recorded or break PR benchmark comments, but it does not affect production runtime behavior.
Overview
Adds a per-branch benchmark stats publishing pipeline to
MetaMask/extension_benchmark_stats. The updatedbenchmark-stats-commit.shnow supportsdapp-page-loadandperformancemodes, writes tostats/<sanitized-branch>/page_load_data.jsonandperformance_data.json, and aggregates multiple benchmark JSON artifacts (keeping page-load presets across all browser/build combos while storing other presets only for canonicalchrome-browserify).Updates GitHub Actions to rename the dapp page-load workflow/artifact/output file, upload all benchmark JSONs as short-lived artifacts, and introduce a
store-benchmark-statsjob onmain/release/*pushes that downloads artifacts and runs the commit script twice (performance + dapp page-load). Also adds a new TS module + tests for fetching/aggregating historical performance data with release-branch fallback logic, and centralizes benchmark result types inshared/constants/benchmarks(updating imports accordingly).Written by Cursor Bugbot for commit 89c199c. This will update automatically on new commits. Configure here.