Add CI performance budgets and canary metrics checks #123

shayancoin wants to merge 3 commits into main from
Conversation
> **Note**: CodeRabbit has detected other AI code review bot(s) in this pull request and will avoid duplicating their findings in the review comments. This may lead to a less comprehensive review.

**Walkthrough**

Adds CI jobs for performance budgeting and canary metrics, a Playwright-based frontend perf runner with a new `.perf-budget.yml`, a Python canary metrics checker with fixture fallback, docs for release checks, and related packaging and ignore updates.

**Changes**
**Sequence Diagram(s)**

```mermaid
sequenceDiagram
    actor CI
    participant PerfJob as perf-budgets Job
    participant Stack as Stack (API & Frontend)
    participant Runner as run-perf-budget.ts
    participant Artifacts as Artifact Storage
    CI->>PerfJob: Trigger (after backend/frontend tests)
    PerfJob->>Stack: Start services
    PerfJob->>Stack: Seed backend data
    PerfJob->>PerfJob: Wait for readiness
    note right of Runner#lightblue: Load .perf-budget.yml
    loop scenarios
        PerfJob->>Runner: Execute scenario
        activate Runner
        Runner->>Stack: Navigate / perform steps
        note right of Runner#lightgreen: Apply slow-4g throttling\nInject PerformanceObserver
        Runner->>Runner: Collect metrics & aggregate
        deactivate Runner
    end
    alt budgets pass
        Runner->>Artifacts: Write JUnit + JSON (passed)
        Artifacts-->>CI: Success
    else violation
        Runner->>Artifacts: Write JUnit + JSON (failed)
        Artifacts-->>CI: Failure (non-zero)
    end
    PerfJob->>Stack: Shutdown
```

```mermaid
sequenceDiagram
    actor CI
    participant CanaryJob as canary-metrics Job
    participant Script as check_canary_metrics.py
    participant Prom as Prometheus
    participant Tempo as Tempo
    participant Artifacts as Artifact Storage
    CI->>CanaryJob: Trigger (after perf-budgets)
    CanaryJob->>Script: Invoke (env vars)
    alt Prom/Tempo reachable
        Script->>Prom: Query P95 latency & error rate
        Script->>Tempo: Query trace latency P95
        Prom-->>Script: metrics
        Tempo-->>Script: trace metrics
    else fallback
        Script->>Script: Load fixture JSON
    end
    Script->>Script: Evaluate thresholds & regressions
    alt checks pass
        Script->>Artifacts: Write summary (passed=true)
        Artifacts-->>CI: Success
    else any failure
        Script->>Artifacts: Write summary (passed=false)
        Artifacts-->>CI: Failure (non-zero)
    end
```
**Estimated code review effort**: 🎯 4 (Complex) | ⏱️ ~50 minutes

**Possibly related PRs**
**Pre-merge checks and finishing touches**

❌ Failed checks (2 warnings)
✅ Passed checks (1 passed)
💡 Codex Review
Here are some automated review suggestions for this pull request.
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you:
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting
```python
def _collect_metrics_from_services() -> Optional[CanaryMetrics]:
    prom_url = os.environ.get("PROMETHEUS_URL")
    latency_query = os.environ.get("PROMETHEUS_LATENCY_QUERY")
    error_query = os.environ.get("PROMETHEUS_ERROR_QUERY")
    previous_latency_query = os.environ.get("PROMETHEUS_PREVIOUS_LATENCY_QUERY")
    previous_error_query = os.environ.get("PROMETHEUS_PREVIOUS_ERROR_QUERY")

    if not (prom_url and latency_query and error_query):
        return None

    latency = _query_prometheus(prom_url, latency_query)
    error_rate = _query_prometheus(prom_url, error_query)
    previous_latency = (
        _query_prometheus(prom_url, previous_latency_query)
        if previous_latency_query
        else None
    )
    previous_error = (
        _query_prometheus(prom_url, previous_error_query)
        if previous_error_query
        else None
    )

    tempo_url = os.environ.get("TEMPO_URL")
    trace_latency_query = os.environ.get("TEMPO_TRACE_QUERY")
    trace_latency = None
    if tempo_url and trace_latency_query:
        try:
            payload = _http_get_json(
                f"{tempo_url.rstrip('/')}/api/search", {"q": trace_latency_query}
            )
            trace_latency = _extract_prom_value(payload)
        except urllib.error.URLError as exc:
            print(f"Failed to query Tempo: {exc}", file=sys.stderr)

    build_tag = os.environ.get("BUILD_TAG") or os.environ.get("GITHUB_SHA")
    previous_build = os.environ.get("PREVIOUS_BUILD_TAG")

    return CanaryMetrics(
        latency_p95_ms=float(latency) if latency is not None else 0.0,
        error_rate=float(error_rate) if error_rate is not None else 0.0,
        trace_latency_p95_ms=trace_latency,
```
**Fail canary check when Prometheus data is unavailable**

When a Prometheus/Tempo query fails or returns no samples, `_query_prometheus` returns `None`, yet `_collect_metrics_from_services` converts those `None` values to `0.0` (`latency_p95_ms` and `error_rate`). `_evaluate` treats those zeros as being within budget, so network outages or misconfigured queries will silently mark the canary job as passing without validating real metrics. The check should raise an error or fall back to the fixture when no data is retrieved, instead of assuming zero latency/error.
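A minimal sketch of the fail-loudly behavior this comment asks for (the helper name `require_sample` is hypothetical, not the PR's actual code):

```python
from typing import Optional


def require_sample(name: str, value: Optional[float]) -> float:
    """Treat a missing metric sample as a hard failure instead of assuming 0.0."""
    if value is None:
        raise RuntimeError(f"canary check aborted: no data returned for {name}")
    return value


# With a real sample the value passes through unchanged; with None the check
# aborts instead of silently reporting zero latency/error.
latency = require_sample("latency_p95_ms", 812.4)
```

The same guard could instead trigger the fixture fallback the script already supports.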
`scripts/ci/check_canary_metrics.py` (Outdated)
```python
print(
    "Canary metrics within thresholds. "
    f"latency_p95_ms={metrics.latency_p95_ms:.2f}, "
    f"error_rate={metrics.error_rate:.4f}"
)
if metrics.trace_latency_p95_ms is not None:
    print(f"Tempo trace P95 latency: {metrics.trace_latency_p95_ms:.2f}ms")
if metrics.previous_latency_p95_ms is not None:
    print(
        "Compared to previous build: "
        f"latency_p95_ms={metrics.previous_latency_p95_ms:.2f}, "
        f"error_rate={metrics.previous_error_rate:.4f}"
```
**Guard printing previous error rate when it is absent**

The success log formats `metrics.previous_error_rate` with `:.4f` whenever `previous_latency_p95_ms` is set, but it does not first verify that the error rate is also present. If only `PROMETHEUS_PREVIOUS_LATENCY_QUERY` is configured (common when tracking latency regressions but not errors), `previous_error_rate` remains `None` and this formatting raises `TypeError`, causing the script to exit after the metrics check already passed. Check both values before formatting, or default the missing one.
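One way to sketch the guard (the helper `format_previous` is illustrative, not the script's actual code): build the summary only from the fields that are present.

```python
from typing import Optional


def format_previous(latency_ms: Optional[float], error_rate: Optional[float]) -> str:
    """Format whichever previous-build metrics exist, skipping absent ones."""
    parts = []
    if latency_ms is not None:
        parts.append(f"latency_p95_ms={latency_ms:.2f}")
    if error_rate is not None:
        parts.append(f"error_rate={error_rate:.4f}")
    return ", ".join(parts) if parts else "no previous-build data"


# Latency-only configuration no longer raises TypeError:
print(format_previous(812.4, None))  # → latency_p95_ms=812.40
```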
Actionable comments posted: 7

**📜 Review details**

- Configuration used: CodeRabbit UI
- Review profile: ASSERTIVE
- Plan: Pro
⛔ Files ignored due to path filters (1)

- `frontend/package-lock.json` is excluded by `!**/package-lock.json`
📒 Files selected for processing (10)

- `.github/workflows/ci.yml` (1 hunk)
- `.gitignore` (1 hunk)
- `.perf-budget.yml` (1 hunk)
- `docs/release-checklist.md` (1 hunk)
- `frontend/package.json` (3 hunks)
- `frontend/tests/perf/run-perf-budget.ts` (1 hunk)
- `mkdocs.yml` (1 hunk)
- `perf-budget.yml` (0 hunks)
- `scripts/ci/check_canary_metrics.py` (1 hunk)
- `tests/perf/canary-metrics.fixture.json` (1 hunk)
💤 Files with no reviewable changes (1)
- `perf-budget.yml`
🧰 Additional context used

🧬 Code graph analysis (3)

`.github/workflows/ci.yml` (1)
- `tests/perf/k6-quote-cnc.js` (1): `data` (85-128)

`scripts/ci/check_canary_metrics.py` (1)
- `backend/api/main.py` (1): `metrics` (105-107)

`frontend/tests/perf/run-perf-budget.ts` (2)
- `tests/perf/k6-quote-cnc.js` (2): `file` (107-107), `data` (85-128)
- `backend/api/main.py` (1): `metrics` (105-107)
🪛 actionlint (1.7.8)
.github/workflows/ci.yml
146-146: could not parse as YAML: could not find expected ':'
(syntax-check)
🪛 LanguageTool
docs/release-checklist.md
[grammar] ~1-~1: Use correct spacing
Context: # Release Checklist This checklist connects the CI performan...
(QB_NEW_EN_OTHER_ERROR_IDS_5)
[grammar] ~4-~4: There might be a mistake here.
Context: ...ion releases. It is intended for release managers and on-call engineers who need ...
(QB_NEW_EN)
[grammar] ~5-~5: There might be a mistake here.
Context: ...ble flow that ties Grafana alerts, Tempo traces, and CI preview environments toge...
(QB_NEW_EN)
[grammar] ~6-~6: Use correct spacing
Context: ...s, and CI preview environments together. ## 1. Validate CI performance budgets 1. C...
(QB_NEW_EN_OTHER_ERROR_IDS_5)
[grammar] ~8-~8: There might be a mistake here.
Context: .... ## 1. Validate CI performance budgets 1. Confirm the Performance Budgets job ...
(QB_NEW_EN_OTHER)
[grammar] ~12-~12: Use correct spacing
Context: ...son` for the concrete values captured by Playwright (P75 configurator load, P90 n...
(QB_NEW_EN_OTHER_ERROR_IDS_5)
[grammar] ~13-~13: There might be a mistake here.
Context: ...load, P90 navigation duration, P95 LCP). 2. If any threshold failed, investigate bef...
(QB_NEW_EN)
[grammar] ~14-~14: There might be a mistake here.
Context: ... before promoting the release candidate: * Re-run the job against the preview envir...
(QB_NEW_EN)
[grammar] ~15-~15: Use correct spacing
Context: ...ment to determine whether the regression is deterministic or environment-specific...
(QB_NEW_EN_OTHER_ERROR_IDS_5)
[grammar] ~16-~16: There might be a mistake here.
Context: ...s deterministic or environment-specific. * Capture a Grafana dashboard snapshot sho...
(QB_NEW_EN)
[grammar] ~17-~17: Use correct spacing
Context: ...g the relevant Web Vitals panel and link it in the incident tracker. ## 2. Check...
(QB_NEW_EN_OTHER_ERROR_IDS_5)
[grammar] ~18-~18: Use correct spacing
Context: ...nd link it in the incident tracker. ## 2. Check canary latency and error regres...
(QB_NEW_EN_OTHER_ERROR_IDS_5)
[grammar] ~20-~20: There might be a mistake here.
Context: ...eck canary latency and error regressions 1. Verify the Canary Metrics job comple...
(QB_NEW_EN_OTHER)
[grammar] ~23-~23: Use correct spacing
Context: ...json` to compare the current build's P95 latency and error rate with the previous...
(QB_NEW_EN_OTHER_ERROR_IDS_5)
[grammar] ~24-~24: There might be a mistake here.
Context: ...evious build and the configured budgets. 3. If the job failed: * Acknowledge or s...
(QB_NEW_EN)
[grammar] ~25-~25: Use colons correctly
Context: ...d the configured budgets. 3. If the job failed: * Acknowledge or silence the corresponding Grafana al...
(QB_NEW_EN_OTHER_ERROR_IDS_30)
[grammar] ~26-~26: Use correct spacing
Context: ...ding Grafana alert for the service-level objective (SLO). * Escalate to the on...
(QB_NEW_EN_OTHER_ERROR_IDS_5)
[grammar] ~27-~27: There might be a mistake here.
Context: ... the service-level objective (SLO). * Escalate to the on-call engineer through...
(QB_NEW_EN)
[grammar] ~28-~28: Use correct spacing
Context: ...he paging integration configured for the Grafana alert rule (OpsGenie, PagerDuty,...
(QB_NEW_EN_OTHER_ERROR_IDS_5)
[grammar] ~29-~29: There might be a mistake here.
Context: ... alert rule (OpsGenie, PagerDuty, etc.). * Use the Tempo trace search link emitted ...
(QB_NEW_EN)
[grammar] ~30-~30: Use correct spacing
Context: ... the job (or Grafana Explore) to confirm whether elevated latency correlates with...
(QB_NEW_EN_OTHER_ERROR_IDS_5)
[grammar] ~31-~31: Use correct spacing
Context: ... latency correlates with specific spans. ## 3. Correlate CI results with Grafana das...
(QB_NEW_EN_OTHER_ERROR_IDS_5)
[grammar] ~33-~33: There might be a mistake here.
Context: ...elate CI results with Grafana dashboards 1. Publish the artifacts from the CI run (`...
(QB_NEW_EN_OTHER)
[grammar] ~35-~35: Use correct spacing
Context: ...e CI run (perf-budget-summary.json and canary-metrics-summary.json) to the sh...
(QB_NEW_EN_OTHER_ERROR_IDS_5)
[grammar] ~36-~36: There might be a mistake here.
Context: ...to the shared release channel or ticket. 2. Update the Grafana dashboard annotations...
(QB_NEW_EN)
[grammar] ~37-~37: Use correct spacing
Context: ...the build ID and a link to the CI run so that on-call engineers can quickly corre...
(QB_NEW_EN_OTHER_ERROR_IDS_5)
[grammar] ~38-~38: There might be a mistake here.
Context: ...elate spikes with the release candidate. 3. Ensure the Grafana dashboard panels for ...
(QB_NEW_EN)
[grammar] ~39-~39: Use correct spacing
Context: ...d Budgets" and "Runtime Latency" use the Pushgateway/JUnit exports for side-by-si...
(QB_NEW_EN_OTHER_ERROR_IDS_5)
[grammar] ~40-~40: Use correct spacing
Context: ...it exports for side-by-side comparisons. ## 4. Preview gating and release sign-off ...
(QB_NEW_EN_OTHER_ERROR_IDS_5)
[grammar] ~42-~42: Use correct spacing
Context: ...# 4. Preview gating and release sign-off 1. Block the promotion of the preview envir...
(QB_NEW_EN_OTHER_ERROR_IDS_5)
[grammar] ~44-~44: Use correct spacing
Context: ...ronment to staging/production until both performance-related jobs have succeeded ...
(QB_NEW_EN_OTHER_ERROR_IDS_5)
[grammar] ~45-~45: There might be a mistake here.
Context: ...succeeded for the release branch commit. 2. Confirm no active Grafana alerts remain ...
(QB_NEW_EN)
[grammar] ~46-~46: Use correct spacing
Context: ...for the release window. If alerts exist, document mitigations and obtain sign-off...
(QB_NEW_EN_OTHER_ERROR_IDS_5)
[grammar] ~47-~47: There might be a mistake here.
Context: ...he incident commander before proceeding. 3. Log the final decision in the release re...
(QB_NEW_EN)
[grammar] ~48-~48: There might be a mistake here.
Context: ...decision in the release record, linking to: * The CI run with passing performance jobs. ...
(QB_NEW_EN_OTHER)
[grammar] ~49-~49: There might be a mistake here.
Context: ...he CI run with passing performance jobs. * Grafana alert history or dashboards show...
(QB_NEW_EN)
[grammar] ~51-~51: Use correct spacing
Context: ...at justify the decision when applicable. Following this checklist guarantees that...
(QB_NEW_EN_OTHER_ERROR_IDS_5)
🪛 Ruff (0.14.0)
scripts/ci/check_canary_metrics.py
1-1: Shebang is present but file is not executable
(EXE001)
13-13: typing.Dict is deprecated, use dict instead
(UP035)
32-32: Dynamically typed expressions (typing.Any) are disallowed in value
(ANN401)
41-41: Audit URL open for permitted schemes. Allowing use of file: or custom schemes is often unexpected.
(S310)
74-74: Unnecessary mode argument
Remove mode argument
(UP015)
152-152: Boolean-typed positional argument in function definition
(FBT001)
🪛 YAMLlint (1.37.1)
.github/workflows/ci.yml
[error] 147-147: syntax error: could not find expected ':'
(syntax)
🔇 Additional comments (2)
.perf-budget.yml (1)
**1-66: Perf budget config looks consistent with the runner**

Defaults, throttling, waits, steps, and metric IDs match the runner. Thresholds and aggregations are valid.
Optional:
- If you want per-scenario throttling overrides, extend `ScenarioConfig` to include `throttling` and pass it to `applyThrottling`.
- Consider adding FCP to homepage-to-configurator for additional coverage.
docs/release-checklist.md (1)
**1-54: Clear, actionable checklist aligned with CI artifacts**

Matches job names and artifact paths. No blockers.
```yaml
      - name: Set up Node.js
        uses: actions/setup-node@0a44ba78451273a1ed8ac2fee4e347c72dfd377f
        with:
          node-version: '20'
          cache: 'npm'
          cache-dependency-path: ./frontend/package-lock.json

      - name: Install dependencies
        working-directory: ./frontend
        run: npm ci

      - name: Install Playwright browsers
        working-directory: ./frontend
        run: npx playwright install --with-deps chromium

      - name: Start stack
        run: docker compose --env-file .env.development -f docker-compose.dev.yml up -d --build
```
**Install Python before using it in perf-budgets job**

The job invokes `python` to seed the backend but never sets up Python. On hosted runners, Python may not be available as `python` (often `python3` only), causing a hard failure.

Apply this diff to add setup-python (mirrors earlier jobs):

```diff
       - name: Set up Node.js
         uses: actions/setup-node@0a44ba78451273a1ed8ac2fee4e347c72dfd377f
         with:
           node-version: '20'
           cache: 'npm'
           cache-dependency-path: ./frontend/package-lock.json
+
+      - name: Set up Python
+        uses: actions/setup-python@e797f83bcb11b83ae66e0230d6156d7c80228e7c
+        with:
+          python-version: '3.12'
```

🤖 Prompt for AI Agents
.github/workflows/ci.yml around lines 115 to 131: the perf-budgets job runs
Python to seed the backend but never installs or configures Python, which can
fail on hosted runners where only python3 is available; add a step before any
python invocation that uses actions/setup-python (e.g. actions/setup-python@v4)
with python-version: '3.x' (or a specific 3.11/3.10 as used elsewhere) so the
runner exposes the python binary, and mirror any cache/config from earlier jobs
if needed.
```yaml
      - name: Wait for API
        run: |
          for i in {1..60}; do curl -sf http://localhost:8000/healthcheck && break || sleep 2; done

      - name: Wait for Frontend
        run: |
          for i in {1..60}; do curl -sf http://localhost:3000/models/manifest.json && break || sleep 2; done
```
**Make readiness loops fail if services never become ready**

The current for-loops can exit successfully after the final sleep even if curl never succeeds, letting the job proceed in a bad state.

Apply this diff to both API and Frontend waits:

```diff
-      - name: Wait for API
-        run: |
-          for i in {1..60}; do curl -sf http://localhost:8000/healthcheck && break || sleep 2; done
+      - name: Wait for API
+        run: |
+          for i in {1..60}; do
+            if curl -sf http://localhost:8000/healthcheck; then
+              echo "API ready"
+              exit 0
+            fi
+            sleep 2
+          done
+          echo "API failed to become ready in time" >&2
+          exit 1
@@
-      - name: Wait for Frontend
-        run: |
-          for i in {1..60}; do curl -sf http://localhost:3000/models/manifest.json && break || sleep 2; done
+      - name: Wait for Frontend
+        run: |
+          for i in {1..60}; do
+            if curl -sf http://localhost:3000/models/manifest.json; then
+              echo "Frontend ready"
+              exit 0
+            fi
+            sleep 2
+          done
+          echo "Frontend failed to become ready in time" >&2
+          exit 1
```

🤖 Prompt for AI Agents
In .github/workflows/ci.yml around lines 133 to 140, the readiness for-loops
currently can complete quietly even if curl never succeeded; change each block
so the workflow fails when the service never becomes ready by adding an explicit
failure after the loop (e.g., run the curl check once more and if it still fails
echo an error and exit 1, or set a flag inside the loop and after the loop test
it and exit 1 on failure), applying this to both the API and Frontend wait
steps.
```yaml
          python - <<'PY'
import json
import os
import urllib.error
import urllib.request

BASE_URL = os.environ.get("BASE_URL", "http://localhost:8000")

def post(path: str, payload: dict) -> dict:
    req = urllib.request.Request(
        f"{BASE_URL}{path}",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    try:
        with urllib.request.urlopen(req, timeout=10) as resp:
            return json.loads(resp.read().decode("utf-8"))
    except urllib.error.HTTPError as exc:
        detail = exc.read().decode("utf-8", "ignore")
        raise SystemExit(f"Seed request failed ({exc.code}): {detail}")

material = post(
    "/api/materials/",
    {"name": "Walnut", "texture_url": None, "cost_per_sq_ft": 12.5},
)
material_id = material.get("id")
if not material_id:
    raise SystemExit("Material creation failed; missing id")

post(
    "/api/modules/",
    {
        "name": "Base600",
        "width": 600.0,
        "height": 720.0,
        "depth": 580.0,
        "base_price": 100.0,
        "material_id": material_id,
    },
)
PY
```
**Fix YAML block: indent heredoc body consistently (current YAML parse error)**

Lines inside the `run: |` block after `python - <<'PY'` are not indented, so YAML treats them as new keys. Indent the entire heredoc body consistently to satisfy YAML while preserving the script.

Apply this diff:

```diff
       - name: Seed backend for perf
         env:
           BASE_URL: http://localhost:8000
         run: |
-          python - <<'PY'
-import json
-import os
-import urllib.error
-import urllib.request
-
-BASE_URL = os.environ.get("BASE_URL", "http://localhost:8000")
-
-def post(path: str, payload: dict) -> dict:
-    req = urllib.request.Request(
-        f"{BASE_URL}{path}",
-        data=json.dumps(payload).encode("utf-8"),
-        headers={"Content-Type": "application/json"},
-    )
-    try:
-        with urllib.request.urlopen(req, timeout=10) as resp:
-            return json.loads(resp.read().decode("utf-8"))
-    except urllib.error.HTTPError as exc:
-        detail = exc.read().decode("utf-8", "ignore")
-        raise SystemExit(f"Seed request failed ({exc.code}): {detail}")
-
-material = post(
-    "/api/materials/",
-    {"name": "Walnut", "texture_url": None, "cost_per_sq_ft": 12.5},
-)
-material_id = material.get("id")
-if not material_id:
-    raise SystemExit("Material creation failed; missing id")
-
-post(
-    "/api/modules/",
-    {
-        "name": "Base600",
-        "width": 600.0,
-        "height": 720.0,
-        "depth": 580.0,
-        "base_price": 100.0,
-        "material_id": material_id,
-    },
-)
-PY
+          python - <<'PY'
+          import json
+          import os
+          import urllib.error
+          import urllib.request
+
+          BASE_URL = os.environ.get("BASE_URL", "http://localhost:8000")
+
+          def post(path: str, payload: dict) -> dict:
+              req = urllib.request.Request(
+                  f"{BASE_URL}{path}",
+                  data=json.dumps(payload).encode("utf-8"),
+                  headers={"Content-Type": "application/json"},
+              )
+              try:
+                  with urllib.request.urlopen(req, timeout=10) as resp:
+                      return json.loads(resp.read().decode("utf-8"))
+              except urllib.error.HTTPError as exc:
+                  detail = exc.read().decode("utf-8", "ignore")
+                  raise SystemExit(f"Seed request failed ({exc.code}): {detail}")
+
+          material = post(
+              "/api/materials/",
+              {"name": "Walnut", "texture_url": None, "cost_per_sq_ft": 12.5},
+          )
+          material_id = material.get("id")
+          if not material_id:
+              raise SystemExit("Material creation failed; missing id")
+
+          post(
+              "/api/modules/",
+              {
+                  "name": "Base600",
+                  "width": 600.0,
+                  "height": 720.0,
+                  "depth": 580.0,
+                  "base_price": 100.0,
+                  "material_id": material_id,
+              },
+          )
+          PY
```

Note: YAML will strip the common indent from the block scalar, so the shell sees the heredoc exactly as intended. The current error matches actionlint/YAMLlint output.
📝 Committable suggestion

‼️ IMPORTANT: Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
```yaml
      - name: Seed backend for perf
        env:
          BASE_URL: http://localhost:8000
        run: |
          python - <<'PY'
          import json
          import os
          import urllib.error
          import urllib.request

          BASE_URL = os.environ.get("BASE_URL", "http://localhost:8000")

          def post(path: str, payload: dict) -> dict:
              req = urllib.request.Request(
                  f"{BASE_URL}{path}",
                  data=json.dumps(payload).encode("utf-8"),
                  headers={"Content-Type": "application/json"},
              )
              try:
                  with urllib.request.urlopen(req, timeout=10) as resp:
                      return json.loads(resp.read().decode("utf-8"))
              except urllib.error.HTTPError as exc:
                  detail = exc.read().decode("utf-8", "ignore")
                  raise SystemExit(f"Seed request failed ({exc.code}): {detail}")

          material = post(
              "/api/materials/",
              {"name": "Walnut", "texture_url": None, "cost_per_sq_ft": 12.5},
          )
          material_id = material.get("id")
          if not material_id:
              raise SystemExit("Material creation failed; missing id")

          post(
              "/api/modules/",
              {
                  "name": "Base600",
                  "width": 600.0,
                  "height": 720.0,
                  "depth": 580.0,
                  "base_price": 100.0,
                  "material_id": material_id,
              },
          )
          PY
```
🧰 Tools

🪛 actionlint (1.7.8)

146-146: could not parse as YAML: could not find expected ':' (syntax-check)

🪛 YAMLlint (1.37.1)

[error] 147-147: syntax error: could not find expected ':' (syntax)
🤖 Prompt for AI Agents
In .github/workflows/ci.yml around lines 145 to 185 the python heredoc started
with "python - <<'PY'" is not indented, causing YAML to treat the script lines
as top-level keys; indent the entire heredoc body (every line between the
"python - <<'PY'" line and the closing "PY") to the same indentation level as
the run block content (e.g., two extra spaces) so YAML parses the block scalar
correctly, and keep the closing "PY" delimiter indented to that same level; do
not change the script content otherwise (YAML will strip common indentation when
passing the heredoc to the shell).
```yaml
  canary-metrics:
    runs-on: ubuntu-latest
    needs:
      - perf-budgets
    steps:
      - uses: actions/checkout@08c6903cd8c0fde910a37f88322edcfb5dd907a8

      - name: Evaluate canary metrics
        env:
          CANARY_METRICS_FIXTURE: tests/perf/canary-metrics.fixture.json
          P95_THRESHOLD_MS: '3000'
          ERROR_RATE_THRESHOLD: '0.02'
          REGRESSION_TOLERANCE_PCT: '0.1'
        run: python scripts/ci/check_canary_metrics.py
```
**Install Python before running canary metrics script**

This job runs `python scripts/ci/check_canary_metrics.py` without provisioning Python. Add setup-python, or invoke `python3` only if it is guaranteed to exist.

Apply this diff:

```diff
   canary-metrics:
     runs-on: ubuntu-latest
     needs:
       - perf-budgets
     steps:
       - uses: actions/checkout@08c6903cd8c0fde910a37f88322edcfb5dd907a8
+
+      - name: Set up Python
+        uses: actions/setup-python@e797f83bcb11b83ae66e0230d6156d7c80228e7c
+        with:
+          python-version: '3.12'
@@
-      - name: Evaluate canary metrics
+      - name: Evaluate canary metrics
         env:
           CANARY_METRICS_FIXTURE: tests/perf/canary-metrics.fixture.json
           P95_THRESHOLD_MS: '3000'
           ERROR_RATE_THRESHOLD: '0.02'
           REGRESSION_TOLERANCE_PCT: '0.1'
         run: python scripts/ci/check_canary_metrics.py
```

📝 Committable suggestion

‼️ IMPORTANT: Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
```yaml
  canary-metrics:
    runs-on: ubuntu-latest
    needs:
      - perf-budgets
    steps:
      - uses: actions/checkout@08c6903cd8c0fde910a37f88322edcfb5dd907a8

      - name: Set up Python
        uses: actions/setup-python@e797f83bcb11b83ae66e0230d6156d7c80228e7c
        with:
          python-version: '3.12'

      - name: Evaluate canary metrics
        env:
          CANARY_METRICS_FIXTURE: tests/perf/canary-metrics.fixture.json
          P95_THRESHOLD_MS: '3000'
          ERROR_RATE_THRESHOLD: '0.02'
          REGRESSION_TOLERANCE_PCT: '0.1'
        run: python scripts/ci/check_canary_metrics.py
```
🤖 Prompt for AI Agents
In .github/workflows/ci.yml around lines 209 to 223, the canary-metrics job
invokes python scripts/ci/check_canary_metrics.py without ensuring Python is
available; add a step before the run that sets up Python (e.g.
actions/setup-python@v4 with a python-version like '3.x' or a specific '3.11')
so the runner has a known python binary on PATH, then keep the existing run step
(or change the run to invoke python3 explicitly if you prefer not to add the
setup action).
From `frontend/tests/perf/run-perf-budget.ts`:

```typescript
const rootDir = path.resolve(__dirname, '../../..');
const configPath = path.join(rootDir, '.perf-budget.yml');
const resultsDir = path.join(rootDir, 'perf-results');

function readConfig(): PerfBudgetConfig {
  const file = fs.readFileSync(configPath, 'utf-8');
  const parsed = yaml.parse(file) as PerfBudgetConfig;
  if (!parsed || !Array.isArray(parsed.scenarios)) {
    throw new Error('Invalid perf budget configuration: missing scenarios');
  }
  return parsed;
}

async function applyThrottling(context: BrowserContext, page: Page, throttling?: ThrottlingConfig) {
  if (!throttling) {
    return;
  }
  const session = await context.newCDPSession(page);
  if (typeof throttling.cpu_slowdown_multiplier === 'number' && throttling.cpu_slowdown_multiplier > 0) {
    await session.send('Emulation.setCPUThrottlingRate', { rate: throttling.cpu_slowdown_multiplier });
  }
  const download = throttling.download_throughput_kbps;
  const upload = throttling.upload_throughput_kbps;
  const latency = throttling.request_latency_ms;
  if (download || upload || latency) {
    await session.send('Network.enable');
    await session.send('Network.emulateNetworkConditions', {
      offline: false,
      latency: latency ?? 0,
      downloadThroughput: download ? (download * 1024) / 8 : -1,
      uploadThroughput: upload ? (upload * 1024) / 8 : -1,
    });
  }
}

async function setupPerformanceObservers(page: Page) {
  await page.addInitScript(() => {
    const globalAny = globalThis as any;
    globalAny.__perfBudget = {
      lcpEntries: [] as PerformanceEntry[],
      cls: 0,
      tbt: 0,
    };

    if (globalAny.PerformanceObserver) {
      try {
        const lcpObserver = new PerformanceObserver((entryList) => {
          const entries = entryList.getEntries();
          const store = globalAny.__perfBudget;
          store.lcpEntries.push(...entries);
        });
        lcpObserver.observe({ type: 'largest-contentful-paint', buffered: true });
      } catch (error) {
        console.warn('LCP observer failed', error);
      }

      try {
        const clsObserver = new PerformanceObserver((entryList) => {
          const store = globalAny.__perfBudget;
          for (const entry of entryList.getEntries() as any[]) {
            if (!entry.hadRecentInput) {
              store.cls += entry.value;
            }
          }
        });
        clsObserver.observe({ type: 'layout-shift', buffered: true });
      } catch (error) {
        console.warn('CLS observer failed', error);
      }

      try {
        const longTaskObserver = new PerformanceObserver((entryList) => {
          const store = globalAny.__perfBudget;
          for (const entry of entryList.getEntries()) {
            const blockingTime = entry.duration - 50;
            if (blockingTime > 0) {
              store.tbt += blockingTime;
            }
          }
        });
        longTaskObserver.observe({ type: 'longtask' });
      } catch (error) {
        console.warn('Long task observer failed', error);
      }
    }
  });
}

async function performWait(page: Page, wait: WaitStep) {
  if (wait.type === 'selector') {
    await page.waitForSelector(wait.selector, { timeout: wait.timeout_ms ?? 30000 });
  } else if (wait.type === 'networkidle') {
    await page.waitForLoadState('networkidle', { timeout: wait.timeout_ms ?? 60000 });
    if (wait.idle_ms && wait.idle_ms > 0) {
      await page.waitForTimeout(wait.idle_ms);
    }
  }
}

async function performScenarioStep(page: Page, step: ScenarioStep) {
  if (step.type === 'goto') {
    await page.goto(step.url, { waitUntil: step.wait_until ?? 'load', timeout: 60000 });
  } else if (step.type === 'wait_for_selector') {
    await page.waitForSelector(step.selector, { timeout: step.timeout_ms ?? 30000 });
  } else if (step.type === 'wait_for_timeout') {
    await page.waitForTimeout(step.timeout_ms);
  }
}

async function executeScenarioRun(
  scenario: ScenarioConfig,
  browser: Browser,
  defaults: PerfBudgetConfig['defaults'],
): Promise<RunMetrics> {
  const context = await browser.newContext();
  const page = await context.newPage();
  await setupPerformanceObservers(page);
```
| await applyThrottling(context, page, defaults?.throttling); | ||
|
|
||
| if (scenario.steps && scenario.steps.length > 0) { | ||
| for (const step of scenario.steps) { | ||
| await performScenarioStep(page, step); | ||
| } | ||
| } else if (scenario.url) { | ||
| await page.goto(scenario.url, { waitUntil: 'load', timeout: 60000 }); | ||
| if (scenario.waits) { | ||
| for (const wait of scenario.waits) { | ||
| await performWait(page, wait); | ||
| } | ||
| } | ||
| } else { | ||
| throw new Error(`Scenario ${scenario.id} missing url or steps`); | ||
| } | ||
|
|
||
| // Give observers time to flush buffered events | ||
| await page.waitForTimeout(1000); | ||
|
|
||
| const metrics = await page.evaluate(() => { | ||
| const navEntries = performance.getEntriesByType('navigation'); | ||
| const paints = performance.getEntriesByName('first-contentful-paint'); | ||
| const globalAny = globalThis as any; | ||
| const store = globalAny.__perfBudget || { lcpEntries: [], cls: 0, tbt: 0 }; | ||
| const lcpEntries = Array.isArray(store.lcpEntries) ? store.lcpEntries : []; | ||
| const lastNav = navEntries[navEntries.length - 1] as PerformanceNavigationTiming | undefined; | ||
| const fcp = paints.length > 0 ? paints[paints.length - 1].startTime : null; | ||
| const lcpEntry = lcpEntries.length > 0 ? lcpEntries[lcpEntries.length - 1] : null; | ||
| const lcp = lcpEntry ? (lcpEntry as any).startTime ?? (lcpEntry as any).renderTime ?? (lcpEntry as any).loadTime ?? null : null; | ||
| const cls = typeof store.cls === 'number' ? store.cls : null; | ||
| const tbt = typeof store.tbt === 'number' ? store.tbt : null; | ||
|
|
||
| return { | ||
| 'navigation-duration': lastNav ? lastNav.duration : null, | ||
| 'first-contentful-paint': fcp, | ||
| 'largest-contentful-paint': lcp, | ||
| 'total-blocking-time': tbt, | ||
| 'cumulative-layout-shift': cls, | ||
| }; | ||
| }); | ||
|
|
||
| await context.close(); | ||
| return metrics; | ||
| } | ||
|
|
||
| function collectScenarioMetrics(metrics: RunMetrics[]): ScenarioMetrics { | ||
| const result: ScenarioMetrics = {}; | ||
| for (const run of metrics) { | ||
| for (const [metricId, value] of Object.entries(run)) { | ||
| if (typeof value === 'number' && Number.isFinite(value)) { | ||
| if (!result[metricId]) { | ||
| result[metricId] = []; | ||
| } | ||
| result[metricId].push(value); | ||
| } | ||
| } | ||
| } | ||
| return result; | ||
| } | ||
|
|
||
| function percentile(sorted: number[], percentileValue: number): number { | ||
| if (sorted.length === 0) { | ||
| return NaN; | ||
| } | ||
| const index = (sorted.length - 1) * percentileValue; | ||
| const lower = Math.floor(index); | ||
| const upper = Math.ceil(index); | ||
| if (lower === upper) { | ||
| return sorted[lower]; | ||
| } | ||
| return sorted[lower] + (sorted[upper] - sorted[lower]) * (index - lower); | ||
| } | ||
|
|
||
| function aggregate(values: number[], aggregation: Aggregation | string): number { | ||
| if (values.length === 0) { | ||
| return NaN; | ||
| } | ||
| const sorted = [...values].sort((a, b) => a - b); | ||
| switch (aggregation) { | ||
| case 'mean': | ||
| return values.reduce((sum, value) => sum + value, 0) / values.length; | ||
| case 'median': | ||
| case 'p50': | ||
| return percentile(sorted, 0.5); | ||
| case 'p75': | ||
| return percentile(sorted, 0.75); | ||
| case 'p90': | ||
| return percentile(sorted, 0.9); | ||
| case 'p95': | ||
| return percentile(sorted, 0.95); | ||
| default: | ||
| throw new Error(`Unsupported aggregation: ${aggregation}`); | ||
| } | ||
| } | ||
|
|
||
| function createJUnitReport(results: ScenarioResult[]): JUnitReport { | ||
| const suites: JUnitSuite[] = []; | ||
| let totalTests = 0; | ||
| let totalFailures = 0; | ||
|
|
||
| for (const scenario of results) { | ||
| const testcases: JUnitTestCase[] = []; | ||
| for (const metric of scenario.metrics) { | ||
| const testcase: JUnitTestCase = { | ||
| classname: `perf.${scenario.id}`, | ||
| name: `${scenario.id} ${metric.id}`, | ||
| time: '0', | ||
| }; | ||
| if (!metric.passed) { | ||
| testcase.failure = { | ||
| message: `Budget exceeded for ${metric.id}`, | ||
| details: `Expected ${metric.aggregation} <= ${metric.threshold}${metric.unit}, observed ${metric.value.toFixed(2)}${metric.unit}`, | ||
| }; | ||
| totalFailures += 1; | ||
| } | ||
| testcases.push(testcase); | ||
| totalTests += 1; | ||
| } | ||
| suites.push({ | ||
| name: scenario.id, | ||
| tests: scenario.metrics.length, | ||
| failures: scenario.metrics.filter((metric) => !metric.passed).length, | ||
| time: '0', | ||
| testcases, | ||
| }); | ||
| } | ||
|
|
||
| return { | ||
| name: 'perf-budget', | ||
| tests: totalTests, | ||
| failures: totalFailures, | ||
| time: '0', | ||
| suites, | ||
| }; | ||
| } | ||
|
|
||
| function writeJUnitReport(report: JUnitReport) { | ||
| const xmlLines: string[] = []; | ||
| xmlLines.push('<?xml version="1.0" encoding="UTF-8"?>'); | ||
| xmlLines.push( | ||
| `<testsuites name="${report.name}" tests="${report.tests}" failures="${report.failures}" time="${report.time}">`, | ||
| ); | ||
| for (const suite of report.suites) { | ||
| xmlLines.push( | ||
| ` <testsuite name="${suite.name}" tests="${suite.tests}" failures="${suite.failures}" time="${suite.time}">`, | ||
| ); | ||
| for (const testcase of suite.testcases) { | ||
| xmlLines.push( | ||
| ` <testcase classname="${testcase.classname}" name="${testcase.name}" time="${testcase.time}">`, | ||
| ); | ||
| if (testcase.failure) { | ||
| xmlLines.push( | ||
| ` <failure message="${escapeXml(testcase.failure.message)}">${escapeXml(testcase.failure.details)}</failure>`, | ||
| ); | ||
| } | ||
| xmlLines.push(' </testcase>'); | ||
| } | ||
| xmlLines.push(' </testsuite>'); | ||
| } | ||
| xmlLines.push('</testsuites>'); | ||
|
|
||
| fs.mkdirSync(resultsDir, { recursive: true }); | ||
| fs.writeFileSync(path.join(resultsDir, 'perf-budget-junit.xml'), xmlLines.join('\n'), 'utf-8'); | ||
| } | ||
|
|
||
| function escapeXml(value: string): string { | ||
| return value | ||
| .replace(/&/g, '&') | ||
| .replace(/"/g, '"') | ||
| .replace(/</g, '<') | ||
| .replace(/>/g, '>'); | ||
| } | ||
|
|
||
| function writeSummary(summary: Summary) { | ||
| fs.mkdirSync(resultsDir, { recursive: true }); | ||
| fs.writeFileSync(path.join(resultsDir, 'perf-budget-summary.json'), JSON.stringify(summary, null, 2), 'utf-8'); | ||
| } | ||
|
|
||
| async function run() { | ||
| const config = readConfig(); | ||
| const defaults = config.defaults ?? {}; | ||
| const browser = await chromium.launch({ headless: true }); | ||
| const scenarioResults: ScenarioResult[] = []; | ||
| let overallPassed = true; | ||
|
|
||
| try { | ||
| for (const scenario of config.scenarios) { | ||
| const runCount = scenario.run_count ?? defaults.run_count ?? 1; | ||
| const runMetrics: RunMetrics[] = []; | ||
| for (let i = 0; i < runCount; i += 1) { | ||
| console.log(`Running scenario ${scenario.id} (${i + 1}/${runCount})`); | ||
| const metrics = await executeScenarioRun(scenario, browser, defaults); | ||
| runMetrics.push(metrics); | ||
| } | ||
| const scenarioValues = collectScenarioMetrics(runMetrics); | ||
| const aggregatedMetrics: AggregatedMetric[] = []; | ||
| for (const metricConfig of scenario.metrics) { | ||
| const values = scenarioValues[metricConfig.id]; | ||
| if (!values || values.length === 0) { | ||
| console.warn(`No samples collected for ${scenario.id} metric ${metricConfig.id}`); | ||
| continue; | ||
| } | ||
| const value = aggregate(values, metricConfig.aggregation); | ||
| const passed = value <= metricConfig.threshold; | ||
| if (!passed) { | ||
| overallPassed = false; | ||
| } | ||
| aggregatedMetrics.push({ | ||
| id: metricConfig.id, | ||
| aggregation: metricConfig.aggregation, | ||
| threshold: metricConfig.threshold, | ||
| unit: metricConfig.unit, | ||
| value, | ||
| passed, | ||
| samples: values, | ||
| }); | ||
| console.log( | ||
| `${scenario.id} ${metricConfig.id} ${metricConfig.aggregation}: ${value.toFixed(2)}${metricConfig.unit} (threshold ${metricConfig.threshold}${metricConfig.unit})`, | ||
| ); | ||
| } | ||
| scenarioResults.push({ | ||
| id: scenario.id, | ||
| description: scenario.description, | ||
| metrics: aggregatedMetrics, | ||
| }); | ||
| } | ||
| } finally { | ||
| await browser.close(); | ||
| } | ||
|
|
||
| const summary: Summary = { | ||
| scenarios: scenarioResults, | ||
| passed: overallPassed, | ||
| }; | ||
| writeSummary(summary); | ||
| writeJUnitReport(createJUnitReport(scenarioResults)); | ||
|
|
||
| if (!overallPassed) { | ||
| console.error('Performance budgets failed. See perf-results/perf-budget-summary.json for details.'); | ||
| process.exitCode = 1; | ||
| } else { | ||
| console.log('Performance budgets satisfied.'); | ||
| } | ||
| } | ||
|
|
||
| run().catch((error) => { | ||
| console.error('Failed to execute performance budgets', error); | ||
| process.exitCode = 1; | ||
| }); |
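For reviewers checking the aggregation math: `percentile()` above uses linear interpolation between the two nearest ranks, so for four samples p75 lands at fractional index 2.25, a quarter of the way from the 3rd to the 4th sorted value. A standalone copy of the same logic:

```typescript
// Standalone copy of the runner's percentile helper, for illustration only.
// Input must already be sorted ascending, as in aggregate() above.
function percentile(sorted: number[], percentileValue: number): number {
  if (sorted.length === 0) {
    return NaN;
  }
  const index = (sorted.length - 1) * percentileValue;
  const lower = Math.floor(index);
  const upper = Math.ceil(index);
  if (lower === upper) {
    return sorted[lower];
  }
  // Linear interpolation between the two neighbouring ranks.
  return sorted[lower] + (sorted[upper] - sorted[lower]) * (index - lower);
}

// Four samples: p75 sits at index 2.25, i.e. 25% of the way from 300 to 800.
console.log(percentile([100, 200, 300, 800], 0.75)); // 425
console.log(percentile([100, 200, 300, 800], 0.5)); // 250
```

This is the same interpolation scheme NumPy uses by default, so thresholds tuned against offline percentile calculations should transfer directly.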
🧹 Nitpick | 🔵 Trivial
Solid, deterministic perf runner; consider small improvements
- Add optional diagnostics to aid failures (console, page errors, recent requests) before evaluating metrics. Playwright 1.56 exposes page.consoleMessages(), page.pageErrors(), page.requests() you can log on failure.
- Allow per-scenario throttling override (extend ScenarioConfig) to model mixed flows.
- Minor type nit: MetricConfig.aggregation already covers p75/p90/p95 via Aggregation; the union is redundant.
Based on learnings
🤖 Prompt for AI Agents
In frontend/tests/perf/run-perf-budget.ts around lines 100-466, implement three
small changes: (1) add optional diagnostics on failing scenarios by capturing
and logging page.consoleMessages(), page.pageErrors(), and page.requests() just
before evaluating metrics (or when overallPassed flips false) — collect them
from the page/context and include their recent entries in console.error so
failures include context; (2) support per-scenario throttling by extending the
runner to prefer scenario.throttling over defaults (pass scenario.throttling
into applyThrottling when executing a scenario run, and update types if needed);
(3) remove the redundant union usage for metric aggregation by ensuring
functions accept the Aggregation type (not Aggregation | string) — adjust
aggregate() signature/usage and any type annotations so aggregation is typed as
Aggregation only. Ensure no new runtime exceptions are introduced and keep
logging guarded so diagnostics only run when available.
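One way the per-scenario throttling override could look: resolve the effective config before calling `applyThrottling`. This is a sketch, not the PR's implementation — it assumes `ScenarioConfig` grows an optional `throttling` field, which does not exist yet:

```typescript
interface ThrottlingConfig {
  cpu_slowdown_multiplier?: number;
  download_throughput_kbps?: number;
  upload_throughput_kbps?: number;
  request_latency_ms?: number;
}

// Hypothetical shape: assumes ScenarioConfig gains an optional `throttling` field.
interface ScenarioLike {
  id: string;
  throttling?: ThrottlingConfig;
}

// Prefer the scenario-level override, falling back to the suite defaults.
function resolveThrottling(
  scenario: ScenarioLike,
  defaults?: { throttling?: ThrottlingConfig },
): ThrottlingConfig | undefined {
  return scenario.throttling ?? defaults?.throttling;
}

const suiteDefaults = { throttling: { cpu_slowdown_multiplier: 4 } };
console.log(resolveThrottling({ id: 'home' }, suiteDefaults)); // falls back to defaults
console.log(resolveThrottling({ id: 'heavy', throttling: { request_latency_ms: 400 } }, suiteDefaults));
```

`executeScenarioRun` would then call `applyThrottling(context, page, resolveThrottling(scenario, defaults))` instead of passing `defaults?.throttling` directly.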
```python
    return CanaryMetrics(
        latency_p95_ms=float(latency) if latency is not None else 0.0,
        error_rate=float(error_rate) if error_rate is not None else 0.0,
        trace_latency_p95_ms=trace_latency,
        previous_latency_p95_ms=_float_or_none(previous_latency),
        previous_error_rate=_float_or_none(previous_error),
        build=build_tag,
        previous_build=previous_build,
        generated_at=time.strftime("%Y-%m-%dT%H:%M:%SZ", time.gmtime()),
    )
```
Missing metrics silently pass the check
When Prometheus queries fail or return no data, latency/error_rate stay None, we coerce them to 0.0, and _evaluate happily passes the run. That masks real outages/regressions whenever the scraper is down or a query is misconfigured. Please surface this as a failure (raise, or at least refuse to build a metrics object) instead of defaulting to zeroes so CI actually fails when telemetry is missing.
Apply this change:
```diff
-    return CanaryMetrics(
-        latency_p95_ms=float(latency) if latency is not None else 0.0,
-        error_rate=float(error_rate) if error_rate is not None else 0.0,
+    if latency is None or error_rate is None:
+        raise CanaryCheckError("Prometheus queries returned no data for latency/error rate")
+
+    return CanaryMetrics(
+        latency_p95_ms=float(latency),
+        error_rate=float(error_rate),
         trace_latency_p95_ms=trace_latency,
         previous_latency_p95_ms=_float_or_none(previous_latency),
         previous_error_rate=_float_or_none(previous_error),
         build=build_tag,
         previous_build=previous_build,
         generated_at=time.strftime("%Y-%m-%dT%H:%M:%SZ", time.gmtime()),
     )
```
Actionable comments posted: 5
📜 Review details
Configuration used: CodeRabbit UI
Review profile: ASSERTIVE
Plan: Pro
📒 Files selected for processing (1)
scripts/ci/check_canary_metrics.py (1 hunks)
🧰 Additional context used
🪛 Ruff (0.14.0)
scripts/ci/check_canary_metrics.py
1-1: Shebang is present but file is not executable
(EXE001)
13-13: typing.Dict is deprecated, use dict instead
(UP035)
32-32: Dynamically typed expressions (typing.Any) are disallowed in value
(ANN401)
41-41: Audit URL open for permitted schemes. Allowing use of file: or custom schemes is often unexpected.
(S310)
74-74: Unnecessary mode argument
Remove mode argument
(UP015)
160-160: Boolean-typed positional argument in function definition
(FBT001)
🔇 Additional comments (4)
scripts/ci/check_canary_metrics.py (4)
16-30: LGTM! The data structures are well-defined and appropriate for the canary metrics validation use case.
32-71: LGTM! The helper functions properly handle error cases and return `None` on failures, which enables the fallback logic. The URL open security warning (S310) is low risk here since URLs come from CI-controlled environment variables.
92-148: LGTM! Previous critical issue resolved. The function now properly returns `None` when Prometheus queries fail (lines 105-111), which triggers the fixture fallback in `_load_metrics`. This addresses the critical issue from previous reviews where missing data would be coerced to `0.0` and incorrectly pass validation.
178-245: LGTM! Previous formatting issue resolved. The evaluation logic correctly checks metrics against budgets and regression thresholds. The comparison printing (lines 234-242) now properly checks each optional value independently before formatting, resolving the `TypeError` issue from previous reviews.
```diff
@@ -0,0 +1,248 @@
+#!/usr/bin/env python3
```
🧹 Nitpick | 🔵 Trivial
Make the script executable.
The shebang is present but the file lacks execute permissions. Run chmod +x scripts/ci/check_canary_metrics.py to make it executable, which is standard practice for scripts with shebangs.
🧰 Tools
🪛 Ruff (0.14.0)
1-1: Shebang is present but file is not executable
(EXE001)
🤖 Prompt for AI Agents
In scripts/ci/check_canary_metrics.py at line 1 the file has a shebang but is
not executable; fix by setting the execute bit on the file (e.g. run chmod +x
scripts/ci/check_canary_metrics.py) so it can be executed directly in CI and
locally.
```python
import urllib.parse
import urllib.request
from dataclasses import dataclass
from typing import Any, Dict, Optional
```
🧹 Nitpick | 🔵 Trivial
Use modern type hint syntax.
typing.Dict is deprecated in Python 3.9+. Use the built-in dict type instead.
Apply this diff:
```diff
-from typing import Any, Dict, Optional
+from typing import Any, Optional
```

Then update line 39:

```diff
-def _http_get_json(url: str, params: Optional[Dict[str, str]] = None) -> Dict[str, Any]:
+def _http_get_json(url: str, params: Optional[dict[str, str]] = None) -> dict[str, Any]:
```
🧰 Tools
🪛 Ruff (0.14.0)
13-13: typing.Dict is deprecated, use dict instead
(UP035)
🤖 Prompt for AI Agents
In scripts/ci/check_canary_metrics.py around line 13, replace the deprecated
typing.Dict import by removing Dict from the typing import (e.g., from typing
import Any, Optional). Then on line 39 update any type annotation using
Dict[...] to use the built-in generic syntax dict[...] (for example Dict[str,
Any] -> dict[str, Any]) ensuring you keep Any/Optional from typing as needed.
```python
def _load_metrics() -> CanaryMetrics:
    metrics = _collect_metrics_from_services()
    if metrics:
        return metrics

    fixture_path = os.environ.get("CANARY_METRICS_FIXTURE", "tests/perf/canary-metrics.fixture.json")
    print(f"Using canary metrics fixture: {fixture_path}")
    return _load_fixture(fixture_path)
```
Add error handling for missing fixture file.
If the fixture file doesn't exist, _load_fixture raises FileNotFoundError with an unclear message. Add a try-except to provide a helpful error message indicating both the service connection failure and the missing fixture.
Apply this diff:
```diff
 def _load_metrics() -> CanaryMetrics:
     metrics = _collect_metrics_from_services()
     if metrics:
         return metrics
     fixture_path = os.environ.get("CANARY_METRICS_FIXTURE", "tests/perf/canary-metrics.fixture.json")
     print(f"Using canary metrics fixture: {fixture_path}")
-    return _load_fixture(fixture_path)
+    try:
+        return _load_fixture(fixture_path)
+    except FileNotFoundError:
+        raise CanaryCheckError(
+            f"Cannot load metrics: Prometheus queries failed and fixture not found at {fixture_path}"
+        )
```
🤖 Prompt for AI Agents
In scripts/ci/check_canary_metrics.py around lines 150 to 158, _load_metrics
falls back to loading a fixture but doesn't handle the case where the fixture
file is missing; wrap the call to _load_fixture(fixture_path) in a try/except
that catches FileNotFoundError and re-raises a clearer error (or exit) that
states the service metrics collection failed and the fixture file at
fixture_path was not found, including the fixture_path in the message and an
actionable hint to create or point CANARY_METRICS_FIXTURE to a valid file.
```python
def _write_summary(metrics: CanaryMetrics, passed: bool) -> None:
    summary_path = os.path.join("perf-results", "canary-metrics-summary.json")
    os.makedirs(os.path.dirname(summary_path), exist_ok=True)
    payload = {
        "latency_p95_ms": metrics.latency_p95_ms,
        "error_rate": metrics.error_rate,
        "trace_latency_p95_ms": metrics.trace_latency_p95_ms,
        "previous_latency_p95_ms": metrics.previous_latency_p95_ms,
        "previous_error_rate": metrics.previous_error_rate,
        "build": metrics.build,
        "previous_build": metrics.previous_build,
        "generated_at": metrics.generated_at,
        "passed": passed,
    }
    with open(summary_path, "w", encoding="utf-8") as handle:
        json.dump(payload, handle, indent=2)
```
|
🧹 Nitpick | 🔵 Trivial
Consider using a keyword argument for the boolean parameter.
The passed parameter is a boolean positional argument, which can reduce readability at call sites. Consider making it keyword-only or using an enum for clarity.
Apply this diff to make it keyword-only:
```diff
-def _write_summary(metrics: CanaryMetrics, passed: bool) -> None:
+def _write_summary(metrics: CanaryMetrics, *, passed: bool) -> None:
```
🧰 Tools
🪛 Ruff (0.14.0)
160-160: Boolean-typed positional argument in function definition
(FBT001)
🤖 Prompt for AI Agents
In scripts/ci/check_canary_metrics.py around lines 160 to 176, the boolean
parameter passed is currently positional which harms readability at call sites;
change the function signature to make passed a keyword-only argument (e.g., def
_write_summary(metrics: CanaryMetrics, *, passed: bool) -> None) and update
every call site to call _write_summary(..., passed=some_bool). Ensure
imports/types remain valid and run tests/lint to catch any missed call sites.
Note: Docstrings generation - SUCCESS

…olds` — docstrings generation was requested by @shayancoin in #123 (comment). The following files were modified: `frontend/tests/perf/run-perf-budget.ts`, `scripts/ci/check_canary_metrics.py`

…olds` (#282) — docstrings generation was requested by @shayancoin in #123 (comment). The following files were modified: `frontend/tests/perf/run-perf-budget.ts`, `scripts/ci/check_canary_metrics.py`. Co-authored-by: coderabbitai[bot] <136622811+coderabbitai[bot]@users.noreply.github.com>
Summary
- Adds a `.perf-budget.yml` that covers the homepage-to-configurator journey and direct configurator load metrics

Testing
https://chatgpt.com/codex/tasks/task_e_68f135015b3083308d5c389dd9177dbb
Summary by CodeRabbit
New Features
Documentation
Tests
Chores