Add CI performance budgets and canary metrics checks #123

Closed

shayancoin wants to merge 3 commits into main from codex/extend-ci-pipeline-with-performance-thresholds

Conversation

@shayancoin shayancoin commented Oct 16, 2025

Summary

  • replace the root perf budget file with a richer .perf-budget.yml that covers the homepage-to-configurator journey and direct configurator load metrics
  • add a Playwright-based runner plus CI job to execute the budgets, emit JUnit/JSON artifacts, and document the release checklist updates in MkDocs
  • introduce a canary metrics validation script with a CI job that compares Prometheus/Tempo data (or fixtures) against latency and error thresholds

Testing

  • python scripts/ci/check_canary_metrics.py

https://chatgpt.com/codex/tasks/task_e_68f135015b3083308d5c389dd9177dbb
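
For reviewers who want to reproduce the canary check locally, here is a minimal sketch (not code from this PR) that invokes the script against the committed fixture with the same thresholds the canary-metrics CI job sets. It assumes it is run from the repository root; the environment variable names are taken from that job definition:

import os
import subprocess
import sys

# Mirror the canary-metrics job: point the script at the fixture and set budgets.
env = {
    **os.environ,
    "CANARY_METRICS_FIXTURE": "tests/perf/canary-metrics.fixture.json",
    "P95_THRESHOLD_MS": "3000",
    "ERROR_RATE_THRESHOLD": "0.02",
    "REGRESSION_TOLERANCE_PCT": "0.1",
}

result = subprocess.run(
    [sys.executable, "scripts/ci/check_canary_metrics.py"],
    env=env,
    check=False,  # inspect the exit code rather than raising
)
print(f"canary check exit code: {result.returncode}")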

Summary by CodeRabbit

  • New Features

    • CI now runs automated performance budgets and canary metrics checks across key user flows.
  • Documentation

    • Added a Release Checklist and site navigation entry outlining performance validation and incident response steps.
  • Tests

    • Introduced an end-to-end performance test runner and a canary metrics fixture for CI reporting.
  • Chores

    • Added perf results to ignore rules and updated frontend tooling/dependencies for performance testing.

coderabbitai bot commented Oct 16, 2025

Warning

Rate limit exceeded

@shayancoin has exceeded the limit for the number of commits or files that can be reviewed per hour. Please wait 0 minutes and 5 seconds before requesting another review.

⌛ How to resolve this issue?

After the wait time has elapsed, a review can be triggered using the @coderabbitai review command as a PR comment. Alternatively, push new commits to this PR.

We recommend that you space out your commits to avoid hitting the rate limit.

🚦 How do rate limits work?

CodeRabbit enforces hourly rate limits for each developer per organization.

Our paid plans have higher rate limits than the trial, open-source and free plans. In all cases, we re-allow further reviews after a brief timeout.

Please see our FAQ for further information.

📥 Commits

Reviewing files that changed from the base of the PR and between 9ba2fc4 and 9c26b9b.

📒 Files selected for processing (2)
  • frontend/tests/perf/run-perf-budget.ts (1 hunks)
  • scripts/ci/check_canary_metrics.py (1 hunks)

Note

Other AI code review bot(s) detected

CodeRabbit has detected other AI code review bot(s) in this pull request and will avoid duplicating their findings in the review comments. This may lead to a less comprehensive review.

Walkthrough

Adds CI jobs for performance budgeting and canary metrics, a Playwright-based frontend perf runner with a new .perf-budget.yml, a Python canary metrics checker with fixture fallback, docs for release checks, and related packaging and ignore updates.
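
To make the fixture-fallback behaviour concrete, here is a hedged sketch of the control flow described above; the real helpers live in scripts/ci/check_canary_metrics.py, and everything here other than the environment variable names is illustrative:

import json
import os
from typing import Optional


def query_prometheus(url: str) -> Optional[dict]:
    """Stand-in for the script's Prometheus helpers; returns None when no data arrives."""
    return None  # pretend the query produced no samples


def load_metrics_or_fixture() -> dict:
    # Prefer live telemetry when PROMETHEUS_URL is configured and returns data.
    prom_url = os.environ.get("PROMETHEUS_URL")
    if prom_url:
        metrics = query_prometheus(prom_url)
        if metrics is not None:
            return metrics
    # Otherwise fall back to the committed fixture (assumes repository-root cwd).
    fixture = os.environ.get(
        "CANARY_METRICS_FIXTURE", "tests/perf/canary-metrics.fixture.json"
    )
    with open(fixture, encoding="utf-8") as handle:
        return json.load(handle)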

Changes

  • CI Workflow (.github/workflows/ci.yml): Added two jobs: perf-budgets (depends on backend/frontend tests; sets up Node, installs deps & Playwright browsers, starts the stack, waits for readiness, seeds the backend, runs the perf-budget runner, uploads artifacts, publishes results, shuts down) and canary-metrics (depends on perf-budgets, runs the canary metrics check script, uploads the summary).
  • Perf config (.perf-budget.yml, perf-budget.yml): Added new .perf-budget.yml (version 2) with default throttling and two scenarios (configurator-load, homepage-to-configurator) including steps, waits, selectors, and metric thresholds; removed the previous perf-budget.yml content.
  • Frontend package & runner (frontend/package.json, frontend/tests/perf/run-perf-budget.ts): Added perf:budget npm script; added dependencies (@react-three/drei, @react-three/fiber, yaml, ts-node), removed gltfpack and meshoptimizer. Added a TypeScript perf runner that loads the YAML config, applies CDP network/CPU throttling, injects PerformanceObserver, runs scenarios, collects metrics (FCP/LCP/TBT/CLS/navigation duration), aggregates percentiles (see the aggregation sketch after this list), produces JUnit + JSON reports, and fails CI on budget violations.
  • Backend canary check & fixture (scripts/ci/check_canary_metrics.py, tests/perf/canary-metrics.fixture.json): Added a Python script to query Prometheus/Tempo (with network error handling) or fall back to a fixture, evaluate P95 latency and error rate against thresholds and regressions, and write perf-results/canary-metrics-summary.json. Added a fixture with sample current/previous metrics.
  • Docs & site nav (docs/release-checklist.md, mkdocs.yml): Added a Release Checklist detailing CI perf validation, canary checks, Grafana/Tempo correlation, and the sign-off workflow; added a nav entry in mkdocs.yml.
  • Git ignore (.gitignore): Added perf-results/ to the ignore list.
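
As a companion to the runner row above, here is a minimal Python sketch of the percentile aggregation and budget comparison the TypeScript runner performs; the samples and threshold are illustrative, not values from .perf-budget.yml:

import math


def percentile(samples: list[float], q: float) -> float:
    """Linear-interpolated percentile, mirroring the runner's aggregate() helper."""
    ordered = sorted(samples)
    index = (len(ordered) - 1) * q
    lower, upper = math.floor(index), math.ceil(index)
    if lower == upper:
        return ordered[lower]
    return ordered[lower] + (ordered[upper] - ordered[lower]) * (index - lower)


# Hypothetical LCP samples (ms) collected over repeated scenario runs.
samples = [1980.0, 2100.0, 2300.0, 2450.0, 2600.0]
p95 = percentile(samples, 0.95)
threshold_ms = 2500.0  # illustrative budget only
print(f"p95={p95:.1f}ms, passed={p95 <= threshold_ms}")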

Sequence Diagram(s)

sequenceDiagram
    actor CI
    participant PerfJob as perf-budgets Job
    participant Stack as Stack (API & Frontend)
    participant Runner as run-perf-budget.ts
    participant Artifacts as Artifact Storage

    CI->>PerfJob: Trigger (after backend/frontend tests)
    PerfJob->>Stack: Start services
    PerfJob->>Stack: Seed backend data
    PerfJob->>PerfJob: Wait for readiness

    note right of Runner#lightblue: Load .perf-budget.yml
    loop scenarios
        PerfJob->>Runner: Execute scenario
        activate Runner
        Runner->>Stack: Navigate / perform steps
        note right of Runner#lightgreen: Apply slow-4g throttling\nInject PerformanceObserver
        Runner->>Runner: Collect metrics & aggregate
        deactivate Runner
    end

    alt budgets pass
        Runner->>Artifacts: Write JUnit + JSON (passed)
        Artifacts-->>CI: Success
    else violation
        Runner->>Artifacts: Write JUnit + JSON (failed)
        Artifacts-->>CI: Failure (non-zero)
    end

    PerfJob->>Stack: Shutdown
sequenceDiagram
    actor CI
    participant CanaryJob as canary-metrics Job
    participant Script as check_canary_metrics.py
    participant Prom as Prometheus
    participant Tempo as Tempo
    participant Artifacts as Artifact Storage

    CI->>CanaryJob: Trigger (after perf-budgets)
    CanaryJob->>Script: Invoke (env vars)

    alt Prom/Tempo reachable
        Script->>Prom: Query P95 latency & error rate
        Script->>Tempo: Query trace latency P95
        Prom-->>Script: metrics
        Tempo-->>Script: trace metrics
    else fallback
        Script->>Script: Load fixture JSON
    end

    Script->>Script: Evaluate thresholds & regressions
    alt checks pass
        Script->>Artifacts: Write summary (passed=true)
        Artifacts-->>CI: Success
    else any failure
        Script->>Artifacts: Write summary (passed=false)
        Artifacts-->>CI: Failure (non-zero)
    end
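
The "Evaluate thresholds & regressions" step in the diagram above can be pictured with a short sketch. The environment variable names come from the canary-metrics job, while the function body and the interpretation of the regression tolerance as a fractional allowance over the previous build are assumptions, not the script's actual code:

import json
import os
from typing import Optional


def evaluate(current_p95_ms: float, current_error_rate: float,
             previous_p95_ms: Optional[float] = None) -> bool:
    p95_budget = float(os.environ.get("P95_THRESHOLD_MS", "3000"))
    error_budget = float(os.environ.get("ERROR_RATE_THRESHOLD", "0.02"))
    tolerance = float(os.environ.get("REGRESSION_TOLERANCE_PCT", "0.1"))

    passed = current_p95_ms <= p95_budget and current_error_rate <= error_budget
    if previous_p95_ms is not None:
        # Assumed semantics: fail if latency regressed by more than the tolerance.
        passed = passed and current_p95_ms <= previous_p95_ms * (1 + tolerance)
    return passed


ok = evaluate(current_p95_ms=2400.0, current_error_rate=0.01, previous_p95_ms=2300.0)
print(json.dumps({"latency_p95_ms": 2400.0, "error_rate": 0.01, "passed": ok}, indent=2))
raise SystemExit(0 if ok else 1)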

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~50 minutes

Possibly related PRs

  • Add performance budget configuration #21: Related perf-budget configuration changes for the configurator page (overlapping selectors, throttling and thresholds), likely touching the same performance test intent and config.

Poem

🐰 I hopped through CI with a perf-test map,

slow-4g dungarees and a tracing cap,
I watched FCP, LCP, TBT in a row,
Prometheus hummed while Tempo did show,
Perf results tucked safe in perf-results' lap.

Pre-merge checks and finishing touches

❌ Failed checks (2 warnings)

  • Description Check ⚠️ Warning: The description does not follow the repository's required template; it omits the PR Type, Short Description, and Tests Added sections and instead uses custom headings, leaving out key structured information expected by the template. Resolution: update the description to match the repository template by adding a "PR Type" field, a concise "Short Description", and a "Tests Added" section so that all required template sections are present and properly formatted.
  • Docstring Coverage ⚠️ Warning: Docstring coverage is 0.00%, below the required threshold of 80.00%. Resolution: run @coderabbitai generate docstrings to improve docstring coverage.

✅ Passed checks (1 passed)

  • Title Check ✅ Passed: The title succinctly captures the primary additions of CI performance budgets and canary metrics checks and clearly reflects the main purpose of the pull request without extra noise.

Comment @coderabbitai help to get the list of available commands and usage tips.

@chatgpt-codex-connector chatgpt-codex-connector bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting

Comment on lines 92 to 133
def _collect_metrics_from_services() -> Optional[CanaryMetrics]:
    prom_url = os.environ.get("PROMETHEUS_URL")
    latency_query = os.environ.get("PROMETHEUS_LATENCY_QUERY")
    error_query = os.environ.get("PROMETHEUS_ERROR_QUERY")
    previous_latency_query = os.environ.get("PROMETHEUS_PREVIOUS_LATENCY_QUERY")
    previous_error_query = os.environ.get("PROMETHEUS_PREVIOUS_ERROR_QUERY")

    if not (prom_url and latency_query and error_query):
        return None

    latency = _query_prometheus(prom_url, latency_query)
    error_rate = _query_prometheus(prom_url, error_query)
    previous_latency = (
        _query_prometheus(prom_url, previous_latency_query)
        if previous_latency_query
        else None
    )
    previous_error = (
        _query_prometheus(prom_url, previous_error_query)
        if previous_error_query
        else None
    )

    tempo_url = os.environ.get("TEMPO_URL")
    trace_latency_query = os.environ.get("TEMPO_TRACE_QUERY")
    trace_latency = None
    if tempo_url and trace_latency_query:
        try:
            payload = _http_get_json(
                f"{tempo_url.rstrip('/')}/api/search", {"q": trace_latency_query}
            )
            trace_latency = _extract_prom_value(payload)
        except urllib.error.URLError as exc:
            print(f"Failed to query Tempo: {exc}", file=sys.stderr)

    build_tag = os.environ.get("BUILD_TAG") or os.environ.get("GITHUB_SHA")
    previous_build = os.environ.get("PREVIOUS_BUILD_TAG")

    return CanaryMetrics(
        latency_p95_ms=float(latency) if latency is not None else 0.0,
        error_rate=float(error_rate) if error_rate is not None else 0.0,
        trace_latency_p95_ms=trace_latency,


P1: Fail canary check when Prometheus data is unavailable

When a Prometheus/Tempo query fails or returns no samples, _query_prometheus returns None, yet _collect_metrics_from_services converts those None values to 0.0 (latency_p95_ms and error_rate). _evaluate treats those zeros as being within budget, so network outages or misconfigured queries will silently mark the canary job as passing without validating real metrics. The check should raise an error or fall back to the fixture when no data is retrieved instead of assuming zero latency/error.

Useful? React with 👍 / 👎.
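
For illustration, a minimal sketch of the guard this comment asks for, assuming a helper that returns None when telemetry is missing (the fix actually committed in later commits may differ):

from typing import Optional


class CanaryCheckError(RuntimeError):
    """Raised when telemetry needed for the canary check is missing."""


def require_sample(name: str, value: Optional[float]) -> float:
    # Refuse to coerce a missing sample to 0.0, which would silently pass the budget.
    if value is None:
        raise CanaryCheckError(f"Prometheus returned no data for {name}")
    return float(value)


try:
    latency_p95_ms = require_sample("latency_p95_ms", None)  # simulate a failed query
except CanaryCheckError as exc:
    print(f"canary check failed: {exc}")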

Comment on lines 219 to 230
    print(
        "Canary metrics within thresholds. "
        f"latency_p95_ms={metrics.latency_p95_ms:.2f}, "
        f"error_rate={metrics.error_rate:.4f}"
    )
    if metrics.trace_latency_p95_ms is not None:
        print(f"Tempo trace P95 latency: {metrics.trace_latency_p95_ms:.2f}ms")
    if metrics.previous_latency_p95_ms is not None:
        print(
            "Compared to previous build: "
            f"latency_p95_ms={metrics.previous_latency_p95_ms:.2f}, "
            f"error_rate={metrics.previous_error_rate:.4f}"


P1: Guard printing previous error rate when it is absent

The success log formats metrics.previous_error_rate with :.4f whenever previous_latency_p95_ms is set, but it does not first verify that the error rate is also present. If only PROMETHEUS_PREVIOUS_LATENCY_QUERY is configured (common when tracking latency regressions but not errors), previous_error_rate remains None and this formatting raises TypeError, causing the script to exit after the metrics check already passed. Check both values before formatting or default the missing one.

Useful? React with 👍 / 👎.
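
A tiny sketch of the guarded logging this comment suggests, with illustrative values:

from typing import Optional


def log_previous(previous_latency_p95_ms: Optional[float],
                 previous_error_rate: Optional[float]) -> None:
    # Format each optional value only if it is actually present.
    parts = []
    if previous_latency_p95_ms is not None:
        parts.append(f"latency_p95_ms={previous_latency_p95_ms:.2f}")
    if previous_error_rate is not None:
        parts.append(f"error_rate={previous_error_rate:.4f}")
    if parts:
        print("Compared to previous build: " + ", ".join(parts))


log_previous(2300.0, None)  # latency-only comparison; no TypeError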

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 7

📜 Review details

Configuration used: CodeRabbit UI

Review profile: ASSERTIVE

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between d2a35c4 and 9aee543.

⛔ Files ignored due to path filters (1)
  • frontend/package-lock.json is excluded by !**/package-lock.json
📒 Files selected for processing (10)
  • .github/workflows/ci.yml (1 hunks)
  • .gitignore (1 hunks)
  • .perf-budget.yml (1 hunks)
  • docs/release-checklist.md (1 hunks)
  • frontend/package.json (3 hunks)
  • frontend/tests/perf/run-perf-budget.ts (1 hunks)
  • mkdocs.yml (1 hunks)
  • perf-budget.yml (0 hunks)
  • scripts/ci/check_canary_metrics.py (1 hunks)
  • tests/perf/canary-metrics.fixture.json (1 hunks)
💤 Files with no reviewable changes (1)
  • perf-budget.yml
🧰 Additional context used
🧬 Code graph analysis (3)
.github/workflows/ci.yml (1)
tests/perf/k6-quote-cnc.js (1)
  • data (85-128)
scripts/ci/check_canary_metrics.py (1)
backend/api/main.py (1)
  • metrics (105-107)
frontend/tests/perf/run-perf-budget.ts (2)
tests/perf/k6-quote-cnc.js (2)
  • file (107-107)
  • data (85-128)
backend/api/main.py (1)
  • metrics (105-107)
🪛 actionlint (1.7.8)
.github/workflows/ci.yml

146-146: could not parse as YAML: could not find expected ':'

(syntax-check)

🪛 LanguageTool
docs/release-checklist.md

[grammar] ~1-~1: Use correct spacing
Context: # Release Checklist This checklist connects the CI performan...

(QB_NEW_EN_OTHER_ERROR_IDS_5)


[grammar] ~4-~4: There might be a mistake here.
Context: ...ion releases. It is intended for release managers and on-call engineers who need ...

(QB_NEW_EN)


[grammar] ~5-~5: There might be a mistake here.
Context: ...ble flow that ties Grafana alerts, Tempo traces, and CI preview environments toge...

(QB_NEW_EN)


[grammar] ~6-~6: Use correct spacing
Context: ...s, and CI preview environments together. ## 1. Validate CI performance budgets 1. C...

(QB_NEW_EN_OTHER_ERROR_IDS_5)


[grammar] ~8-~8: There might be a mistake here.
Context: .... ## 1. Validate CI performance budgets 1. Confirm the Performance Budgets job ...

(QB_NEW_EN_OTHER)


[grammar] ~12-~12: Use correct spacing
Context: ...son` for the concrete values captured by Playwright (P75 configurator load, P90 n...

(QB_NEW_EN_OTHER_ERROR_IDS_5)


[grammar] ~13-~13: There might be a mistake here.
Context: ...load, P90 navigation duration, P95 LCP). 2. If any threshold failed, investigate bef...

(QB_NEW_EN)


[grammar] ~14-~14: There might be a mistake here.
Context: ... before promoting the release candidate: * Re-run the job against the preview envir...

(QB_NEW_EN)


[grammar] ~15-~15: Use correct spacing
Context: ...ment to determine whether the regression is deterministic or environment-specific...

(QB_NEW_EN_OTHER_ERROR_IDS_5)


[grammar] ~16-~16: There might be a mistake here.
Context: ...s deterministic or environment-specific. * Capture a Grafana dashboard snapshot sho...

(QB_NEW_EN)


[grammar] ~17-~17: Use correct spacing
Context: ...g the relevant Web Vitals panel and link it in the incident tracker. ## 2. Check...

(QB_NEW_EN_OTHER_ERROR_IDS_5)


[grammar] ~18-~18: Use correct spacing
Context: ...nd link it in the incident tracker. ## 2. Check canary latency and error regres...

(QB_NEW_EN_OTHER_ERROR_IDS_5)


[grammar] ~20-~20: There might be a mistake here.
Context: ...eck canary latency and error regressions 1. Verify the Canary Metrics job comple...

(QB_NEW_EN_OTHER)


[grammar] ~23-~23: Use correct spacing
Context: ...json` to compare the current build's P95 latency and error rate with the previous...

(QB_NEW_EN_OTHER_ERROR_IDS_5)


[grammar] ~24-~24: There might be a mistake here.
Context: ...evious build and the configured budgets. 3. If the job failed: * Acknowledge or s...

(QB_NEW_EN)


[grammar] ~25-~25: Use colons correctly
Context: ...d the configured budgets. 3. If the job failed: * Acknowledge or silence the corresponding Grafana al...

(QB_NEW_EN_OTHER_ERROR_IDS_30)


[grammar] ~26-~26: Use correct spacing
Context: ...ding Grafana alert for the service-level objective (SLO). * Escalate to the on...

(QB_NEW_EN_OTHER_ERROR_IDS_5)


[grammar] ~27-~27: There might be a mistake here.
Context: ... the service-level objective (SLO). * Escalate to the on-call engineer through...

(QB_NEW_EN)


[grammar] ~28-~28: Use correct spacing
Context: ...he paging integration configured for the Grafana alert rule (OpsGenie, PagerDuty,...

(QB_NEW_EN_OTHER_ERROR_IDS_5)


[grammar] ~29-~29: There might be a mistake here.
Context: ... alert rule (OpsGenie, PagerDuty, etc.). * Use the Tempo trace search link emitted ...

(QB_NEW_EN)


[grammar] ~30-~30: Use correct spacing
Context: ... the job (or Grafana Explore) to confirm whether elevated latency correlates with...

(QB_NEW_EN_OTHER_ERROR_IDS_5)


[grammar] ~31-~31: Use correct spacing
Context: ... latency correlates with specific spans. ## 3. Correlate CI results with Grafana das...

(QB_NEW_EN_OTHER_ERROR_IDS_5)


[grammar] ~33-~33: There might be a mistake here.
Context: ...elate CI results with Grafana dashboards 1. Publish the artifacts from the CI run (`...

(QB_NEW_EN_OTHER)


[grammar] ~35-~35: Use correct spacing
Context: ...e CI run (perf-budget-summary.json and canary-metrics-summary.json) to the sh...

(QB_NEW_EN_OTHER_ERROR_IDS_5)


[grammar] ~36-~36: There might be a mistake here.
Context: ...to the shared release channel or ticket. 2. Update the Grafana dashboard annotations...

(QB_NEW_EN)


[grammar] ~37-~37: Use correct spacing
Context: ...the build ID and a link to the CI run so that on-call engineers can quickly corre...

(QB_NEW_EN_OTHER_ERROR_IDS_5)


[grammar] ~38-~38: There might be a mistake here.
Context: ...elate spikes with the release candidate. 3. Ensure the Grafana dashboard panels for ...

(QB_NEW_EN)


[grammar] ~39-~39: Use correct spacing
Context: ...d Budgets" and "Runtime Latency" use the Pushgateway/JUnit exports for side-by-si...

(QB_NEW_EN_OTHER_ERROR_IDS_5)


[grammar] ~40-~40: Use correct spacing
Context: ...it exports for side-by-side comparisons. ## 4. Preview gating and release sign-off ...

(QB_NEW_EN_OTHER_ERROR_IDS_5)


[grammar] ~42-~42: Use correct spacing
Context: ...# 4. Preview gating and release sign-off 1. Block the promotion of the preview envir...

(QB_NEW_EN_OTHER_ERROR_IDS_5)


[grammar] ~44-~44: Use correct spacing
Context: ...ronment to staging/production until both performance-related jobs have succeeded ...

(QB_NEW_EN_OTHER_ERROR_IDS_5)


[grammar] ~45-~45: There might be a mistake here.
Context: ...succeeded for the release branch commit. 2. Confirm no active Grafana alerts remain ...

(QB_NEW_EN)


[grammar] ~46-~46: Use correct spacing
Context: ...for the release window. If alerts exist, document mitigations and obtain sign-off...

(QB_NEW_EN_OTHER_ERROR_IDS_5)


[grammar] ~47-~47: There might be a mistake here.
Context: ...he incident commander before proceeding. 3. Log the final decision in the release re...

(QB_NEW_EN)


[grammar] ~48-~48: There might be a mistake here.
Context: ...decision in the release record, linking to: * The CI run with passing performance jobs. ...

(QB_NEW_EN_OTHER)


[grammar] ~49-~49: There might be a mistake here.
Context: ...he CI run with passing performance jobs. * Grafana alert history or dashboards show...

(QB_NEW_EN)


[grammar] ~51-~51: Use correct spacing
Context: ...at justify the decision when applicable. Following this checklist guarantees that...

(QB_NEW_EN_OTHER_ERROR_IDS_5)

🪛 Ruff (0.14.0)
scripts/ci/check_canary_metrics.py

1-1: Shebang is present but file is not executable

(EXE001)


13-13: typing.Dict is deprecated, use dict instead

(UP035)


32-32: Dynamically typed expressions (typing.Any) are disallowed in value

(ANN401)


41-41: Audit URL open for permitted schemes. Allowing use of file: or custom schemes is often unexpected.

(S310)


74-74: Unnecessary mode argument

Remove mode argument

(UP015)


152-152: Boolean-typed positional argument in function definition

(FBT001)

🪛 YAMLlint (1.37.1)
.github/workflows/ci.yml

[error] 147-147: syntax error: could not find expected ':'

(syntax)

🔇 Additional comments (2)
.perf-budget.yml (1)

1-66: Perf budget config looks consistent with the runner

Defaults, throttling, waits, steps, and metric IDs match the runner. Thresholds and aggregations are valid.

Optional:

  • If you want per-scenario throttling overrides, extend ScenarioConfig to include throttling and pass it to applyThrottling.
  • Consider adding FCP to homepage-to-configurator for additional coverage.
docs/release-checklist.md (1)

1-54: Clear, actionable checklist aligned with CI artifacts

Matches job names and artifact paths. No blockers.

Comment on lines +115 to +131
      - name: Set up Node.js
        uses: actions/setup-node@0a44ba78451273a1ed8ac2fee4e347c72dfd377f
        with:
          node-version: '20'
          cache: 'npm'
          cache-dependency-path: ./frontend/package-lock.json

      - name: Install dependencies
        working-directory: ./frontend
        run: npm ci

      - name: Install Playwright browsers
        working-directory: ./frontend
        run: npx playwright install --with-deps chromium

      - name: Start stack
        run: docker compose --env-file .env.development -f docker-compose.dev.yml up -d --build

⚠️ Potential issue | 🔴 Critical

Install Python before using it in perf-budgets job

The job invokes python to seed the backend but never sets up Python. On hosted runners, python may not be available as python (often python3 only), causing a hard failure.

Apply this diff to add setup-python (mirrors earlier jobs):

       - name: Set up Node.js
         uses: actions/setup-node@0a44ba78451273a1ed8ac2fee4e347c72dfd377f
         with:
           node-version: '20'
           cache: 'npm'
           cache-dependency-path: ./frontend/package-lock.json
+
+      - name: Set up Python
+        uses: actions/setup-python@e797f83bcb11b83ae66e0230d6156d7c80228e7c
+        with:
+          python-version: '3.12'
🤖 Prompt for AI Agents
.github/workflows/ci.yml around lines 115 to 131: the perf-budgets job runs
Python to seed the backend but never installs or configures Python, which can
fail on hosted runners where only python3 is available; add a step before any
python invocation that uses actions/setup-python (e.g. actions/setup-python@v4)
with python-version: '3.x' (or a specific 3.11/3.10 as used elsewhere) so the
runner exposes the python binary, and mirror any cache/config from earlier jobs
if needed.

Comment on lines +133 to +140
      - name: Wait for API
        run: |
          for i in {1..60}; do curl -sf http://localhost:8000/healthcheck && break || sleep 2; done

      - name: Wait for Frontend
        run: |
          for i in {1..60}; do curl -sf http://localhost:3000/models/manifest.json && break || sleep 2; done


⚠️ Potential issue | 🔴 Critical

Make readiness loops fail if services never become ready

Current for-loops can exit successfully after the final sleep even if curl never succeeds, letting the job proceed in a bad state.

Apply this diff to both API and Frontend waits:

-      - name: Wait for API
-        run: |
-          for i in {1..60}; do curl -sf http://localhost:8000/healthcheck && break || sleep 2; done
+      - name: Wait for API
+        run: |
+          for i in {1..60}; do
+            if curl -sf http://localhost:8000/healthcheck; then
+              echo "API ready"
+              exit 0
+            fi
+            sleep 2
+          done
+          echo "API failed to become ready in time" >&2
+          exit 1
@@
-      - name: Wait for Frontend
-        run: |
-          for i in {1..60}; do curl -sf http://localhost:3000/models/manifest.json && break || sleep 2; done
+      - name: Wait for Frontend
+        run: |
+          for i in {1..60}; do
+            if curl -sf http://localhost:3000/models/manifest.json; then
+              echo "Frontend ready"
+              exit 0
+            fi
+            sleep 2
+          done
+          echo "Frontend failed to become ready in time" >&2
+          exit 1
🤖 Prompt for AI Agents
In .github/workflows/ci.yml around lines 133 to 140, the readiness for-loops
currently can complete quietly even if curl never succeeded; change each block
so the workflow fails when the service never becomes ready by adding an explicit
failure after the loop (e.g., run the curl check once more and if it still fails
echo an error and exit 1, or set a flag inside the loop and after the loop test
it and exit 1 on failure), applying this to both the API and Frontend wait
steps.

Comment on lines +145 to +185
python - <<'PY'
import json
import os
import urllib.error
import urllib.request

BASE_URL = os.environ.get("BASE_URL", "http://localhost:8000")

def post(path: str, payload: dict) -> dict:
    req = urllib.request.Request(
        f"{BASE_URL}{path}",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    try:
        with urllib.request.urlopen(req, timeout=10) as resp:
            return json.loads(resp.read().decode("utf-8"))
    except urllib.error.HTTPError as exc:
        detail = exc.read().decode("utf-8", "ignore")
        raise SystemExit(f"Seed request failed ({exc.code}): {detail}")

material = post(
    "/api/materials/",
    {"name": "Walnut", "texture_url": None, "cost_per_sq_ft": 12.5},
)
material_id = material.get("id")
if not material_id:
    raise SystemExit("Material creation failed; missing id")

post(
    "/api/modules/",
    {
        "name": "Base600",
        "width": 600.0,
        "height": 720.0,
        "depth": 580.0,
        "base_price": 100.0,
        "material_id": material_id,
    },
)
PY

⚠️ Potential issue | 🔴 Critical

Fix YAML block: indent heredoc body consistently (current YAML parse error)

Lines inside the run: | block after “python - <<'PY'” are not indented, so YAML treats them as new keys. Indent the entire heredoc body consistently to satisfy YAML while preserving the script.

Apply this diff:

       - name: Seed backend for perf
         env:
           BASE_URL: http://localhost:8000
         run: |
-          python - <<'PY'
-import json
-import os
-import urllib.error
-import urllib.request
-
-BASE_URL = os.environ.get("BASE_URL", "http://localhost:8000")
-
-def post(path: str, payload: dict) -> dict:
-    req = urllib.request.Request(
-        f"{BASE_URL}{path}",
-        data=json.dumps(payload).encode("utf-8"),
-        headers={"Content-Type": "application/json"},
-    )
-    try:
-        with urllib.request.urlopen(req, timeout=10) as resp:
-            return json.loads(resp.read().decode("utf-8"))
-    except urllib.error.HTTPError as exc:
-        detail = exc.read().decode("utf-8", "ignore")
-        raise SystemExit(f"Seed request failed ({exc.code}): {detail}")
-
-material = post(
-    "/api/materials/",
-    {"name": "Walnut", "texture_url": None, "cost_per_sq_ft": 12.5},
-)
-material_id = material.get("id")
-if not material_id:
-    raise SystemExit("Material creation failed; missing id")
-
-post(
-    "/api/modules/",
-    {
-        "name": "Base600",
-        "width": 600.0,
-        "height": 720.0,
-        "depth": 580.0,
-        "base_price": 100.0,
-        "material_id": material_id,
-    },
-)
-PY
+          python - <<'PY'
+          import json
+          import os
+          import urllib.error
+          import urllib.request
+
+          BASE_URL = os.environ.get("BASE_URL", "http://localhost:8000")
+
+          def post(path: str, payload: dict) -> dict:
+              req = urllib.request.Request(
+                  f"{BASE_URL}{path}",
+                  data=json.dumps(payload).encode("utf-8"),
+                  headers={"Content-Type": "application/json"},
+              )
+              try:
+                  with urllib.request.urlopen(req, timeout=10) as resp:
+                      return json.loads(resp.read().decode("utf-8"))
+              except urllib.error.HTTPError as exc:
+                  detail = exc.read().decode("utf-8", "ignore")
+                  raise SystemExit(f"Seed request failed ({exc.code}): {detail}")
+
+          material = post(
+              "/api/materials/",
+              {"name": "Walnut", "texture_url": None, "cost_per_sq_ft": 12.5},
+          )
+          material_id = material.get("id")
+          if not material_id:
+              raise SystemExit("Material creation failed; missing id")
+
+          post(
+              "/api/modules/",
+              {
+                  "name": "Base600",
+                  "width": 600.0,
+                  "height": 720.0,
+                  "depth": 580.0,
+                  "base_price": 100.0,
+                  "material_id": material_id,
+              },
+          )
+          PY

Note: YAML will strip the common indent from the block scalar, so the shell sees the heredoc exactly as intended. The current error matches actionlint/YAMLlint output.

🧰 Tools
🪛 actionlint (1.7.8)

146-146: could not parse as YAML: could not find expected ':'

(syntax-check)

🪛 YAMLlint (1.37.1)

[error] 147-147: syntax error: could not find expected ':'

(syntax)

🤖 Prompt for AI Agents
In .github/workflows/ci.yml around lines 145 to 185 the python heredoc started
with "python - <<'PY'" is not indented, causing YAML to treat the script lines
as top-level keys; indent the entire heredoc body (every line between the
"python - <<'PY'" line and the closing "PY") to the same indentation level as
the run block content (e.g., two extra spaces) so YAML parses the block scalar
correctly, and keep the closing "PY" delimiter indented to that same level; do
not change the script content otherwise (YAML will strip common indentation when
passing the heredoc to the shell).

Comment on lines +209 to +223
  canary-metrics:
    runs-on: ubuntu-latest
    needs:
      - perf-budgets
    steps:
      - uses: actions/checkout@08c6903cd8c0fde910a37f88322edcfb5dd907a8

      - name: Evaluate canary metrics
        env:
          CANARY_METRICS_FIXTURE: tests/perf/canary-metrics.fixture.json
          P95_THRESHOLD_MS: '3000'
          ERROR_RATE_THRESHOLD: '0.02'
          REGRESSION_TOLERANCE_PCT: '0.1'
        run: python scripts/ci/check_canary_metrics.py


⚠️ Potential issue | 🔴 Critical

Install Python before running canary metrics script

This job runs python scripts/ci/check_canary_metrics.py without provisioning Python. Add setup-python or invoke python3 only if guaranteed to exist.

Apply this diff:

   canary-metrics:
     runs-on: ubuntu-latest
     needs:
       - perf-budgets
     steps:
       - uses: actions/checkout@08c6903cd8c0fde910a37f88322edcfb5dd907a8
+
+      - name: Set up Python
+        uses: actions/setup-python@e797f83bcb11b83ae66e0230d6156d7c80228e7c
+        with:
+          python-version: '3.12'
@@
-      - name: Evaluate canary metrics
+      - name: Evaluate canary metrics
         env:
           CANARY_METRICS_FIXTURE: tests/perf/canary-metrics.fixture.json
           P95_THRESHOLD_MS: '3000'
           ERROR_RATE_THRESHOLD: '0.02'
           REGRESSION_TOLERANCE_PCT: '0.1'
         run: python scripts/ci/check_canary_metrics.py
🤖 Prompt for AI Agents
In .github/workflows/ci.yml around lines 209 to 223, the canary-metrics job
invokes python scripts/ci/check_canary_metrics.py without ensuring Python is
available; add a step before the run that sets up Python (e.g.
actions/setup-python@v4 with a python-version like '3.x' or a specific '3.11')
so the runner has a known python binary on PATH, then keep the existing run step
(or change the run to invoke python3 explicitly if you prefer not to add the
setup action).

Comment on lines 100 to 466
const rootDir = path.resolve(__dirname, '../../..');
const configPath = path.join(rootDir, '.perf-budget.yml');
const resultsDir = path.join(rootDir, 'perf-results');

function readConfig(): PerfBudgetConfig {
const file = fs.readFileSync(configPath, 'utf-8');
const parsed = yaml.parse(file) as PerfBudgetConfig;
if (!parsed || !Array.isArray(parsed.scenarios)) {
throw new Error('Invalid perf budget configuration: missing scenarios');
}
return parsed;
}

async function applyThrottling(context: BrowserContext, page: Page, throttling?: ThrottlingConfig) {
if (!throttling) {
return;
}
const session = await context.newCDPSession(page);
if (typeof throttling.cpu_slowdown_multiplier === 'number' && throttling.cpu_slowdown_multiplier > 0) {
await session.send('Emulation.setCPUThrottlingRate', { rate: throttling.cpu_slowdown_multiplier });
}
const download = throttling.download_throughput_kbps;
const upload = throttling.upload_throughput_kbps;
const latency = throttling.request_latency_ms;
if (download || upload || latency) {
await session.send('Network.enable');
await session.send('Network.emulateNetworkConditions', {
offline: false,
latency: latency ?? 0,
downloadThroughput: download ? (download * 1024) / 8 : -1,
uploadThroughput: upload ? (upload * 1024) / 8 : -1,
});
}
}

async function setupPerformanceObservers(page: Page) {
await page.addInitScript(() => {
const globalAny = globalThis as any;
globalAny.__perfBudget = {
lcpEntries: [] as PerformanceEntry[],
cls: 0,
tbt: 0,
};

if (globalAny.PerformanceObserver) {
try {
const lcpObserver = new PerformanceObserver((entryList) => {
const entries = entryList.getEntries();
const store = globalAny.__perfBudget;
store.lcpEntries.push(...entries);
});
lcpObserver.observe({ type: 'largest-contentful-paint', buffered: true });
} catch (error) {
console.warn('LCP observer failed', error);
}

try {
const clsObserver = new PerformanceObserver((entryList) => {
const store = globalAny.__perfBudget;
for (const entry of entryList.getEntries() as any[]) {
if (!entry.hadRecentInput) {
store.cls += entry.value;
}
}
});
clsObserver.observe({ type: 'layout-shift', buffered: true });
} catch (error) {
console.warn('CLS observer failed', error);
}

try {
const longTaskObserver = new PerformanceObserver((entryList) => {
const store = globalAny.__perfBudget;
for (const entry of entryList.getEntries()) {
const blockingTime = entry.duration - 50;
if (blockingTime > 0) {
store.tbt += blockingTime;
}
}
});
longTaskObserver.observe({ type: 'longtask' });
} catch (error) {
console.warn('Long task observer failed', error);
}
}
});
}

async function performWait(page: Page, wait: WaitStep) {
if (wait.type === 'selector') {
await page.waitForSelector(wait.selector, { timeout: wait.timeout_ms ?? 30000 });
} else if (wait.type === 'networkidle') {
await page.waitForLoadState('networkidle', { timeout: wait.timeout_ms ?? 60000 });
if (wait.idle_ms && wait.idle_ms > 0) {
await page.waitForTimeout(wait.idle_ms);
}
}
}

async function performScenarioStep(page: Page, step: ScenarioStep) {
if (step.type === 'goto') {
await page.goto(step.url, { waitUntil: step.wait_until ?? 'load', timeout: 60000 });
} else if (step.type === 'wait_for_selector') {
await page.waitForSelector(step.selector, { timeout: step.timeout_ms ?? 30000 });
} else if (step.type === 'wait_for_timeout') {
await page.waitForTimeout(step.timeout_ms);
}
}

async function executeScenarioRun(
scenario: ScenarioConfig,
browser: Browser,
defaults: PerfBudgetConfig['defaults'],
): Promise<RunMetrics> {
const context = await browser.newContext();
const page = await context.newPage();
await setupPerformanceObservers(page);
await applyThrottling(context, page, defaults?.throttling);

if (scenario.steps && scenario.steps.length > 0) {
for (const step of scenario.steps) {
await performScenarioStep(page, step);
}
} else if (scenario.url) {
await page.goto(scenario.url, { waitUntil: 'load', timeout: 60000 });
if (scenario.waits) {
for (const wait of scenario.waits) {
await performWait(page, wait);
}
}
} else {
throw new Error(`Scenario ${scenario.id} missing url or steps`);
}

// Give observers time to flush buffered events
await page.waitForTimeout(1000);

const metrics = await page.evaluate(() => {
const navEntries = performance.getEntriesByType('navigation');
const paints = performance.getEntriesByName('first-contentful-paint');
const globalAny = globalThis as any;
const store = globalAny.__perfBudget || { lcpEntries: [], cls: 0, tbt: 0 };
const lcpEntries = Array.isArray(store.lcpEntries) ? store.lcpEntries : [];
const lastNav = navEntries[navEntries.length - 1] as PerformanceNavigationTiming | undefined;
const fcp = paints.length > 0 ? paints[paints.length - 1].startTime : null;
const lcpEntry = lcpEntries.length > 0 ? lcpEntries[lcpEntries.length - 1] : null;
const lcp = lcpEntry ? (lcpEntry as any).startTime ?? (lcpEntry as any).renderTime ?? (lcpEntry as any).loadTime ?? null : null;
const cls = typeof store.cls === 'number' ? store.cls : null;
const tbt = typeof store.tbt === 'number' ? store.tbt : null;

return {
'navigation-duration': lastNav ? lastNav.duration : null,
'first-contentful-paint': fcp,
'largest-contentful-paint': lcp,
'total-blocking-time': tbt,
'cumulative-layout-shift': cls,
};
});

await context.close();
return metrics;
}

function collectScenarioMetrics(metrics: RunMetrics[]): ScenarioMetrics {
const result: ScenarioMetrics = {};
for (const run of metrics) {
for (const [metricId, value] of Object.entries(run)) {
if (typeof value === 'number' && Number.isFinite(value)) {
if (!result[metricId]) {
result[metricId] = [];
}
result[metricId].push(value);
}
}
}
return result;
}

function percentile(sorted: number[], percentileValue: number): number {
if (sorted.length === 0) {
return NaN;
}
const index = (sorted.length - 1) * percentileValue;
const lower = Math.floor(index);
const upper = Math.ceil(index);
if (lower === upper) {
return sorted[lower];
}
return sorted[lower] + (sorted[upper] - sorted[lower]) * (index - lower);
}

function aggregate(values: number[], aggregation: Aggregation | string): number {
if (values.length === 0) {
return NaN;
}
const sorted = [...values].sort((a, b) => a - b);
switch (aggregation) {
case 'mean':
return values.reduce((sum, value) => sum + value, 0) / values.length;
case 'median':
case 'p50':
return percentile(sorted, 0.5);
case 'p75':
return percentile(sorted, 0.75);
case 'p90':
return percentile(sorted, 0.9);
case 'p95':
return percentile(sorted, 0.95);
default:
throw new Error(`Unsupported aggregation: ${aggregation}`);
}
}

function createJUnitReport(results: ScenarioResult[]): JUnitReport {
const suites: JUnitSuite[] = [];
let totalTests = 0;
let totalFailures = 0;

for (const scenario of results) {
const testcases: JUnitTestCase[] = [];
for (const metric of scenario.metrics) {
const testcase: JUnitTestCase = {
classname: `perf.${scenario.id}`,
name: `${scenario.id} ${metric.id}`,
time: '0',
};
if (!metric.passed) {
testcase.failure = {
message: `Budget exceeded for ${metric.id}`,
details: `Expected ${metric.aggregation} <= ${metric.threshold}${metric.unit}, observed ${metric.value.toFixed(2)}${metric.unit}`,
};
totalFailures += 1;
}
testcases.push(testcase);
totalTests += 1;
}
suites.push({
name: scenario.id,
tests: scenario.metrics.length,
failures: scenario.metrics.filter((metric) => !metric.passed).length,
time: '0',
testcases,
});
}

return {
name: 'perf-budget',
tests: totalTests,
failures: totalFailures,
time: '0',
suites,
};
}

function writeJUnitReport(report: JUnitReport) {
const xmlLines: string[] = [];
xmlLines.push('<?xml version="1.0" encoding="UTF-8"?>');
xmlLines.push(
`<testsuites name="${report.name}" tests="${report.tests}" failures="${report.failures}" time="${report.time}">`,
);
for (const suite of report.suites) {
xmlLines.push(
` <testsuite name="${suite.name}" tests="${suite.tests}" failures="${suite.failures}" time="${suite.time}">`,
);
for (const testcase of suite.testcases) {
xmlLines.push(
` <testcase classname="${testcase.classname}" name="${testcase.name}" time="${testcase.time}">`,
);
if (testcase.failure) {
xmlLines.push(
` <failure message="${escapeXml(testcase.failure.message)}">${escapeXml(testcase.failure.details)}</failure>`,
);
}
xmlLines.push(' </testcase>');
}
xmlLines.push(' </testsuite>');
}
xmlLines.push('</testsuites>');

fs.mkdirSync(resultsDir, { recursive: true });
fs.writeFileSync(path.join(resultsDir, 'perf-budget-junit.xml'), xmlLines.join('\n'), 'utf-8');
}

function escapeXml(value: string): string {
return value
.replace(/&/g, '&amp;')
.replace(/"/g, '&quot;')
.replace(/</g, '&lt;')
.replace(/>/g, '&gt;');
}

function writeSummary(summary: Summary) {
fs.mkdirSync(resultsDir, { recursive: true });
fs.writeFileSync(path.join(resultsDir, 'perf-budget-summary.json'), JSON.stringify(summary, null, 2), 'utf-8');
}

async function run() {
const config = readConfig();
const defaults = config.defaults ?? {};
const browser = await chromium.launch({ headless: true });
const scenarioResults: ScenarioResult[] = [];
let overallPassed = true;

try {
for (const scenario of config.scenarios) {
const runCount = scenario.run_count ?? defaults.run_count ?? 1;
const runMetrics: RunMetrics[] = [];
for (let i = 0; i < runCount; i += 1) {
console.log(`Running scenario ${scenario.id} (${i + 1}/${runCount})`);
const metrics = await executeScenarioRun(scenario, browser, defaults);
runMetrics.push(metrics);
}
const scenarioValues = collectScenarioMetrics(runMetrics);
const aggregatedMetrics: AggregatedMetric[] = [];
for (const metricConfig of scenario.metrics) {
const values = scenarioValues[metricConfig.id];
if (!values || values.length === 0) {
console.warn(`No samples collected for ${scenario.id} metric ${metricConfig.id}`);
continue;
}
const value = aggregate(values, metricConfig.aggregation);
const passed = value <= metricConfig.threshold;
if (!passed) {
overallPassed = false;
}
aggregatedMetrics.push({
id: metricConfig.id,
aggregation: metricConfig.aggregation,
threshold: metricConfig.threshold,
unit: metricConfig.unit,
value,
passed,
samples: values,
});
console.log(
`${scenario.id} ${metricConfig.id} ${metricConfig.aggregation}: ${value.toFixed(2)}${metricConfig.unit} (threshold ${metricConfig.threshold}${metricConfig.unit})`,
);
}
scenarioResults.push({
id: scenario.id,
description: scenario.description,
metrics: aggregatedMetrics,
});
}
} finally {
await browser.close();
}

const summary: Summary = {
scenarios: scenarioResults,
passed: overallPassed,
};
writeSummary(summary);
writeJUnitReport(createJUnitReport(scenarioResults));

if (!overallPassed) {
console.error('Performance budgets failed. See perf-results/perf-budget-summary.json for details.');
process.exitCode = 1;
} else {
console.log('Performance budgets satisfied.');
}
}

run().catch((error) => {
console.error('Failed to execute performance budgets', error);
process.exitCode = 1;
});

🧹 Nitpick | 🔵 Trivial

Solid, deterministic perf runner; consider small improvements

  • Add optional diagnostics to aid failures (console, page errors, recent requests) before evaluating metrics. Playwright 1.56 exposes page.consoleMessages(), page.pageErrors(), page.requests() you can log on failure.
  • Allow per-scenario throttling override (extend ScenarioConfig) to model mixed flows.
  • Minor type nit: MetricConfig.aggregation already covers p75/p90/p95 via Aggregation; the union is redundant.

Based on learnings

🤖 Prompt for AI Agents
In frontend/tests/perf/run-perf-budget.ts around lines 100-466, implement three
small changes: (1) add optional diagnostics on failing scenarios by capturing
and logging page.consoleMessages(), page.pageErrors(), and page.requests() just
before evaluating metrics (or when overallPassed flips false) — collect them
from the page/context and include their recent entries in console.error so
failures include context; (2) support per-scenario throttling by extending the
runner to prefer scenario.throttling over defaults (pass scenario.throttling
into applyThrottling when executing a scenario run, and update types if needed);
(3) remove the redundant union usage for metric aggregation by ensuring
functions accept the Aggregation type (not Aggregation | string) — adjust
aggregate() signature/usage and any type annotations so aggregation is typed as
Aggregation only. Ensure no new runtime exceptions are introduced and keep
logging guarded so diagnostics only run when available.

Comment on lines 130 to 139
    return CanaryMetrics(
        latency_p95_ms=float(latency) if latency is not None else 0.0,
        error_rate=float(error_rate) if error_rate is not None else 0.0,
        trace_latency_p95_ms=trace_latency,
        previous_latency_p95_ms=_float_or_none(previous_latency),
        previous_error_rate=_float_or_none(previous_error),
        build=build_tag,
        previous_build=previous_build,
        generated_at=time.strftime("%Y-%m-%dT%H:%M:%SZ", time.gmtime()),
    )

⚠️ Potential issue | 🔴 Critical

Missing metrics silently pass the check

When Prometheus queries fail or return no data, latency/error_rate stay None, we coerce them to 0.0, and _evaluate happily passes the run. That masks real outages/regressions whenever the scraper is down or a query is misconfigured. Please surface this as a failure (raise, or at least refuse to build a metrics object) instead of defaulting to zeroes so CI actually fails when telemetry is missing.

Apply this change:

-    return CanaryMetrics(
-        latency_p95_ms=float(latency) if latency is not None else 0.0,
-        error_rate=float(error_rate) if error_rate is not None else 0.0,
+    if latency is None or error_rate is None:
+        raise CanaryCheckError("Prometheus queries returned no data for latency/error rate")
+
+    return CanaryMetrics(
+        latency_p95_ms=float(latency),
+        error_rate=float(error_rate),
         trace_latency_p95_ms=trace_latency,
         previous_latency_p95_ms=_float_or_none(previous_latency),
         previous_error_rate=_float_or_none(previous_error),
         build=build_tag,
         previous_build=previous_build,
         generated_at=time.strftime("%Y-%m-%dT%H:%M:%SZ", time.gmtime()),
     )

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 5

📜 Review details

Configuration used: CodeRabbit UI

Review profile: ASSERTIVE

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 9aee543 and 9ba2fc4.

📒 Files selected for processing (1)
  • scripts/ci/check_canary_metrics.py (1 hunks)
🧰 Additional context used
🪛 Ruff (0.14.0)
scripts/ci/check_canary_metrics.py

1-1: Shebang is present but file is not executable

(EXE001)


13-13: typing.Dict is deprecated, use dict instead

(UP035)


32-32: Dynamically typed expressions (typing.Any) are disallowed in value

(ANN401)


41-41: Audit URL open for permitted schemes. Allowing use of file: or custom schemes is often unexpected.

(S310)


74-74: Unnecessary mode argument

Remove mode argument

(UP015)


160-160: Boolean-typed positional argument in function definition

(FBT001)

🔇 Additional comments (4)
scripts/ci/check_canary_metrics.py (4)

16-30: LGTM!

The data structures are well-defined and appropriate for the canary metrics validation use case.


32-71: LGTM!

The helper functions properly handle error cases and return None on failures, which enables the fallback logic. The URL open security warning (S310) is low risk here since URLs come from CI-controlled environment variables.


92-148: LGTM! Previous critical issue resolved.

The function now properly returns None when Prometheus queries fail (lines 105-111), which triggers the fixture fallback in _load_metrics. This addresses the critical issue from previous reviews where missing data would be coerced to 0.0 and incorrectly pass validation.


178-245: LGTM! Previous formatting issue resolved.

The evaluation logic correctly checks metrics against budgets and regression thresholds. The comparison printing (lines 234-242) now properly checks each optional value independently before formatting, resolving the TypeError issue from previous reviews.

@@ -0,0 +1,248 @@
#!/usr/bin/env python3

🧹 Nitpick | 🔵 Trivial

Make the script executable.

The shebang is present but the file lacks execute permissions. Run chmod +x scripts/ci/check_canary_metrics.py to make it executable, which is standard practice for scripts with shebangs.

🧰 Tools
🪛 Ruff (0.14.0)

1-1: Shebang is present but file is not executable

(EXE001)

🤖 Prompt for AI Agents
In scripts/ci/check_canary_metrics.py at line 1 the file has a shebang but is
not executable; fix by setting the execute bit on the file (e.g. run chmod +x
scripts/ci/check_canary_metrics.py) so it can be executed directly in CI and
locally.

import urllib.parse
import urllib.request
from dataclasses import dataclass
from typing import Any, Dict, Optional

🧹 Nitpick | 🔵 Trivial

Use modern type hint syntax.

typing.Dict is deprecated in Python 3.9+. Use the built-in dict type instead.

Apply this diff:

-from typing import Any, Dict, Optional
+from typing import Any, Optional

Then update line 39:

-def _http_get_json(url: str, params: Optional[Dict[str, str]] = None) -> Dict[str, Any]:
+def _http_get_json(url: str, params: Optional[dict[str, str]] = None) -> dict[str, Any]:
🧰 Tools
🪛 Ruff (0.14.0)

13-13: typing.Dict is deprecated, use dict instead

(UP035)

🤖 Prompt for AI Agents
In scripts/ci/check_canary_metrics.py around line 13, replace the deprecated
typing.Dict import by removing Dict from the typing import (e.g., from typing
import Any, Optional). Then on line 39 update any type annotation using
Dict[...] to use the built-in generic syntax dict[...] (for example Dict[str,
Any] -> dict[str, Any]) ensuring you keep Any/Optional from typing as needed.

Comment on lines 150 to 158
def _load_metrics() -> CanaryMetrics:
    metrics = _collect_metrics_from_services()
    if metrics:
        return metrics

    fixture_path = os.environ.get("CANARY_METRICS_FIXTURE", "tests/perf/canary-metrics.fixture.json")
    print(f"Using canary metrics fixture: {fixture_path}")
    return _load_fixture(fixture_path)


⚠️ Potential issue | 🟠 Major

Add error handling for missing fixture file.

If the fixture file doesn't exist, _load_fixture raises FileNotFoundError with an unclear message. Add a try-except to provide a helpful error message indicating both the service connection failure and the missing fixture.

Apply this diff:

 def _load_metrics() -> CanaryMetrics:
     metrics = _collect_metrics_from_services()
     if metrics:
         return metrics
 
     fixture_path = os.environ.get("CANARY_METRICS_FIXTURE", "tests/perf/canary-metrics.fixture.json")
     print(f"Using canary metrics fixture: {fixture_path}")
-    return _load_fixture(fixture_path)
+    try:
+        return _load_fixture(fixture_path)
+    except FileNotFoundError:
+        raise CanaryCheckError(
+            f"Cannot load metrics: Prometheus queries failed and fixture not found at {fixture_path}"
+        )
🤖 Prompt for AI Agents
In scripts/ci/check_canary_metrics.py around lines 150 to 158, _load_metrics
falls back to loading a fixture but doesn't handle the case where the fixture
file is missing; wrap the call to _load_fixture(fixture_path) in a try/except
that catches FileNotFoundError and re-raises a clearer error (or exit) that
states the service metrics collection failed and the fixture file at
fixture_path was not found, including the fixture_path in the message and an
actionable hint to create or point CANARY_METRICS_FIXTURE to a valid file.

Comment on lines 160 to 176
def _write_summary(metrics: CanaryMetrics, passed: bool) -> None:
    summary_path = os.path.join("perf-results", "canary-metrics-summary.json")
    os.makedirs(os.path.dirname(summary_path), exist_ok=True)
    payload = {
        "latency_p95_ms": metrics.latency_p95_ms,
        "error_rate": metrics.error_rate,
        "trace_latency_p95_ms": metrics.trace_latency_p95_ms,
        "previous_latency_p95_ms": metrics.previous_latency_p95_ms,
        "previous_error_rate": metrics.previous_error_rate,
        "build": metrics.build,
        "previous_build": metrics.previous_build,
        "generated_at": metrics.generated_at,
        "passed": passed,
    }
    with open(summary_path, "w", encoding="utf-8") as handle:
        json.dump(payload, handle, indent=2)


🧹 Nitpick | 🔵 Trivial

Consider using a keyword argument for the boolean parameter.

The passed parameter is a boolean positional argument, which can reduce readability at call sites. Consider making it keyword-only or using an enum for clarity.

Apply this diff to make it keyword-only:

-def _write_summary(metrics: CanaryMetrics, passed: bool) -> None:
+def _write_summary(metrics: CanaryMetrics, *, passed: bool) -> None:
🧰 Tools
🪛 Ruff (0.14.0)

160-160: Boolean-typed positional argument in function definition

(FBT001)

🤖 Prompt for AI Agents
In scripts/ci/check_canary_metrics.py around lines 160 to 176, the boolean
parameter passed is currently positional which harms readability at call sites;
change the function signature to make passed a keyword-only argument (e.g., def
_write_summary(metrics: CanaryMetrics, *, passed: bool) -> None) and update
every call site to call _write_summary(..., passed=some_bool). Ensure
imports/types remain valid and run tests/lint to catch any missed call sites.


coderabbitai bot commented Oct 18, 2025

Note

Docstrings generation - SUCCESS
Generated docstrings for this pull request at #282

coderabbitai bot added a commit that referenced this pull request Oct 18, 2025
…olds`

Docstrings generation was requested by @shayancoin.

* #123 (comment)

The following files were modified:

* `frontend/tests/perf/run-perf-budget.ts`
* `scripts/ci/check_canary_metrics.py`
…olds` (#282)

Docstrings generation was requested by @shayancoin.

* #123 (comment)

The following files were modified:

* `frontend/tests/perf/run-perf-budget.ts`
* `scripts/ci/check_canary_metrics.py`

Co-authored-by: coderabbitai[bot] <136622811+coderabbitai[bot]@users.noreply.github.com>
@shayancoin shayancoin closed this Oct 20, 2025
