Skip to content

Add Playwright perf budgets and canary telemetry gating#126

Merged
shayancoin merged 3 commits intomainfrom
codex/extend-.perf-budget.yml-with-navigation-thresholds
Oct 18, 2025
Merged

Add Playwright perf budgets and canary telemetry gating#126
shayancoin merged 3 commits intomainfrom
codex/extend-.perf-budget.yml-with-navigation-thresholds

Conversation

@shayancoin
Copy link
Owner

@shayancoin shayancoin commented Oct 16, 2025

Summary

  • add Playwright-based performance budget coverage for the configurator and homepage navigation, exporting metrics as JUnit artifacts
  • extend the perf-light workflow with a new Playwright budget job and canary Prometheus/Tempo regression checks
  • document the release checklist that ties CI metrics, Grafana alerts, and on-call sign-off together

Testing

  • npm run lint

https://chatgpt.com/codex/tasks/task_e_68f1362524008330adf8b11006937151

Summary by CodeRabbit

  • New Features

    • Added automated performance budgeting and canary metrics monitoring system with workflow validation.
  • Documentation

    • Added Release Checklist guide to documentation.
  • Chores

    • Added performance testing dependencies and internal metrics conversion tools.

@coderabbitai
Copy link
Contributor

coderabbitai bot commented Oct 16, 2025

Caution

Review failed

The pull request is closed.

Note

Other AI code review bot(s) detected

CodeRabbit has detected other AI code review bot(s) in this pull request and will avoid duplicating their findings in the review comments. This may lead to a less comprehensive review.

Walkthrough

This pull request introduces comprehensive performance budgeting and monitoring infrastructure. It adds GitHub Actions workflow jobs for Playwright-based performance testing and canary metrics validation, a new performance test suite, Python and Node.js tooling scripts for metrics processing and budget checking, a v2 performance budget configuration, and corresponding documentation updates.

Changes

Cohort / File(s) Summary
GitHub Actions Workflow
.github/workflows/perf-light.yml
Adds two new jobs: playwright-budgets (builds stack, runs Playwright performance tests, converts metrics to JUnit, uploads artifacts) and canary-metrics (depends on playwright-budgets and k6, compares metrics against budgets using Prometheus).
Performance Configuration
perf-budget.yml
Upgrades version from 1 to 2. Replaces pages array with scenarios-based structure. Introduces configurator-load and homepage-to-configurator scenarios with configurable steps, metrics (navigation-duration, largest-contentful-paint), and threshold definitions including percentile-based aggregations (p90, p95).
Frontend Performance Tests
frontend/tests/perf/budget.spec.ts
New Playwright test file that loads perf-budget.yml, validates scenarios, runs serial performance tests, collects metrics via performance observers, aggregates results per scenario (mean, median, min, max, percentile-based), compares against thresholds, and outputs JSON results to artifacts/perf.
Frontend Dependencies
frontend/package.json
Adds js-yaml devDependency.
Performance Tooling Scripts
tools/perf/check-canary-metrics.py
New Python script for canary monitoring. Reads environment config, queries Prometheus for P95 latency and 5xx error rate, optionally compares against previous build baseline, supports Tempo integration for trace regression checks, and outputs JUnit XML report.
Metrics Conversion
tools/perf/metrics-to-junit.mjs
New Node.js script converting JSON performance results to JUnit XML. Aggregates scenario results, flags failures with PerformanceBudgetExceeded type, includes metrics in system-out logs and properties.
Documentation
docs/index.md, mkdocs.yml
Adds "Release Checklist" link under Development Guides section (duplicate entry in mkdocs.yml).

Sequence Diagram(s)

sequenceDiagram
    participant GHA as GitHub Actions
    participant PB as playwright-budgets Job
    participant CM as canary-metrics Job
    participant Prom as Prometheus
    participant Tempo as Tempo (optional)
    
    GHA->>PB: Trigger
    activate PB
    PB->>PB: Checkout, setup Node.js, install deps
    PB->>PB: Build & start stack
    PB->>PB: Run Playwright perf tests<br/>(load perf-budget.yml, measure scenarios)
    PB->>PB: Convert metrics to JUnit
    PB->>PB: Upload artifacts
    PB->>PB: Shutdown stack
    PB-->>GHA: Complete
    deactivate PB
    
    GHA->>CM: Trigger (depends on playwright-budgets)
    activate CM
    CM->>Prom: Query P95 latency & 5xx rate
    activate Prom
    Prom-->>CM: Return metrics
    deactivate Prom
    
    alt Tempo enabled
    CM->>Tempo: Optional trace regression check
    activate Tempo
    Tempo-->>CM: Trace results
    deactivate Tempo
    end
    
    CM->>CM: Compare canary metrics vs budgets
    CM->>CM: Generate JUnit report
    CM-->>GHA: Complete (pass/fail)
    deactivate CM
Loading

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

The PR introduces multiple new components across different languages and technologies: a comprehensive Playwright performance test suite with metric collection and aggregation logic, two standalone utility scripts (Python for canary monitoring and Node.js for metrics conversion), workflow orchestration, and configuration. While individual components are relatively focused, the breadth of new code, integration points (Prometheus, Tempo, JUnit), and performance-critical logic require careful review of metric computation, threshold comparisons, and CI/CD flow correctness.

Possibly related PRs

Poem

🐰 Metrics hop through our CI, bright and keen,
Playwright budgets measure what's unseen,
Prometheus whispers performance trends,
Canary songs show where speed descends,
Performance checked from tail to ear,
Our stack runs swift throughout the year! 🚀

Pre-merge checks and finishing touches

❌ Failed checks (2 warnings)
Check name Status Explanation Resolution
Description Check ⚠️ Warning The pull request description fails to follow the required template since it lacks the mandatory "PR Type" and "Short Description" sections and does not explicitly list tests under a "Tests Added" heading. Although a summary and a testing command are provided, the structure does not align with the repository’s description template. As a result, essential information is missing or misaligned with expectations. Please update the description to include the "PR Type", provide a concise "Short Description", and add a "Tests Added" section detailing any new or modified tests in accordance with the repository template.
Docstring Coverage ⚠️ Warning Docstring coverage is 0.00% which is insufficient. The required threshold is 80.00%. You can run @coderabbitai generate docstrings to improve docstring coverage.
✅ Passed checks (1 passed)
Check name Status Explanation
Title Check ✅ Passed The pull request title clearly summarizes the primary change by stating the addition of Playwright performance budgets and canary telemetry gating, which directly reflects the main features introduced in the workflow and scripts. It is concise, specific to the changeset, and avoids vague or generic language. This phrasing will help reviewers understand the core enhancement at a glance.

📜 Recent review details

Configuration used: CodeRabbit UI

Review profile: ASSERTIVE

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 31486a9 and 238a3d4.

📒 Files selected for processing (3)
  • frontend/package.json (1 hunks)
  • mkdocs.yml (1 hunks)
  • perf-budget.yml (1 hunks)

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link

@chatgpt-codex-connector chatgpt-codex-connector bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting

Comment on lines +121 to +125
if PREVIOUS_BUILD and previous_value is not None:
message += f', previous {previous_value:.3f}{unit}'
allowed_regression = previous_value * (1 + regression_pct)
if current_value > allowed_regression:
passed = False

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1 Badge Skip regression comparison when previous value is zero

Regression limits are computed as previous_value * (1 + regression_pct). When the baseline value is 0—common for error rates—the allowed regression remains 0, so any non‑zero current value fails even if it is well below the absolute threshold (e.g. 0.1% errors vs a 1% limit). This will cause the workflow to fail whenever a metric moves from exactly zero to a small value. Guard the regression check when the previous value is near zero (or use the absolute threshold in that case) to avoid false failures.

Useful? React with 👍 / 👎.

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

📜 Review details

Configuration used: CodeRabbit UI

Review profile: ASSERTIVE

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between d2a35c4 and 4bb3922.

⛔ Files ignored due to path filters (1)
  • frontend/package-lock.json is excluded by !**/package-lock.json
📒 Files selected for processing (9)
  • .github/workflows/perf-light.yml (2 hunks)
  • docs/index.md (1 hunks)
  • docs/release-checklist.md (1 hunks)
  • frontend/package.json (3 hunks)
  • frontend/tests/perf/budget.spec.ts (1 hunks)
  • mkdocs.yml (1 hunks)
  • perf-budget.yml (2 hunks)
  • tools/perf/check-canary-metrics.py (1 hunks)
  • tools/perf/metrics-to-junit.mjs (1 hunks)
🧰 Additional context used
🧬 Code graph analysis (2)
perf-budget.yml (2)
frontend/src/app/components/configurator-panel.tsx (1)
  • ConfiguratorPanel (7-65)
frontend/src/app/configurator/page.tsx (1)
  • ConfiguratorPage (5-18)
tools/perf/check-canary-metrics.py (2)
tests/perf/k6-quote-cnc.js (1)
  • file (107-107)
tools/perf/metrics-to-junit.mjs (3)
  • name (49-49)
  • base (62-62)
  • xml (92-92)
🪛 actionlint (1.7.8)
.github/workflows/perf-light.yml

52-52: could not parse as YAML: could not find expected ':'

(syntax-check)

🪛 GitHub Actions: CI
tools/perf/check-canary-metrics.py

[error] 122-123: Type-check: Argument type mismatch for query_prometheus; expected str, got str | None.

.github/workflows/perf-light.yml

[error] 52-53: check-yaml: Could not parse YAML file due to missing ':' in perf-light.yml.

🪛 LanguageTool
docs/release-checklist.md

[grammar] ~1-~1: Use correct spacing
Context: # Release Checklist The release process ties together CI per...

(QB_NEW_EN_OTHER_ERROR_IDS_5)


[grammar] ~3-~3: Use correct spacing
Context: ...branch or promote a build to production. ## 1. Verify CI performance gates 1. Confi...

(QB_NEW_EN_OTHER_ERROR_IDS_5)


[grammar] ~5-~5: There might be a mistake here.
Context: ...tion. ## 1. Verify CI performance gates 1. Confirm the perf-light workflow is gre...

(QB_NEW_EN_OTHER)


[grammar] ~7-~7: There might be a mistake here.
Context: ...t` workflow is green. This workflow now publishes: - Playwright-based performance budgets sou...

(QB_NEW_EN_OTHER)


[grammar] ~8-~8: There might be a mistake here.
Context: ...budgets sourced from .perf-budget.yml. - k6 smoke test summaries for the configur...

(QB_NEW_EN)


[grammar] ~11-~11: There might be a mistake here.
Context: ...rometheus/Tempo. 2. Review the uploaded artifacts: - perf-budgets/perf-metrics.junit.xml for build-time performance metrics. -...

(QB_NEW_EN_OTHER)


[grammar] ~12-~12: There might be a problem here.
Context: ...xmlfor build-time performance metrics. -perf-budgets/frontend/playwright-report/results.xmlfor Playwright run details. -canary...

(QB_NEW_EN_MERGED_MATCH)


[grammar] ~13-~13: There might be a problem here.
Context: ...results.xmlfor Playwright run details. -canary-metrics/canary-metrics.junit.xml` for Prometheus/Tempo comparisons. 3. Ens...

(QB_NEW_EN_MERGED_MATCH)


[grammar] ~14-~14: There might be a mistake here.
Context: ...t.xml` for Prometheus/Tempo comparisons. 3. Ensure no performance regression comment...

(QB_NEW_EN)


[grammar] ~15-~15: Use correct spacing
Context: ...ilures immediately; they block releases. ## 2. Check Grafana and alerting state 1. ...

(QB_NEW_EN_OTHER_ERROR_IDS_5)


[grammar] ~17-~17: There might be a mistake here.
Context: ... ## 2. Check Grafana and alerting state 1. Open the "Configurator Release" Grafana ...

(QB_NEW_EN_OTHER)


[grammar] ~20-~20: There might be a mistake here.
Context: ...ledge or resolve any active alerts tied to: - P95 homepage → configurator navigation l...

(QB_NEW_EN_OTHER)


[grammar] ~21-~21: There might be a mistake here.
Context: ... navigation latency breaching 3 seconds. - Configurator LCP exceeding 4 seconds. ...

(QB_NEW_EN)


[grammar] ~24-~24: Use correct spacing
Context: ...t acknowledgements in the release notes. ## 3. Validate preview environments 1. Ens...

(QB_NEW_EN_OTHER_ERROR_IDS_5)


[grammar] ~26-~26: There might be a mistake here.
Context: ...es. ## 3. Validate preview environments 1. Ensure preview environments (staging/QA)...

(QB_NEW_EN_OTHER)


[grammar] ~31-~31: Use correct spacing
Context: ...ember signs off on the remediation plan. ## 4. Final release sign-off 1. Update the...

(QB_NEW_EN_OTHER_ERROR_IDS_5)


[grammar] ~33-~33: Use correct spacing
Context: ...tion plan. ## 4. Final release sign-off 1. Update the release ticket with links to:...

(QB_NEW_EN_OTHER_ERROR_IDS_5)


[grammar] ~40-~40: Use correct spacing
Context: ...n-call leads before promoting the build. Keeping performance, telemetry, and on-c...

(QB_NEW_EN_OTHER_ERROR_IDS_5)

🪛 Ruff (0.14.0)
tools/perf/check-canary-metrics.py

62-62: Audit URL open for permitted schemes. Allowing use of file: or custom schemes is often unexpected.

(S310)


65-65: Within an except clause, raise exceptions with raise ... from err or raise ... from None to distinguish them from errors in exception handling

(B904)


65-65: Avoid specifying long messages outside the exception class

(TRY003)


68-68: Avoid specifying long messages outside the exception class

(TRY003)


76-76: Avoid specifying long messages outside the exception class

(TRY003)


116-116: Boolean positional value in function call

(FBT003)


178-178: Audit URL open for permitted schemes. Allowing use of file: or custom schemes is often unexpected.

(S310)


181-181: Within an except clause, raise exceptions with raise ... from err or raise ... from None to distinguish them from errors in exception handling

(B904)


181-181: Avoid specifying long messages outside the exception class

(TRY003)

🪛 YAMLlint (1.37.1)
.github/workflows/perf-light.yml

[error] 53-53: syntax error: could not find expected ':'

(syntax)

Comment on lines +51 to +91
python - <<'PY'
import json
import os
import urllib.error
import urllib.request

BASE_URL = os.environ.get("BASE_URL", "http://localhost:8000")

def post(path: str, payload: dict) -> dict:
req = urllib.request.Request(
f"{BASE_URL}{path}",
data=json.dumps(payload).encode("utf-8"),
headers={"Content-Type": "application/json"},
)
try:
with urllib.request.urlopen(req, timeout=10) as resp:
return json.loads(resp.read().decode("utf-8"))
except urllib.error.HTTPError as exc:
detail = exc.read().decode("utf-8", "ignore")
raise SystemExit(f"Seed request failed ({exc.code}): {detail}")

material = post(
"/api/materials/",
{"name": "Walnut", "texture_url": None, "cost_per_sq_ft": 12.5},
)
material_id = material.get("id")
if not material_id:
raise SystemExit("Material creation failed; missing id")

post(
"/api/modules/",
{
"name": "Base600",
"width": 600.0,
"height": 720.0,
"depth": 580.0,
"base_price": 100.0,
"material_id": material_id,
},
)
PY
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🔴 Critical

Fix heredoc indentation: workflow YAML is currently invalid

The new inline Python block is flush-left, so the YAML scalar breaks parsing (actionlint/check-yaml fail on Line 52: “could not find expected ':'”). CI jobs cannot run until the block contents are indented at least as much as the python - <<'PY' line.

-          python - <<'PY'
-import json
-import os
-import urllib.error
-import urllib.request
-
-BASE_URL = os.environ.get("BASE_URL", "http://localhost:8000")
-
-def post(path: str, payload: dict) -> dict:
-    req = urllib.request.Request(
-        f"{BASE_URL}{path}",
-        data=json.dumps(payload).encode("utf-8"),
-        headers={"Content-Type": "application/json"},
-    )
-    try:
-        with urllib.request.urlopen(req, timeout=10) as resp:
-            return json.loads(resp.read().decode("utf-8"))
-    except urllib.error.HTTPError as exc:
-        detail = exc.read().decode("utf-8", "ignore")
-        raise SystemExit(f"Seed request failed ({exc.code}): {detail}")
-
-material = post(
-    "/api/materials/",
-    {"name": "Walnut", "texture_url": None, "cost_per_sq_ft": 12.5},
-)
-material_id = material.get("id")
-if not material_id:
-    raise SystemExit("Material creation failed; missing id")
-
-post(
-    "/api/modules/",
-    {
-        "name": "Base600",
-        "width": 600.0,
-        "height": 720.0,
-        "depth": 580.0,
-        "base_price": 100.0,
-        "material_id": material_id,
-    },
-)
-PY
+          python - <<'PY'
+          import json
+          import os
+          import urllib.error
+          import urllib.request
+
+          BASE_URL = os.environ.get("BASE_URL", "http://localhost:8000")
+
+          def post(path: str, payload: dict) -> dict:
+              req = urllib.request.Request(
+                  f"{BASE_URL}{path}",
+                  data=json.dumps(payload).encode("utf-8"),
+                  headers={"Content-Type": "application/json"},
+              )
+              try:
+                  with urllib.request.urlopen(req, timeout=10) as resp:
+                      return json.loads(resp.read().decode("utf-8"))
+              except urllib.error.HTTPError as exc:
+                  detail = exc.read().decode("utf-8", "ignore")
+                  raise SystemExit(f"Seed request failed ({exc.code}): {detail}")
+
+          material = post(
+              "/api/materials/",
+              {"name": "Walnut", "texture_url": None, "cost_per_sq_ft": 12.5},
+          )
+          material_id = material.get("id")
+          if not material_id:
+              raise SystemExit("Material creation failed; missing id")
+
+          post(
+              "/api/modules/",
+              {
+                  "name": "Base600",
+                  "width": 600.0,
+                  "height": 720.0,
+                  "depth": 580.0,
+                  "base_price": 100.0,
+                  "material_id": material_id,
+              },
+          )
+          PY

This preserves the zero-indent Python source once YAML strips the common leading spaces, while restoring a valid workflow file.

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
python - <<'PY'
import json
import os
import urllib.error
import urllib.request
BASE_URL = os.environ.get("BASE_URL", "http://localhost:8000")
def post(path: str, payload: dict) -> dict:
req = urllib.request.Request(
f"{BASE_URL}{path}",
data=json.dumps(payload).encode("utf-8"),
headers={"Content-Type": "application/json"},
)
try:
with urllib.request.urlopen(req, timeout=10) as resp:
return json.loads(resp.read().decode("utf-8"))
except urllib.error.HTTPError as exc:
detail = exc.read().decode("utf-8", "ignore")
raise SystemExit(f"Seed request failed ({exc.code}): {detail}")
material = post(
"/api/materials/",
{"name": "Walnut", "texture_url": None, "cost_per_sq_ft": 12.5},
)
material_id = material.get("id")
if not material_id:
raise SystemExit("Material creation failed; missing id")
post(
"/api/modules/",
{
"name": "Base600",
"width": 600.0,
"height": 720.0,
"depth": 580.0,
"base_price": 100.0,
"material_id": material_id,
},
)
PY
python - <<'PY'
import json
import os
import urllib.error
import urllib.request
BASE_URL = os.environ.get("BASE_URL", "http://localhost:8000")
def post(path: str, payload: dict) -> dict:
req = urllib.request.Request(
f"{BASE_URL}{path}",
data=json.dumps(payload).encode("utf-8"),
headers={"Content-Type": "application/json"},
)
try:
with urllib.request.urlopen(req, timeout=10) as resp:
return json.loads(resp.read().decode("utf-8"))
except urllib.error.HTTPError as exc:
detail = exc.read().decode("utf-8", "ignore")
raise SystemExit(f"Seed request failed ({exc.code}): {detail}")
material = post(
"/api/materials/",
{"name": "Walnut", "texture_url": None, "cost_per_sq_ft": 12.5},
)
material_id = material.get("id")
if not material_id:
raise SystemExit("Material creation failed; missing id")
post(
"/api/modules/",
{
"name": "Base600",
"width": 600.0,
"height": 720.0,
"depth": 580.0,
"base_price": 100.0,
"material_id": material_id,
},
)
PY
🧰 Tools
🪛 actionlint (1.7.8)

52-52: could not parse as YAML: could not find expected ':'

(syntax-check)

🪛 GitHub Actions: CI

[error] 52-53: check-yaml: Could not parse YAML file due to missing ':' in perf-light.yml.

🪛 YAMLlint (1.37.1)

[error] 53-53: syntax error: could not find expected ':'

(syntax)

🤖 Prompt for AI Agents
.github/workflows/perf-light.yml around lines 51 to 91: the inline heredoc
python block is not indented to match the "python - <<'PY'" line which breaks
YAML parsing; indent every line of the embedded Python (including the closing PY
delimiter) by the same amount of whitespace as the "python - <<'PY'" line so the
YAML scalar is valid and the Python source retains its zero-relative indentation
after YAML unindentation.

Comment on lines +108 to +112
current_value = query_prometheus(PROMETHEUS_URL, current_query)
previous_value = None
if PREVIOUS_BUILD:
previous_query = render_query(template, PREVIOUS_BUILD)
previous_value = query_prometheus(PROMETHEUS_URL, previous_query)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

Make PROMETHEUS_URL non-optional before calling query_prometheus.

PROMETHEUS_URL is declared as an Optional[str], so the type-checker (and Ruff, per the pipeline) sees a str | None argument flowing into query_prometheus. Even though the early guard at Line 48 exits when the env var is missing, the type system cannot infer that, causing the CI failure. Force the variable to str (e.g., via typing.cast, or by rebinding after the guard) before passing it to query_prometheus.

🤖 Prompt for AI Agents
In tools/perf/check-canary-metrics.py around lines 108 to 112, PROMETHEUS_URL is
typed Optional[str] but is passed directly to query_prometheus causing
type-check failures; rebind or cast PROMETHEUS_URL to str after the early guard
(e.g., url: str = PROMETHEUS_URL or use typing.cast(str, PROMETHEUS_URL)) so the
variable passed into query_prometheus is a non-optional str, then use that url
variable in the current_query and previous_query calls.

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

♻️ Duplicate comments (1)
tools/perf/check-canary-metrics.py (1)

108-112: Make PROMETHEUS_URL non-optional before calling query_prometheus.

This is the same type incompatibility issue flagged in the pipeline and previous review. PROMETHEUS_URL is declared as Optional[str], so passing it to query_prometheus (which expects str) causes type-checker failures. Although the early guard at line 48 ensures it's non-None, the type system cannot infer this.

Force PROMETHEUS_URL to str after the guard by rebinding or using typing.cast:

+from typing import Optional, cast
+
 # ... after line 49, add:
+PROMETHEUS_URL = cast(str, PROMETHEUS_URL)
 
 # ... then lines 108-112 work correctly

Or rebind after the guard:

 if not PROMETHEUS_URL:
     exit_success('PROMETHEUS_URL not provided; skipping canary metric comparison.')
+else:
+    PROMETHEUS_URL_STR: str = PROMETHEUS_URL
 
 # ... then use PROMETHEUS_URL_STR in evaluate_metric
📜 Review details

Configuration used: CodeRabbit UI

Review profile: ASSERTIVE

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 4bb3922 and 31486a9.

📒 Files selected for processing (1)
  • tools/perf/check-canary-metrics.py (1 hunks)
🧰 Additional context used
🧬 Code graph analysis (1)
tools/perf/check-canary-metrics.py (1)
tools/perf/metrics-to-junit.mjs (3)
  • name (49-49)
  • base (62-62)
  • xml (92-92)
🪛 GitHub Actions: CI
tools/perf/check-canary-metrics.py

[error] 122-123: Type hints: incompatible types for query_prometheus/render_query arguments.

🪛 Ruff (0.14.0)
tools/perf/check-canary-metrics.py

62-62: Audit URL open for permitted schemes. Allowing use of file: or custom schemes is often unexpected.

(S310)


65-65: Within an except clause, raise exceptions with raise ... from err or raise ... from None to distinguish them from errors in exception handling

(B904)


65-65: Avoid specifying long messages outside the exception class

(TRY003)


68-68: Avoid specifying long messages outside the exception class

(TRY003)


76-76: Avoid specifying long messages outside the exception class

(TRY003)


116-116: Boolean positional value in function call

(FBT003)


182-182: Audit URL open for permitted schemes. Allowing use of file: or custom schemes is often unexpected.

(S310)


185-185: Within an except clause, raise exceptions with raise ... from err or raise ... from None to distinguish them from errors in exception handling

(B904)


185-185: Avoid specifying long messages outside the exception class

(TRY003)

🔇 Additional comments (2)
tools/perf/check-canary-metrics.py (2)

123-130: LGTM! Near-zero baseline handling prevents false positives.

The logic correctly addresses the concern raised in the previous review about regression checks failing when the baseline is zero. By checking if previous_value is near zero (line 123) and skipping the regression check in that case (line 125), you avoid false failures when a metric moves from exactly zero to a small non-zero value.

Based on the previous review comment about zero baseline regression.


239-288: LGTM! JUnit XML generation is correctly structured.

The JUnit XML generation properly follows the JUnit schema with:

  • Correct XML escaping (lines 244-251)
  • Appropriate structure with testsuites/testsuite/testcase hierarchy
  • Failure messages for budget violations
  • System output with metric details in CDATA
  • Consistent classname ("canary.metrics") matching the companion script's approach

The structure aligns with the expected format shown in the relevant code snippets from metrics-to-junit.mjs.

Comment on lines +27 to +35
P95_QUERY_TEMPLATE = os.getenv(
'CANARY_P95_QUERY',
'histogram_quantile(0.95, sum(rate(http_request_duration_seconds_bucket{service="' + CANARY_SERVICE + '",deployment="canary",build="$BUILD"}[5m])) by (le)) * 1000',
)
ERROR_QUERY_TEMPLATE = os.getenv(
'CANARY_ERROR_RATE_QUERY',
'sum(rate(http_requests_total{service="' + CANARY_SERVICE + '",status=~"5..",deployment="canary",build="$BUILD"}[5m])) '
'/ sum(rate(http_requests_total{service="' + CANARY_SERVICE + '",deployment="canary",build="$BUILD"}[5m]))',
)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🧹 Nitpick | 🔵 Trivial

Consider using f-strings or .format() for query template construction.

The current approach of concatenating strings with + in the default values for P95_QUERY_TEMPLATE and ERROR_QUERY_TEMPLATE works but reduces readability. Using f-strings or .format() would make the template structure clearer.

For example:

 P95_QUERY_TEMPLATE = os.getenv(
     'CANARY_P95_QUERY',
-    'histogram_quantile(0.95, sum(rate(http_request_duration_seconds_bucket{service="' + CANARY_SERVICE + '",deployment="canary",build="$BUILD"}[5m])) by (le)) * 1000',
+    f'histogram_quantile(0.95, sum(rate(http_request_duration_seconds_bucket{{service="{CANARY_SERVICE}",deployment="canary",build="$BUILD"}}[5m])) by (le)) * 1000',
 )
 ERROR_QUERY_TEMPLATE = os.getenv(
     'CANARY_ERROR_RATE_QUERY',
-    'sum(rate(http_requests_total{service="' + CANARY_SERVICE + '",status=~"5..",deployment="canary",build="$BUILD"}[5m])) '
-    '/ sum(rate(http_requests_total{service="' + CANARY_SERVICE + '",deployment="canary",build="$BUILD"}[5m]))',
+    f'sum(rate(http_requests_total{{service="{CANARY_SERVICE}",status=~"5..",deployment="canary",build="$BUILD"}}[5m])) '
+    f'/ sum(rate(http_requests_total{{service="{CANARY_SERVICE}",deployment="canary",build="$BUILD"}}[5m]))',
 )

Note: Double braces {{ and }} are used to escape literal braces in f-strings.

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
P95_QUERY_TEMPLATE = os.getenv(
'CANARY_P95_QUERY',
'histogram_quantile(0.95, sum(rate(http_request_duration_seconds_bucket{service="' + CANARY_SERVICE + '",deployment="canary",build="$BUILD"}[5m])) by (le)) * 1000',
)
ERROR_QUERY_TEMPLATE = os.getenv(
'CANARY_ERROR_RATE_QUERY',
'sum(rate(http_requests_total{service="' + CANARY_SERVICE + '",status=~"5..",deployment="canary",build="$BUILD"}[5m])) '
'/ sum(rate(http_requests_total{service="' + CANARY_SERVICE + '",deployment="canary",build="$BUILD"}[5m]))',
)
P95_QUERY_TEMPLATE = os.getenv(
'CANARY_P95_QUERY',
f'histogram_quantile(0.95, sum(rate(http_request_duration_seconds_bucket{{service="{CANARY_SERVICE}",deployment="canary",build="$BUILD"}}[5m])) by (le)) * 1000',
)
ERROR_QUERY_TEMPLATE = os.getenv(
'CANARY_ERROR_RATE_QUERY',
f'sum(rate(http_requests_total{{service="{CANARY_SERVICE}",status=~"5..",deployment="canary",build="$BUILD"}}[5m])) '
f'/ sum(rate(http_requests_total{{service="{CANARY_SERVICE}",deployment="canary",build="$BUILD"}}[5m]))',
)
🤖 Prompt for AI Agents
In tools/perf/check-canary-metrics.py around lines 27 to 35, the default
Prometheus query strings are built by concatenating pieces with '+' which is
hard to read; replace these concatenations with f-strings or str.format() to
construct P95_QUERY_TEMPLATE and ERROR_QUERY_TEMPLATE more clearly (remember to
escape literal braces as '{{' and '}}' when using f-strings) and keep the
environment variable fallback behavior the same so the default query text
remains identical to current value; apply the same change to both
P95_QUERY_TEMPLATE and ERROR_QUERY_TEMPLATE.

Comment on lines +58 to +77
def query_prometheus(url: str, query: str) -> Optional[float]:
encoded_query = urllib.parse.urlencode({'query': query})
endpoint = f"{url.rstrip('/')}/api/v1/query?{encoded_query}"
try:
with urllib.request.urlopen(endpoint, timeout=15) as response:
payload = json.loads(response.read().decode('utf-8'))
except urllib.error.URLError as exc: # pragma: no cover - network failure
raise RuntimeError(f'Failed to query Prometheus at {endpoint}: {exc}')

if payload.get('status') != 'success':
raise RuntimeError(f'Prometheus query {query} failed: {payload}')

results = payload.get('data', {}).get('result', [])
if not results:
return None
try:
return float(results[0]['value'][1])
except (KeyError, IndexError, ValueError) as exc: # pragma: no cover - unexpected payload
raise RuntimeError(f'Unexpected response format for query {query}: {payload}') from exc

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

Validate URL schemes to prevent SSRF or file:// access.

urllib.request.urlopen accepts arbitrary schemes including file:// and custom protocols. If PROMETHEUS_URL or TEMPO_URL could be set to untrusted values, this creates a security risk. Consider validating that the URL starts with http:// or https:// before opening.

Apply this diff to add URL scheme validation:

 def query_prometheus(url: str, query: str) -> Optional[float]:
+    if not url.startswith(('http://', 'https://')):
+        raise ValueError(f'Invalid URL scheme: {url}')
     encoded_query = urllib.parse.urlencode({'query': query})
     endpoint = f"{url.rstrip('/')}/api/v1/query?{encoded_query}"
     try:
         with urllib.request.urlopen(endpoint, timeout=15) as response:
             payload = json.loads(response.read().decode('utf-8'))
-    except urllib.error.URLError as exc:  # pragma: no cover - network failure
-        raise RuntimeError(f'Failed to query Prometheus at {endpoint}: {exc}')
+    except urllib.error.URLError as exc:  # pragma: no cover - network failure
+        raise RuntimeError(f'Failed to query Prometheus at {endpoint}: {exc}') from exc

Additionally, apply the same validation in maybe_check_tempo at line 180:

def maybe_check_tempo() -> Optional[MetricResult]:
    if not TEMPO_URL:
        print('TEMPO_URL not provided; skipping trace regression checks.')
        return None

    if not TEMPO_URL.startswith(('http://', 'https://')):
        raise ValueError(f'Invalid URL scheme: {TEMPO_URL}')
    
    # ... rest of function

Note: The diff also adds from exc to improve exception chaining (addresses Ruff B904).

🧰 Tools
🪛 Ruff (0.14.0)

62-62: Audit URL open for permitted schemes. Allowing use of file: or custom schemes is often unexpected.

(S310)


65-65: Within an except clause, raise exceptions with raise ... from err or raise ... from None to distinguish them from errors in exception handling

(B904)


65-65: Avoid specifying long messages outside the exception class

(TRY003)


68-68: Avoid specifying long messages outside the exception class

(TRY003)


76-76: Avoid specifying long messages outside the exception class

(TRY003)

🤖 Prompt for AI Agents
In tools/perf/check-canary-metrics.py around lines 58 to 77, validate the URL
scheme before calling urllib.request.urlopen to prevent SSRF/file:// access:
check that the provided url starts with "http://" or "https://", and if not
raise a ValueError with the offending URL; also propagate the original exception
when raising the RuntimeError for unexpected payload by using "from exc" for
proper exception chaining. Additionally, apply the same http/https scheme
validation in maybe_check_tempo at ~line 180 (raise ValueError if TEMPO_URL does
not start with "http://" or "https://").

@coderabbitai
Copy link
Contributor

coderabbitai bot commented Oct 18, 2025

Note

Docstrings generation - SUCCESS
Generated docstrings for this pull request at #285

coderabbitai bot added a commit that referenced this pull request Oct 18, 2025
…resholds`

Docstrings generation was requested by @shayancoin.

* #126 (comment)

The following files were modified:

* `tools/perf/check-canary-metrics.py`
@shayancoin shayancoin merged commit 7d8560a into main Oct 18, 2025
3 of 8 checks passed
shayancoin pushed a commit that referenced this pull request Oct 18, 2025
…resholds` (#285)

Docstrings generation was requested by @shayancoin.

* #126 (comment)

The following files were modified:

* `tools/perf/check-canary-metrics.py`

Co-authored-by: coderabbitai[bot] <136622811+coderabbitai[bot]@users.noreply.github.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant